
Qualitative Interaction

In addition to the quantitative prediction tests, real-time online testing of the system's interaction abilities was performed. A human player performed the gestures of user A and observed the system's response. Whenever the user performed one of the gestures in Table 9.1, the system responded with a (qualitatively) appropriate animation of the synthetic character (the gesture of the missing user B). By symmetry, the roles could be reversed so that the system impersonates user A and a human acts as user B. However, user A's role is more active while user B's role is more reactive or passive. It therefore seems more interesting for the human player to be in charge and to trigger the system into performing reactions.
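
The overall structure of this qualitative test is a simple perception-prediction-rendering loop. The Python sketch below illustrates one plausible organization of that loop; the tracker, model, and renderer interfaces (get_user_features, predict_reaction, draw_character) and the frame rate are assumptions made for illustration, not the actual ARL code.

import time

FRAME_PERIOD = 1.0 / 15.0   # assumed frame rate; the real system's rate is not specified here

def qualitative_test_loop(tracker, model, renderer):
    # Hypothetical interfaces: the tracker measures user A, the model predicts
    # the missing user B, and the renderer animates the synthetic character.
    while True:
        features_a = tracker.get_user_features()          # measure user A's gesture
        features_b = model.predict_reaction(features_a)   # infer user B's reaction
        renderer.draw_character(features_b)               # animate the character
        time.sleep(FRAME_PERIOD)                          # hold an approximate real-time rate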


  
Figure 9.2: Scare Interaction [image: scareMovie.ps]

In Figure 9.2, a sample interaction in which the user 'scares' the system is depicted. Approximately 500ms elapse between consecutive frames, which are arranged lexicographically in temporal order. The user begins in a relaxed rest state, as does the synthetic character. The user then performs a menacing gesture, raising both arms in the air and then lowering them. The synthetic character responds by first being taken aback and then crouching down in momentary fear. This is the behaviour indicated by the examples from the human-to-human training. Moreover, the responses from the system contain some default pseudo-random variations, giving them a more compelling quality. The character can sometimes get fidgety and will shake a bit or flail before entering the crouching position.
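
The exact mechanism behind these variations is not detailed here; the sketch below merely illustrates one way such behaviour could be injected, by perturbing the predicted reaction with small zero-mean noise before rendering. The function name and noise scale are assumptions for illustration only.

import random

NOISE_SCALE = 0.05   # assumed magnitude for normalized pose features

def perturb_reaction(features_b):
    # Add small zero-mean noise so repeated reactions are not identical.
    return [x + random.gauss(0.0, NOISE_SCALE) for x in features_b]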


  
Figure 9.3: Waving Interaction [image: waveMovie.ps]

Figure 9.3 contains a gesture where the system has to respond to a greeting wave by the user. The user begins in a relaxed rest state, which induces that same rest state in the synthetic character. Then, almost immediately after the user begins waving, the system waves back. Both perform the periodic gesture for a few frames. However, when the user stops waving, the system decays its wave and stabilizes back into the rest state. This is in fact the desired behaviour, since user B was typically in a rest state whenever user A performed no particular motion. Also note that the synthetic character begins waving as soon as the user starts extending his arm sideways, indicating that the gesture is easily discriminated from this initial motion alone. Unlike the previous gesture in Figure 9.2, the waving involves more of a gesture-to-gesture mapping. The scaring motion is merely a single beat and does not involve any periodicity or oscillations, which are difficult to represent and to synthesize stably through feedback interaction.
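
The difficulty alluded to here comes from the feedback structure of the synthesis: the character's own recent output forms part of the input used to predict its next pose, so an oscillation must be sustained through this loop. The sketch below shows that structure with hypothetical interfaces (predict_reaction, the window length); it is not the ARL implementation itself.

import collections

WINDOW = 30   # assumed number of past frames kept for each participant

history_a = collections.deque(maxlen=WINDOW)   # measured user A features
history_b = collections.deque(maxlen=WINDOW)   # the character's own recent output

def feedback_step(model, features_a):
    # The prediction conditions on both histories; the new prediction is
    # appended to history_b and therefore feeds back into the next step.
    history_a.append(features_a)
    features_b = model.predict_reaction(list(history_a), list(history_b))
    history_b.append(features_b)
    return features_b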


  
Figure 9.4: Clapping Interaction [image: clapMovie.ps]

A more involved form of interaction is depicted in Figure 9.4. Here, the user stimulates the character by circling his stomach while patting his head. The system's reaction is to clap enthusiastically when the user accomplishes this slightly tricky and playful gesture. Once again, the system stops gesturing when the user is still (as is the case at the beginning and at the end of this sequence). The oscillatory gesture the user performs is rather different from the system's response. Thus, there is a higher-level mapping from one oscillatory gesture to a different oscillatory gesture.

In the above examples, the user is effectively delegating tasks to the animated character, since the interaction is not a simple one-to-one mapping from current measurements to output. The user produces a complex action and the system responds with a complex reaction. The response depends on the user's input as well as on the system's previous internal state. The mapping thus associates measurements over time with measurements over time, which is fundamentally a higher-dimensional problem.
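
To make the dimensionality point concrete, the sketch below builds the kind of time-window input such a mapping operates on by concatenating recent frames of both participants into one long vector; the per-frame feature count and window length are made-up values for illustration, not the ARL system's actual dimensions.

FEATURES_PER_FRAME = 30   # assumed pose parameters per participant per frame
WINDOW = 30               # assumed number of past frames in the window

def windowed_input(history_a, history_b):
    # Flatten the last WINDOW frames of both participants into one long vector.
    vec = []
    for frame_a, frame_b in zip(history_a[-WINDOW:], history_b[-WINDOW:]):
        vec.extend(frame_a)
        vec.extend(frame_b)
    return vec   # up to 2 * FEATURES_PER_FRAME * WINDOW values, versus 2 * FEATURES_PER_FRAME per frame

A per-frame mapping would only see 2 x FEATURES_PER_FRAME numbers at a time; the windowed form is larger by a factor of WINDOW, which is the sense in which the problem is higher dimensional.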

Also, note that the system acquired these behaviours from a minimal amount of data. Only 5 minutes of training and a handful of exemplars of each gesture were given to the ARL system. This is impressive considering that the gestures and interactions were not segmented or ordered; useful associations had to be teased out of a continuous stream of interactions automatically. Larger training sets were not explored at this stage due to time constraints. However, given these initial results, a larger data set is an important next step. The learning time would also increase; however, the ARL system might yield more interesting results and more complex learned behaviour. In addition, the CEM algorithm could be sped up considerably, which would encourage the use of more data.

