The standard “behavior from observation” method has two bottlenecks: the difference of embodiment and the manual generation of behaviors from an interaction corpus.
To resolve the difference of embodiment, a WOZ (Wizard of Oz) method is introduced to observe interaction between human participants and a robot controlled by a hidden human operator (the WOZ operator). For the WOZ approach to succeed, we need to overcome the difficulty of manipulating a robot with many degrees of freedom.
An immersive WOZ environment (ICIE) allows the human operator to control a robot as if s/he were inside it. The audio-visual environment surrounding the WOZ-operated robot is captured, e.g., by an omnidirectional camera attached to the robot’s head, and is sent to the WOZ operator’s cockpit, where it is projected on the surrounding immersive screen and reproduced by the speakers. The current version of ICIE employs eight 64-inch display panels arranged in a circle about 2.5 meters in diameter. Eight surround speakers are used to reproduce the acoustic environment. Altogether, the immersive environment allows the WOZ operator at the center of the cockpit to grasp in detail the situation around the robot and to decide what to do as if s/he were the robot.
The WOZ operator’s behavior, in turn, is captured in real time by a collection of range sensors. Noise filters and a human body model are used for robust recognition of pose, head direction and gesture. The captured motion is mapped onto the robot for motion generation. The sound on each side of the WOZ operator is gathered by microphones and transmitted over the network so that the other participants at the conversation site can hear the WOZ operator’s voice (modulated when necessary).
“Learning by mimicking” is introduced to generate behaviors automatically from an interaction corpus. In this framework, the behavioral model of the robot is generated from the collected data in four stages. First, in the discovery stage, the basic actions and commands are discovered. A number of novel algorithms have been developed for this stage. RSST (Robust Singular Spectrum Transform) calculates the likelihood of a change of dynamics in a continuous time series without prior knowledge. DGCMD (Distance-Graph Constrained Motif Discovery) uses the result of RSST to discover motifs (recurring temporal patterns) in the given time series. Second, in the association stage, a probabilistic model is generated to specify the likelihood that observed actions occur as a result of observed commands. Granger causality is used to discover the natural delay between commands and actions. Third, in the controller generation stage, the behavioral model is converted into an actual controller that allows the robotic agent to act in similar situations. Finally, in the accumulation stage, the gestures and actions learned from multiple interactions are combined into a single model. The above algorithms are presented as extensions to conventional methods.
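To make the notion of a motif (a recurring temporal pattern) concrete, the following is a minimal sketch of motif discovery by exhaustive search. It is not DGCMD and does not use RSST: it is a brute-force baseline, written under the assumption that a motif can be approximated as the closest pair of non-overlapping, z-normalized windows of a fixed length. The names `z_normalize` and `find_motif` and the sample series are hypothetical and chosen only for illustration.

```python
import math


def z_normalize(window):
    """Rescale a window to zero mean and unit variance (flat windows map to zeros)."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std < 1e-12:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def find_motif(series, length):
    """Return (distance, i, j) for the closest pair of non-overlapping windows.

    Brute force over all window pairs, O(n^2 * length). Illustrative baseline
    only; it is an assumption of this sketch, not the DGCMD algorithm.
    """
    n_windows = len(series) - length + 1
    if n_windows < length + 1:
        raise ValueError("series too short for two non-overlapping windows")
    windows = [z_normalize(series[i:i + length]) for i in range(n_windows)]
    best = (math.inf, -1, -1)
    for i in range(n_windows):
        # Start j at i + length so the two windows never overlap,
        # which excludes trivial matches of a window with its own neighbors.
        for j in range(i + length, n_windows):
            dist = math.sqrt(
                sum((a - b) ** 2 for a, b in zip(windows[i], windows[j]))
            )
            if dist < best[0]:
                best = (dist, i, j)
    return best


# Hypothetical data: the same five-sample pattern planted at indices 4 and 15.
pattern = [0, 1, 3, 1, 0]
series = [5, 2, 7, 4] + pattern + [9, 3, 8, 6, 2, 7] + pattern + [4, 8, 1]
print(find_motif(series, 5))  # -> (0.0, 4, 15)
```

Because each window is z-normalized, two occurrences of a pattern still match when one is scaled or shifted in amplitude. The quadratic cost of the exhaustive search is one reason a constrained method would restrict the candidate windows, for example to the neighborhood of detected change points.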
Table of Contents
1. Immersive interaction environment [Ohmoto 2011]
2. Learning by Mimicking [Mohammad 2009 PhDThesis]
3. Motif Discovery [Chiu 2003][Pevzner 2000][Buhler 2001][Catalano 2006][Mohammad 2009]
4. Change Point Discovery [Ide 2005][Mohammad 2009]
5. Controller generation and accumulation [Mohammad 2010]
PPT is available from here (access limited) (uploaded at 23:25 November 1st)
References
- [Buhler 2001] J. Buhler and M. Tompa. Finding motifs using random projections. In 5th International Conference on Computational Biology, pages 69–76, 2001.
- [Catalano 2006] Joe Catalano, Tom Armstrong, and Tim Oates. Discovering patterns in real-valued time series. In Knowledge Discovery in Databases: PKDD 2006, pages 462–469, 2006.
- [Chiu 2003] B. Chiu, E. Keogh, and S. Lonardi, “Probabilistic discovery of time series motifs,” in KDD ’03: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining. New York, NY, USA: ACM, 2003, pp. 493–498.
- [Furao 2006] Shen Furao, Osamu Hasegawa. An incremental network for on-line unsupervised classification and topology learning, Neural Networks 19 (2006) 90–106
- [Ide 2005] T. Ide and K. Inoue, “Knowledge discovery from heterogeneous dynamic systems using change-point correlations,” in Proc. SIAM Intl. Conf. Data Mining, 2005.
- [Kendon 2004] Kendon, A.: Gesture, Cambridge University Press, 2004
- [Mohammad 2009] Yasser Mohammad, Toyoaki Nishida, Shogo Okada, Unsupervised Simultaneous Learning of Gestures, Actions and their Associations for Human-Robot Interaction, Intelligent Robots and Systems, 2009. IROS 2009. IEEE/RSJ International Conference on, pp. 2537-2544, 11-15 Oct. 2009
- [Mohammad 2009 PhDThesis] Yasser Mohammad, Autonomous Development of Natural Interactive Behavior for Robots and Embodied Agents, PhD Thesis, Kyoto University, September 2009
- [Mohammad 2010] Yasser Mohammad, Toyoaki Nishida, Learning Interaction Protocols using Augmented Bayesian Networks Applied to Guided Navigation, Taipei, Taiwan, IROS 2010.
- [Ohmoto 2011] Ohmoto, Y., Ohashi, H., Lala, D., Mori, S., Sakamoto, K., Kinoshita, K. & Nishida, T.: ICIE: immersive environment for social interaction based on socio-spacial information. The 2011 Conference on Technologies and Applications of Artificial Intelligence (TAAI 2011), 2011
- [Okada 2009] Shogo Okada and Toyoaki Nishida. Incremental clustering of gesture patterns based on a self organizing incremental neural network, in Proceedings of International Joint Conference on Neural Networks, Atlanta, Georgia, USA, June 14-19, pp. 2316-2322, 2009.
- [Pevzner 2000] Pevzner, P. A. & Sze, S. H. (2000). Combinatorial approaches to finding subtle signals in DNA sequences. In proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology. La Jolla, CA, Aug 19-23. pp 269-278.
- [Xu 2009] Yong Xu, Kazuhiro Ueda, Takanori Komatsu, Takeshi Okadome, Takashi Hattori, Yasuyuki Sumi and Toyoaki Nishida, WOZ Experiments for Understanding Mutual Adaptation, AI&Society, Vol. 23, No. 2, Page 201-212, 2009.