Multimodal human-robot interaction


Description

To address the Human-Robot Interaction problem, we believe the following issues have to be resolved:

Meaning Generation


The robot has to be able to understand its context, i.e. to detect and identify objects and humans. The ability to give meaning to the objects in the environment will significantly improve the robot's interaction with them.

The essence of this problem is the formulation process: how to represent the meaning of something. It is a knowledge representation problem, but one that has been treated throughout human history by philosophers, psychologists and other scientists. Many different and interesting approaches have emerged in recent years, but applying these ideas to robotics is not a trivial issue.

For interaction with the human, we are developing a model of the user that aims to include their mental models, emotions, beliefs, desires and intentions.
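A minimal sketch, in Python, of how such a user model might be structured (the class and field names are illustrative assumptions, not our actual implementation):

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class UserModel:
    """Illustrative user model; field names are assumptions, not the actual implementation."""
    mental_models: Dict[str, str] = field(default_factory=dict)   # user's assumed models of robot and task
    emotions: Dict[str, float] = field(default_factory=dict)      # e.g. {"joy": 0.7, "fear": 0.1}
    beliefs: List[str] = field(default_factory=list)
    desires: List[str] = field(default_factory=list)
    intentions: List[str] = field(default_factory=list)

# Example instance describing what the robot currently attributes to the user.
user = UserModel(emotions={"joy": 0.6}, intentions=["ask_robot_for_weather"])
```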

Human-Human Interaction


In this field we will use the subdivision established by Morris in "Foundations of the Theory of Signs", which divides human communication into three areas:


  1. Syntax. It covers the theory of information: coding, channels, capacity, noise, redundancy and other statistical properties of language.

  2. Semantics. Meaning is the central concern of semantics. In the communication process, sender and receiver have to agree on the meaning of a message.

  3. Pragmatics. Pragmatics studies the effects of communication on the behaviour of both the sender and the receiver.


This brief schema establishes the frame of our human communication research. The model that we are developing will be implemented by means of the Automatic-Deliberative (AD) Architecture, a hybrid control architecture created by Ramón Barber. The Syntax level above coincides with the low level of the architecture, the Automatic level, while the Semantic level coincides with the high level of the architecture, the Deliberative level.
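The following sketch only illustrates this two-level split (the class names and the toy mapping are assumptions; the actual AD architecture is considerably richer):

```python
class AutomaticLevel:
    """Low (syntactic) level: handles the raw signal, coding and channel issues."""
    def process_signal(self, raw: str) -> str:
        # Toy stand-in for decoding and normalizing the incoming message.
        return raw.strip().lower()

class DeliberativeLevel:
    """High (semantic) level: assigns meaning to the symbol and decides a response."""
    def interpret(self, symbol: str) -> str:
        return {"hello": "greet_user"}.get(symbol, "ask_clarification")

auto, delib = AutomaticLevel(), DeliberativeLevel()
action = delib.interpret(auto.process_signal("  Hello "))  # -> "greet_user"
```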

Human-Robot Interaction


The main goal of the above work is to build a model that can be implemented in a computer system. We want to give the user the sensation of interacting with the personal robot efficiently. We see two different ways to approach the problem: the inner approach and the outer approach. In the inner approach, we are interested in implementing a Human-Human Interaction model in the robot and then adjusting that model at the pragmatic level so that the interaction dynamic works correctly. In the outer approach, we are more interested in developing a model that satisfies the interaction dynamic directly; this model does not have to be a human-based model.

By interaction dynamic we mean the process over time in which the robot is doing things to the user and detecting things that the user does. This dynamic process has specific time parameters (for example silence time, duration of a question, waiting times), specific movements (blinking, nodding in agreement, etc.) and specific detections of user gestures and movements ("user is speaking", "user is very close", etc.).
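A minimal sketch of how such timing parameters could drive a perception loop (the parameter values and the detector callback are hypothetical):

```python
import time

# Hypothetical timing parameters of the interaction dynamic (values are illustrative only).
SILENCE_TIMEOUT = 4.0      # seconds of user silence before the robot re-engages
QUESTION_TIMEOUT = 10.0    # maximum time to wait for an answer to a question

def wait_for_user_event(detect_event, timeout):
    """Poll a detector (e.g. 'user is speaking', 'user is very close') until it fires or the wait times out."""
    start = time.time()
    while time.time() - start < timeout:
        event = detect_event()
        if event is not None:
            return event
        time.sleep(0.1)
    return "timeout"  # e.g. trigger a blink or a re-engagement gesture

# Usage (detect_speech would be a real perception callback):
# answer = wait_for_user_event(detect_speech, QUESTION_TIMEOUT)
```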

Human-robot interaction is defined as the study of humans, robots, and the ways they influence each other. This interaction can be social if the robots are able to interact with humans as partners, if not peers. In this case, there is a need to provide humans and robots with models of each other. Sheridan argues that the ideal would be analogous to two people who know each other well and who can pick up subtle cues from one another (e.g., musicians playing a duet).

A social robot has attitudes or behaviours that take the interests, intentions or needs of humans into account. Bartneck and Forlizzi define a social robot as "an autonomous or semiautonomous robot that interacts and communicates with humans by following the behavioral norms expected by the people with whom the robot is intended to interact". The term sociable robot has been coined by Breazeal in order to distinguish an anthropomorphic style of human-robot interaction from insect-inspired interaction behaviours. In this context, sociable robots can be considered a distinct subclass of social robots. She defines sociable robots as socially participative creatures with their own internal goals and motivations.

Multimodality


Multimodality allows humans to move seamlessly between different modes of interaction, from visual to voice to touch, according to changes in context or user preference. A social robot must provide multimodal interfaces, which try to integrate speech, written text, body language, gestures, eye or lip movements and other forms of communication in order to better understand the human and to communicate more effectively and naturally.

We can classify the different modalities in HRI into two types: perception and expression modes. The different modes work separately, that is, they do not communicate with each other directly. Global synchronization between them is achieved by an upper entity called the Communication Act Skill (a minimal sketch of this coordination follows the list below). Our multimodality model for robot interaction is based on these modes:


  • Visual: gesture expression and recognition.

  • Tactile: tactile sensor and tactile screen perception.

  • Voice: text-to-speech and automatic-speech-recognition.

  • Audiovisual: sound and visual expression.

  • Remote: web-2.0 interaction.
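A minimal sketch of how the Communication Act Skill could synchronize otherwise independent modes (class and method names are illustrative assumptions, not our actual interfaces):

```python
class Mode:
    """Base interface for a perception or expression mode (visual, tactile, voice, ...)."""
    def express(self, content): ...
    def perceive(self): ...

class CommunicationActSkill:
    """Upper entity that synchronizes the otherwise independent modes."""
    def __init__(self, modes):
        self.modes = modes          # e.g. {"voice": VoiceMode(), "visual": VisualMode()}

    def perform_act(self, act):
        # An act bundles synchronized outputs for several modes,
        # e.g. {"voice": "Hello!", "visual": "wave_gesture"}.
        for name, content in act.items():
            self.modes[name].express(content)

    def gather_percepts(self):
        # Collect the latest percept from every mode (expression-only modes return None).
        return {name: mode.perceive() for name, mode in self.modes.items()}
```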

Visual Interactive Mode: Gesture Expression Model


The Visual Mode includes all visible expressive acts. Traditionally, it is divided into kinesics (body gestures) and proxemics (body placement in the communication system). We treat the audiovisual mode, explained later, as a separate interactive mode. The importance of body movements in the communication act is well established, because they carry a lot of information that flows very quickly. Birdwhistell argues that 65% of the information in a human-human interaction is non-verbal. Visual gestures show human thoughts and mood state, and they replace, complement, accentuate and adjust verbal information. Several problems arise when we want to build a model of human gestures that could be implemented in a robot. We distinguish two directions: gesture expression modelling and gesture recognition. At the moment only the former is being addressed.

A discrete set of different atomic gestures has been implemented. An atomic gesture lasts approximately five seconds or less. Each atomic gesture can be interrupted in real time by another atomic gesture to compose the final dynamic expression.
Attending to the whole life of a gesture, gestures are divided into acquired and non-acquired (innate) gestures. When the robot becomes active it starts with a set of non-acquired gestures that may or may not be kept throughout its life, but the robot can also learn more gestures from the user.
Attending to the gesture dynamics, we distinguish gestures that do or do not have a final ending, and gestures that must or must not start from a required initial position.
Each atomic gesture has intensity and velocity parameters that modulate it (a minimal sketch of this structure follows the list below).
Attending to the way that each gesture can be interpreted, we consider:


  • Emblems: they replace words and sentences.
  • Illustrators: they reinforce verbal messages.
  • Affective gestures: they show emotions and express affect.
  • Adjusting or control gestures: they regulate the flow and manner of communication. They are among the most culturally determined gestures.
  • Adaptors: they release emotional and physical tension. They operate at a low level of awareness.
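A minimal sketch of an atomic-gesture structure with the properties described above (field names and the interruption logic are illustrative assumptions):

```python
from dataclasses import dataclass

@dataclass
class AtomicGesture:
    """Illustrative atomic gesture; field names are assumptions."""
    name: str                     # e.g. "nod", "blink"
    duration: float               # seconds, roughly <= 5.0
    acquired: bool                # learned from the user vs. innate
    has_final_ending: bool        # whether it must finish in a given posture
    needs_initial_position: bool  # whether it must start from a required posture
    intensity: float = 1.0        # modulation parameters
    velocity: float = 1.0

class GesturePlayer:
    """Plays atomic gestures; a new gesture may interrupt the current one in real time."""
    def __init__(self):
        self.current = None

    def play(self, gesture: AtomicGesture):
        if self.current is not None:
            print(f"interrupting {self.current.name}")
        self.current = gesture
        print(f"playing {gesture.name} (intensity={gesture.intensity}, velocity={gesture.velocity})")
```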

Tactile Mode


Two different kinds of tactile modes can be distinguished: tactile skin sensing and tactile screen sensing. The former is analogous to human skin sensing; the latter is exclusive to robotics. Depending on the hardware, the robot can detect that something is touching it, where, and obtain information about the force. The tactile screen gives the robot the possibility to perceive ink-gesture data introduced by the user. As the tactile screen is also showing an image, the ink-gesture data has to be interpreted in contrast with that image. The ability to show an image by means of a tactile screen is explained later in the audiovisual interactive mode.
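A minimal sketch of the two kinds of tactile percepts (the event fields and the interpretation function are illustrative assumptions):

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class SkinTouchEvent:
    """Tactile-skin percept: where the robot was touched and with how much force (if available)."""
    location: str                  # e.g. "left_shoulder"
    force: Optional[float] = None  # Newtons, if the hardware provides it

@dataclass
class InkGestureEvent:
    """Tactile-screen percept: a stroke drawn by the user over the currently displayed image."""
    points: List[Tuple[int, int]]  # pixel coordinates of the stroke
    displayed_image: str           # id of the image shown when the stroke was made

def interpret_ink_gesture(event: InkGestureEvent):
    # The stroke only makes sense in contrast with the image under it,
    # e.g. circling an object shown on the screen selects that object.
    return {"image": event.displayed_image, "stroke_length": len(event.points)}
```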

Voice Mode


This mode is in charge of verbal human-robot communication.

Verbal Perception.


The verbal signal can be interpreted for speech recognition, but it also provides prosodic information and user localization information.
Our automatic speech recognition model is based on a dynamic ASR-grammar system running on an ASR engine. The set of active grammars can be changed in real time. The set of ASR grammars is built a priori, according to what information is useful for the robot. Each grammar is related to a Speech Act, so the speech recognition works as a speech act trigger.
No ontological information is considered at the moment.
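A minimal sketch of a dynamic ASR-grammar set acting as a speech act trigger (grammar names, speech acts and the recognition callback are assumptions, not the real ASR engine API):

```python
class SpeechActTrigger:
    def __init__(self):
        self.active_grammars = {}          # grammar name -> associated speech act

    def set_active_grammars(self, grammars):
        """The set of active grammars can be changed in real time."""
        self.active_grammars = dict(grammars)

    def on_recognition(self, grammar_name, utterance):
        """Called by the ASR engine when an utterance matches an active grammar."""
        act = self.active_grammars.get(grammar_name)
        if act is not None:
            act(utterance)                 # recognition works as a speech-act trigger

trigger = SpeechActTrigger()
trigger.set_active_grammars({"weather_grammar": lambda u: print("speech act: tell_weather:", u)})
trigger.on_recognition("weather_grammar", "what is the weather like")
```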

Verbal Expression.


The speech system is based on two types of sentences: fixed sentences and variable sentences. The former are designed a priori and are related to constant episodes that always occur in a common conversation.
Variable sentences are built using a fixed grammar. When the speech skill decides to use a variable sentence, it first chooses a grammar with slots; then the grammar slots are filled with words appropriate for the context.
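A minimal sketch of the fixed/variable sentence scheme (the templates and slot values are invented for illustration):

```python
import random

# Illustrative fixed sentences for constant conversational episodes.
FIXED_SENTENCES = {
    "greeting": ["Hello, nice to see you.", "Hi there!"],
    "farewell": ["Goodbye, see you soon."],
}

# Illustrative fixed grammars with slots to be filled from the context.
VARIABLE_GRAMMARS = {
    "offer_info": "I can tell you the {topic} for {place}.",
}

def say_fixed(episode):
    return random.choice(FIXED_SENTENCES[episode])

def say_variable(grammar, **slots):
    # The grammar slots are completed with words appropriate for the current context.
    return VARIABLE_GRAMMARS[grammar].format(**slots)

print(say_fixed("greeting"))
print(say_variable("offer_info", topic="weather", place="Madrid"))
```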

Audiovisual Mode


A personal robot incorporates and works by means of one or more computers, so the range of possible communication channels can be extended from the emulation of human communication to other possibilities that a computer offers, for example electronic sound synthesis. The sound mode can be used for:

  • Mood, affect or emotion expression associated with long-term states: happy/sad or angry/calm.

  • Interjection expression associated with short-term states: fright, scare, laughter, crying, etc.
  • Notice sounds, to get the user attention and notice some interaction prompts, etc.
  • Singing skill using synthesized instruments.
  • Sound imitation: siren sounds, dog barks and other nature sounds, …

We are studying sound in music from its communicational side, extracting a set of parameters for sound synthesis and relating these parameters to the kind of message or intention that the robot wants to communicate. We are implementing a Sound Synthesis System that takes internal robot state parameters as inputs and synthesizes sounds for expression. These internal parameters include, but are not limited to, emotional state, emotional magnitude and mood energy.
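A minimal sketch of how internal state parameters might be mapped to synthesis parameters (the parameter names and the mapping are assumptions, not the actual Sound Synthesis System):

```python
def synthesis_parameters(emotional_state: str, magnitude: float, mood_energy: float):
    """Map an internal robot state to illustrative sound-synthesis parameters."""
    base = {
        "happy": {"pitch_hz": 440.0, "tempo_bpm": 120},
        "sad":   {"pitch_hz": 220.0, "tempo_bpm": 60},
        "angry": {"pitch_hz": 330.0, "tempo_bpm": 140},
        "calm":  {"pitch_hz": 262.0, "tempo_bpm": 80},
    }[emotional_state]
    return {
        "pitch_hz": base["pitch_hz"] * (1.0 + 0.2 * magnitude),  # stronger emotion -> higher pitch
        "tempo_bpm": base["tempo_bpm"] * (0.5 + mood_energy),    # more energy -> faster tempo
        "volume": min(1.0, 0.3 + 0.7 * magnitude),
    }

print(synthesis_parameters("happy", magnitude=0.8, mood_energy=0.6))
```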

The audiovisual mode refers to the expression of synchronized images, video and sound, music, or voice. Moreover, audiovisual expressions (sound, video and computer-generated graphics) can also be triggered and provided to the user as feedback or in response to the robot's initiatives.

Remote Mode


This is the most robot-specific interactive mode. As the core part of a robot is its computer, the robot is also able to use all the capabilities that the computer offers, and one of the most important things a computer can do is connect to the Internet and access remote information. On the other side, the Internet is growing so much that net protocols are moving towards more computer-centred protocols. In this sense web 2.0, or the so-called semantic web, offers inter-computer communication as never before. In this way, the Internet works as a big sensor for the robot, which can access weather reports, news, e-mail, bus timetables, etc. The robot can also receive orders from a remote user and interact with a remote user through a chat skill, video-conference, etc.
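A minimal sketch of the Internet-as-sensor idea (the URL is a placeholder and the report format is assumed):

```python
import json
import urllib.request

def fetch_remote_report(url="https://example.org/weather.json"):
    """Use the network as a 'big sensor': fetch a remote report the robot can then verbalize."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))

# Usage (commented out because the URL above is only a placeholder):
# report = fetch_remote_report()
# print("The forecast for today is", report.get("forecast"))
```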

Important HRI research groups


HRI events


Entries:
Signage system for the navigation of autonomous robots in indoor environments
IEEE Transactions on Industrial Informatics. num. 1 , vol. 10 , pages: 680 – 688 , 2014
A. Corrales M. Malfaz M.A. Salichs
Fast 3D Cluster-tracking for a Mobile Robot using 2D Techniques on Depth Images
Cybernetics and Systems: An International Journal. num. 4 , vol. 44 , pages: 325 – 350 , 2013
A. Ramey M. Malfaz M.A. Salichs
Multimodal Fusion as Communicative Acts during Human-Robot Interaction
Cybernetics and Systems: An International Journal. num. 8 , vol. 44 , pages: 681 – 703 , 2013
F. Alonso Javi F. Gorostiza M. Malfaz M.A. Salichs
Integration of a voice recognition system in a social robot
Cybernetics and Systems: An International Journal (Online). num. 4 , vol. 42 , pages: 215 – 245 , 2011
F. Alonso M.A. Salichs
Maggie: A Social Robot as a Gaming Platform
International Journal of Social Robotics. num. 4 , vol. 3 , pages: 371 – 381 , 2011
A. Ramey V. Gonzalez Pacheco F. Alonso A. Castro-Gonzalez M.A. Salichs
End-User Programming of a Social Robot by Dialog
Robotics and Autonomous Systems. (Online). num. 12 , vol. 59 , pages: 1102 – 1114 , 2011
Javi F. Gorostiza M.A. Salichs
Usability assessment of ASIBOT: a portable robot to aid patients with spinal cord injury
Disability & Rehabilitation: Assistive Technology. , pages: 1 – 11 , 2010
A. Jardon C.A. Monje A. Gil A. Peña
Human-Robot Interfaces for Social Interaction
International Journal of Robotics and Automation. num. 3 , vol. 22 , pages: 215 – 221 , 2007
A.M. Khamis M.A. Salichs

Entries:
Multidomain Voice Activity Detection during Human-Robot Interaction.
International Conference on Social Robotics (ICSR 2013). , 2013, Bristol, UK
F. Alonso A. Castro-Gonzalez Javi F. Gorostiza M.A. Salichs
Diseño Preliminar de Interfaces de Realidad Aumentada para el Robot Asistencial ASIBOT
V Congreso Internacional de Diseño, Redes de Investigación y Tecnología para todos (DRT4ALL), 2013, Madrid, Spain
F. Rodriguez Juan G. Victores A. Jardon
Facial gesture recognition and postural interaction using neural evolution algorithm and active appearance models
Robocity2030 9th Workshop. Robots colaborativos e interacción humano-robot, 2011, Madrid, Spain
J.G. Bueno M. González-Fierro L. Moreno
Methodologies for Experimental Evaluation of Assistive Robotics HRI
ROBOCITY2030 9TH WORKSHOP: ROBOTS COLABORATIVOS E INTERACCION HUMANO-ROBOT, 2011, Madrid, Spain
M.F. Stoelen A. Jardon V. Tejada Juan G. Victores S. Martinez F. Bonsignorio
An information-theoretic approach to modeling and quantifying assistive robotics HRI
Late Breaking Report, Proceedings of the 6th international conference on Human-robot interaction (HRI), Lausanne, Switzerland
M.F. Stoelen F. Bonsignorio A. Jardon
Information Metrics for Assistive Human-In-The-Loop Cognitive Systems
Workshop on Good Experimental Methodology in Robotics and Replicable Robotics Research, Robotics Science and Systems (RSS), 2010, Zaragoza, Spain
M.F. Stoelen A. Jardon Juan G. Victores F. Bonsignorio
Towards an Enabling Multimodal Interface for an Assistive Robot
Workshop on Multimodal Human-Robot Interfaces, IEEE International Conference on Robotics and Automation (ICRA), 2010, Anchorage, AK, USA
M.F. Stoelen A. Jardon F. Bonsignorio Juan G. Victores C.A. Monje
Teaching Sequences to a Social Robot by Voice Interaction
RO-MAN 09 : 18th IEEE International Symposium on Robot and Human Interactive Communication , 2009, Toyama, Japan
Javi F. Gorostiza M.A. Salichs
Dispositivo inalámbrico para facilitar el acceso al ordenador.
Congreso Internacional sobre Domótica, Robótica y Teleasistencia para Todos DRT4LL 2009, 2009, Barcelona, Spain
S. Martinez A. Jardon
Assistive robots dependability in domestic environment: the ASIBOT kitchen test bed
IARP-IEEE/RAS-EURON Joint Workshop on Shared Control for Robotic Ultra-operations, Oct 28-30, 2007, San Diego, CA, USA
A. Gimenez S. Martinez A. Jardon
Multimodal Human-Robot Interaction Framework for a Personal Robot
RO-MAN 06: The 15th IEEE International Symposium on Robot and Human Interactive Communication, 2006, Hatfield, United Kingdom
E. Delgado A. Corrales R. Rivas R. Pacheco A.M. Khamis Javi F. Gorostiza M. Malfaz R. Barber M.A. Salichs
Maggie: A Robotic Platform for Human-Robot Social Interaction
IEEE International Conference on Robotics, Automation and Mechatronics (RAM 2006), 2006, Bangkok, Thailand
E. Delgado A. Corrales R. Rivas R. Pacheco A.M. Khamis Javi F. Gorostiza M. Malfaz R. Barber M.A. Salichs
Active human-mobile manipulator cooperation through intention recognition
IEEE International Conference on Robotics and Automation (ICRA'01), 2001, Seoul, Korea
D. Blanco M.A. Salichs

Entries:
ROBOT2013: First Iberian Robotics Conference, Advances in Robotics, Vol.1, Part III
chapter: Assistive Robot Multi-modal Interaction with Augmented 3D Vision and Dialogue pages: 209 – 217. Springer International Publishing Madrid (Spain) , ISBN: 9783319034126, 2014
Juan G. Victores F. Rodriguez S. Morante A. Jardon
Design and Control of Intelligent Robotic Systems
chapter: Path planning inspired on emotional intelligence pages: 119 – 132. Springer-Verlag Berlin Heidelberg , ISBN: 978-3-540-89932, 2009
V. Egido M. Malfaz R. Barber M.A. Salichs
Progress in Robotics. Communications in Computer and Information Science 44
chapter: Infrared Remote Control with a Social Robot pages: 86 – 95. Springer , ISBN: 978-3-642-03985, 2009
A. Castro-Gonzalez M.A. Salichs
Arquitecturas de Control para Robots
chapter: Arquitectura software de un robot personal pages: 101 – 115. Universidad Politécnica de Madrid , ISBN: 978-84-7484-196, 2007
A. Corrales R. Rivas R. Barber M.A. Salichs
