Sound Synthesis for Communicating Nonverbal Expressive Cues
IEEE Access

Non-verbal sounds (NVS) constitute an
appealing communicative channel for transmitting a
message during a dialog. They offer two main benefits:
they are not tied to any particular language, and they
can convey a message in a short time. NVS have
been successfully used in robotics, cell phones, and
science fiction films. However, in-depth studies on how
to model NVS are lacking. For instance, most of the
systems for NVS expression are ad hoc solutions that
focus on the communication of the most prominent
emotion. Only a small number of papers have proposed a
more general model or dealt directly with the expression
of pure communicative acts, such as affirmation, denial,
or greeting. In this paper, we propose a system, referred
to as the sonic expression system (SES), that is able to
generate NVS on the fly by adapting the sound to the
context of the interaction. The system is designed to be
used by social robots while conducting human–robot
interactions. It is based on a model that includes
several acoustic features from the amplitude, frequency,
and time spaces. To evaluate the capabilities of the
system, nine categories of communicative acts
were created. By means of an online questionnaire, 51
participants classified the utterances according to
their meaning: agreement, hesitation, denial,
hush, question, summon, encouragement, greetings, and
laughing. The results showed that very different NVS
generated by our SES can be used for communicating.
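
As a rough illustration of what it can mean to describe a sound by features from the amplitude, frequency, and time spaces, the sketch below synthesizes a single tone from a peak amplitude with a linear attack/release envelope (amplitude), a start and end pitch with a linear glide (frequency), and a duration (time). This is not the SES model: the function name `synthesize_nvs`, its parameters, and the default values are all hypothetical choices made for this example.

```python
import math


def synthesize_nvs(f_start, f_end, duration, peak_amp=0.8,
                   attack=0.02, release=0.05, sample_rate=16000):
    """Return a list of float samples for one gliding tone.

    Hypothetical parameterization (an assumption, not the SES model):
      - frequency: linear glide from f_start to f_end, in Hz
      - amplitude: peak_amp scaled by a linear attack/release envelope
      - time: duration, attack, and release, in seconds
    """
    n = round(duration * sample_rate)
    samples = []
    phase = 0.0
    for i in range(n):
        t = i / sample_rate
        # Frequency feature: interpolate the pitch across the utterance.
        freq = f_start + (f_end - f_start) * (i / max(n - 1, 1))
        # Amplitude feature: ramp up over `attack`, ramp down over `release`.
        env = min(1.0, t / attack, (duration - t) / release)
        samples.append(peak_amp * env * math.sin(phase))
        # Accumulate phase so the glide stays continuous.
        phase += 2.0 * math.pi * freq / sample_rate
    return samples


# A rising contour, loosely reminiscent of a question (illustrative only).
rising = synthesize_nvs(f_start=400.0, f_end=800.0, duration=0.25)
```

The resulting list of samples could be written to a WAV file with Python's standard `wave` module. Which contours listeners would actually associate with which communicative act is an empirical question that this sketch does not address.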