Speaker identification using three signal voice domains during human-robot interaction [Online]
ACM/IEEE International conference on Human-robot interaction (HRI 2014)
New York/USA
2014-03-03

This late-breaking report (LBR) describes a novel method for user recognition in HRI, based on analyzing the peculiarities of users' voices and designed specifically for use in a robotic system. The method is inspired by acoustic fingerprinting techniques and consists of two phases: a) enrollment in the system, in which the features of the user's voice are stored in files called voiceprints; and b) search, in which the features extracted in real time are compared with the voiceprints using a pattern matching method to obtain the most likely user (match).
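
The two phases can be sketched as follows. This is only an illustrative sketch: the abstract does not specify the pattern matching method or the voiceprint file format, so the nearest-neighbor search with Euclidean distance, the in-memory dictionary standing in for voiceprint files, and the names `enroll` and `identify` are all assumptions, and the feature values are made up.

```python
import math

def enroll(voiceprints, user, feature_vectors):
    """Enrollment: store the mean feature vector of a user's samples as a voiceprint.

    Assumption: a voiceprint is a single averaged vector kept in a dict,
    standing in for the voiceprint files mentioned in the abstract.
    """
    n = len(feature_vectors)
    dims = len(feature_vectors[0])
    voiceprints[user] = [sum(v[d] for v in feature_vectors) / n for d in range(dims)]

def identify(voiceprints, features):
    """Search: return (user, distance) for the voiceprint closest to `features`.

    Assumption: Euclidean nearest-neighbor matching, chosen for illustration.
    """
    best_user, best_dist = None, math.inf
    for user, voiceprint in voiceprints.items():
        dist = math.dist(voiceprint, features)
        if dist < best_dist:
            best_user, best_dist = user, dist
    return best_user, best_dist

# Hypothetical three-dimensional feature vectors for two enrolled users.
voiceprints = {}
enroll(voiceprints, "alice", [[0.10, 220.0, 0.30], [0.12, 230.0, 0.28]])
enroll(voiceprints, "bob", [[0.05, 120.0, 0.60], [0.07, 110.0, 0.62]])

user, distance = identify(voiceprints, [0.11, 226.0, 0.29])
print(user)
```

In practice the features would need to be normalized before computing distances, since otherwise the dimension with the largest numeric range dominates the match.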

The audio samples are described by features in three different signal domains: time, frequency, and time-frequency. Combining these three domains has enabled significant increases in the accuracy of user identification compared to existing techniques. Several tests using an independent user voice database show that only half a second of user voice is enough to identify the speaker. The recognition is text-independent: users do not need to say a specific sentence (a pass-phrase) to be identified by the robot.
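
To make the three domains concrete, the sketch below computes one small feature set per domain for a single audio frame. The abstract does not name the specific features used, so the choices here (RMS energy and zero-crossing rate for time, spectral centroid for frequency, Haar wavelet detail energies for time-frequency) and all function names are assumptions made for illustration only.

```python
import cmath
import math

def time_features(frame):
    """Time domain: RMS energy and zero-crossing rate (assumed features)."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / (len(frame) - 1)]

def frequency_features(frame, sample_rate):
    """Frequency domain: spectral centroid in Hz from a naive DFT (assumed feature)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(s))
    total = sum(mags)
    if total == 0:
        return [0.0]
    return [sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total]

def time_frequency_features(frame, levels=3):
    """Time-frequency domain: detail energy per level of a Haar wavelet transform."""
    approx, energies = list(frame), []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    return energies

def extract_features(frame, sample_rate):
    """Concatenate the features of the three domains into one vector."""
    return (time_features(frame)
            + frequency_features(frame, sample_rate)
            + time_frequency_features(frame))

# Synthetic test frame: a 1000 Hz tone sampled at 8000 Hz (64 samples).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]
features = extract_features(frame, sample_rate)
print(features)
```

The naive DFT keeps the sketch dependency-free; a real-time system would use an FFT and longer frames.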

CONGRESS BOOK
Proceedings: HRI '14: Proceedings of the 2014 ACM/IEEE International Conference on Human-Robot Interaction
ISBN: 978-1-4503-2658-2
Publisher: ACM, New York, NY, USA
First page: 114
Last page: 115
Year: 2014