Speaker identification using three signal voice domains during human-robot interaction

Description

This late-breaking report (LBR) describes a novel method for user recognition in HRI. The method analyzes the distinctive characteristics of each user's voice and is designed specifically for use in a robotic system. It is inspired by acoustic fingerprinting techniques and consists of two phases: a) enrollment, in which the features of the user's voice are stored in files called voiceprints; and b) search, in which features extracted in real time are compared with the stored voiceprints using a pattern matching method to obtain the most likely user (the match).
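The two phases can be illustrated with a minimal sketch. It is not the authors' implementation: the names `enroll` and `identify` are hypothetical, a voiceprint is assumed to be the mean of a user's enrollment feature vectors, and nearest-neighbour Euclidean distance stands in for the pattern matching method, which the description does not specify.

```python
import numpy as np

def enroll(voiceprints, user, feature_vectors):
    """Enrollment: store the mean of the user's feature vectors as a voiceprint.

    Averaging is an assumption made for this sketch; the report stores
    voiceprints in files and does not state how they are built.
    """
    voiceprints[user] = np.mean(np.asarray(feature_vectors, dtype=float), axis=0)

def identify(voiceprints, features):
    """Search: return the enrolled user whose voiceprint is closest to `features`."""
    features = np.asarray(features, dtype=float)
    return min(voiceprints, key=lambda u: np.linalg.norm(voiceprints[u] - features))

# Toy two-dimensional feature vectors, made up for illustration only.
voiceprints = {}
enroll(voiceprints, "alice", [[1.0, 0.2], [1.2, 0.1]])
enroll(voiceprints, "bob", [[0.1, 0.9], [0.0, 1.1]])
match = identify(voiceprints, [1.1, 0.2])
```

In a robot, the voiceprints would persist between sessions, and a distance threshold could be added so that unknown speakers are rejected rather than forced onto the nearest enrolled user.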

The audio samples are characterized by features in three different signal domains: time, frequency, and time-frequency. Combining these three domains significantly increases the accuracy of user identification compared with existing techniques. Several tests on an independent user voice database show that as little as half a second of speech is enough to identify the speaker. The recognition is text-independent: users do not need to say a specific sentence (a key-pass) to be identified by the robot.
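As a rough illustration of describing half a second of audio in the three domains, the sketch below computes one or two common features per domain with NumPy. The specific features (RMS energy, zero-crossing rate, spectral centroid, and mean spectral flux over a short-time Fourier transform) are assumptions chosen for the example; the description does not list the features actually used, and `extract_features` is a hypothetical name.

```python
import numpy as np

def extract_features(signal, sample_rate, frame_len=512, hop=256):
    """Return [rms, zero_crossing_rate, spectral_centroid_hz, spectral_flux].

    Assumes `signal` is a mono array at least two frames long.
    """
    signal = np.asarray(signal, dtype=float)

    # Time domain: overall energy and how often the waveform changes sign.
    rms = np.sqrt(np.mean(signal ** 2))
    zcr = np.mean(np.abs(np.diff(np.sign(signal))) > 0)

    # Frequency domain: magnitude-weighted mean frequency of the whole clip.
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)

    # Time-frequency domain: windowed frames, then the average change
    # between consecutive magnitude spectra (spectral flux).
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.array([signal[i:i + frame_len] * window for i in starts])
    stft = np.abs(np.fft.rfft(frames, axis=1))
    flux = np.mean(np.sqrt(np.sum(np.diff(stft, axis=0) ** 2, axis=1)))

    return np.array([rms, zcr, centroid, flux])

# Half a second of a synthetic 440 Hz tone at 16 kHz, standing in for speech.
sample_rate = 16000
t = np.arange(sample_rate // 2) / sample_rate
features = extract_features(np.sin(2 * np.pi * 440 * t), sample_rate)
```

A feature vector of this kind is what an enrollment step would store and a search step would compare; because the features have very different scales, they would normally be normalized before any distance is computed.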
