HRI

Speaker localization among multi-faces in noisy environment by audio-visual integration

Hyun-Don Kim, JongSuk Choi, Munsang Kim

Year: 2006
Citations: 14

Abstract

In this paper, we developed not only a reliable sound localization system that uses three microphones and includes a VAD (voice activity detection) component, but also a face tracking system that uses a vision camera. Moreover, we proposed a way to integrate these systems in human-robot interaction to compensate for errors in localizing a speaker and to effectively reject unnecessary speech or noise signals arriving from undesired directions. To verify the system's performance, we installed the proposed audition and vision system on a prototype robot called IROBAA (Intelligent ROBot for Active Audition) and showed how to integrate an audio-visual system.
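As a rough illustration of the kind of audio-visual integration the abstract describes, the sketch below matches a sound direction against tracked face directions and rejects sounds that do not line up with any face. It is not the authors' method: the function name `select_speaker`, the azimuth-in-degrees representation, and the 15-degree tolerance are all assumptions made for this example.

```python
def select_speaker(sound_azimuth_deg, face_azimuths_deg, tolerance_deg=15.0):
    """Pick the tracked face that best matches a localized sound.

    Hypothetical helper: returns the index of the face whose azimuth is
    closest to the sound azimuth, or None when no face lies within
    tolerance_deg (the sound is then treated as noise or speech from an
    undesired direction). The tolerance value is an assumed parameter.
    """
    best_index = None
    best_error = tolerance_deg
    for i, face_azimuth in enumerate(face_azimuths_deg):
        # Wrap the angular difference into [-180, 180) before taking its size.
        error = abs((sound_azimuth_deg - face_azimuth + 180.0) % 360.0 - 180.0)
        if error <= best_error:
            best_index = i
            best_error = error
    return best_index


# Three faces are tracked; the sound arrives from roughly 10 degrees.
faces = [-30.0, 12.0, 40.0]
speaker = select_speaker(10.0, faces)   # matches the face at 12 degrees
rejected = select_speaker(90.0, faces)  # no face nearby, so None
```

Gating on the angular error is one simple way to let the vision estimate correct the audio estimate while discarding sounds with no visible source; a real system would also need the VAD decision and per-sensor uncertainty, which this sketch leaves out.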

Keywords

Computer science, Computer vision, Artificial intelligence, Robot, Component (thermodynamics), Face (sociological concept), Noise (video), Acoustic source localization, Audio visual, Speech recognition
