HRI

Speaker localization among multi-faces in noisy environment by audio-visual integration

Hyun-Don Kim, JongSuk Choi, Munsang Kim

Year: 2006
Citations: 14

Abstract

In this paper, we developed both a reliable sound localization system, which uses three microphones and includes a voice activity detection (VAD) component, and a face tracking system that uses a vision camera. We also proposed a way to integrate these systems for human-robot interaction, both to compensate for errors in speaker localization and to effectively reject unnecessary speech or noise signals arriving from undesired directions. To verify the system's performance, we installed the proposed audition and vision systems on a prototype robot called IROBAA (Intelligent ROBot for Active Audition) and showed how to integrate an audio-visual system.

Keywords

Computer science, Computer vision, Artificial intelligence, Robot, Component (thermodynamics), Face (sociological concept), Noise (video), Acoustic source localization, Audio visual, Speech recognition
