Probabilistic integration of audiovisual information to localize sound source in human-robot interaction

B. Chen, M. Meguro, M. Kaneko

Publication year
2004
Citations
7

Abstract

This paper proposes a method for estimating the position of a sound source in human-robot interaction by fusing auditory and visual information with a Bayesian network. We first integrate multi-channel audio signals with a depth image of the environment to generate a likelihood map for sound source localization. However, this integration, denoted "MICs", does not always locate the sound source correctly. To correct such localization failures, we integrate the likelihood values generated by "MICs" with the skin-color distribution in an image, according to the result of classifying the audio signal into speech and non-speech categories. The audio classifier is based on a support vector machine (SVM), and the skin-color distribution is modeled with a Gaussian mixture model (GMM). Given the evidence from MICs, the SVM, and the GMM, we use the trained Bayesian network to infer whether each pixel in the image corresponds to the sound source. Finally, experimental results are presented to show the effectiveness of the proposed method.
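The fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' trained Bayesian network: the conditional probabilities below are hand-set assumptions, and the names `fuse_pixel`, `fuse_map`, and `best_pixel` are hypothetical. It assumes each pixel already has a MICs likelihood and a skin-color probability in [0, 1], and that the audio classifier outputs a single speech probability.

```python
# Toy per-pixel fusion of three kinds of evidence (illustrative assumptions only):
#   mic_like    - likelihood from the audio/depth integration ("MICs")
#   skin_prob   - skin-colour probability (e.g. from a GMM)
#   speech_prob - probability that the audio is speech (e.g. from an SVM)


def fuse_pixel(mic_like, skin_prob, speech_prob, prior=0.5):
    """Posterior that one pixel is the sound source under a hand-set model."""
    # Skin colour only counts as evidence to the extent the audio is speech;
    # for non-speech audio it falls back to an uninformative 0.5.
    skin_if_source = speech_prob * skin_prob + (1.0 - speech_prob) * 0.5
    skin_if_not = speech_prob * (1.0 - skin_prob) + (1.0 - speech_prob) * 0.5
    source = prior * mic_like * skin_if_source
    not_source = (1.0 - prior) * (1.0 - mic_like) * skin_if_not
    total = source + not_source
    # If the evidence is contradictory (both terms zero), return the prior.
    return prior if total == 0.0 else source / total


def fuse_map(mic_map, skin_map, speech_prob, prior=0.5):
    """Apply fuse_pixel to every pixel of two equally sized 2-D lists."""
    return [
        [fuse_pixel(m, s, speech_prob, prior) for m, s in zip(mic_row, skin_row)]
        for mic_row, skin_row in zip(mic_map, skin_map)
    ]


def best_pixel(post_map):
    """Return (row, col) of the pixel with the highest posterior."""
    coords = ((r, c) for r, row in enumerate(post_map) for c in range(len(row)))
    return max(coords, key=lambda rc: post_map[rc[0]][rc[1]])


# Made-up 2x2 example: MICs alone slightly prefers pixel (0, 0),
# but pixel (0, 1) is strongly skin-coloured.
mic_map = [[0.6, 0.5], [0.2, 0.1]]
skin_map = [[0.1, 0.9], [0.5, 0.5]]

speech_choice = best_pixel(fuse_map(mic_map, skin_map, speech_prob=1.0))
noise_choice = best_pixel(fuse_map(mic_map, skin_map, speech_prob=0.0))
```

In this toy setting, skin color shifts the estimate to pixel (0, 1) when the audio is classified as speech, while for non-speech audio the result follows the MICs likelihood alone. A trained network would learn these conditional probabilities from data rather than fixing them by hand.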

Keywords

Computer science, Artificial intelligence, Acoustic source localization, Probabilistic logic, Support vector machine, Audio signal, Speech recognition, Formant, Pattern recognition (psychology), Computer vision
