Sound Source Separation using a microphone array installed around the top of the Taxai's head
Automatic Speech Recognition for a Mixture of 11 People's Simultaneous Utterances by HRI-JP's HEARBO
We are also studying new, practical methods for sound source separation from several angles.
Spot beamforming
Sound source separation based on microphone array processing uses the direction of each sound as the cue for separation. As a result, multiple sound sources that lie in the same direction are difficult to separate. We are developing a method that addresses this by combining multiple microphone arrays. The technique has the further advantage that the arrays do not need to be accurately synchronized with one another.
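The geometric idea can be illustrated with a small sketch. It is not the method under development, only an idealized picture under stated assumptions: each array is reduced to a single position in the plane, its beam is modeled as a Gaussian in bearing angle, and the per-array gains are simply multiplied. The names `directional_gain`, `spot_gain`, and the 10-degree beam width are hypothetical and chosen for illustration.

```python
import math

def bearing(array_pos, point):
    """Direction (radians) from an array position to a point in the plane."""
    return math.atan2(point[1] - array_pos[1], point[0] - array_pos[0])

def directional_gain(array_pos, focus, source, beamwidth=math.radians(10)):
    """Idealized beam: Gaussian in the bearing difference between focus and source.

    The Gaussian shape and the 10-degree width are assumptions for illustration.
    """
    diff = bearing(array_pos, source) - bearing(array_pos, focus)
    diff = math.atan2(math.sin(diff), math.cos(diff))  # wrap to [-pi, pi]
    return math.exp(-0.5 * (diff / beamwidth) ** 2)

def spot_gain(array_positions, focus, source):
    """Product of per-array gains: high only where all beams intersect."""
    gain = 1.0
    for pos in array_positions:
        gain *= directional_gain(pos, focus, source)
    return gain

array_a = (0.0, 0.0)
array_b = (4.0, 0.0)
target = (2.0, 2.0)
interferer = (3.0, 3.0)  # lies in the same direction as the target from array A

gain_a_only = directional_gain(array_a, target, interferer)
gain_spot = spot_gain([array_a, array_b], target, interferer)
print(f"array A alone: {gain_a_only:.3f}, both arrays: {gain_spot:.3f}")
```

Seen from array A alone, the interferer is indistinguishable from the target because both share the same bearing. Once the second array is included, the interferer falls outside the intersection of the two beams and is strongly attenuated, while the target itself keeps unit gain.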
Surface sound source separation
Most acoustic signal processing, including microphone array processing, models each sound source as a point. Some sources are better modeled as a surface: at an outdoor concert, for example, separating the music from the surrounding noise (or the noise from the music) calls for a surface sound source model rather than a point source model. We propose a method that separates surface sound sources at low computational cost by combining multiple point-source beamformers.
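As a rough illustration of covering a surface with several point-source beamformers, the sketch below uses narrowband delay-and-sum beamformers focused on sample points along a line-shaped "stage". This is a minimal sketch under assumptions, not the proposed method: the delay-and-sum design, the rule of taking the largest response over the focus points, the array geometry, and the 2 kHz test frequency are all illustrative choices.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def steering_vector(mics, point, freq):
    """Unit-magnitude phase terms for propagation from a point to each microphone."""
    return [cmath.exp(-2j * math.pi * freq * math.dist(m, point) / SPEED_OF_SOUND)
            for m in mics]

def point_response(mics, focus, source, freq):
    """Magnitude response of a delay-and-sum beamformer focused on `focus`."""
    w = steering_vector(mics, focus, freq)
    a = steering_vector(mics, source, freq)
    return abs(sum(wi.conjugate() * ai for wi, ai in zip(w, a))) / len(mics)

def surface_response(mics, focus_points, source, freq):
    """Combine point beamformers by taking the largest response (an assumption)."""
    return max(point_response(mics, p, source, freq) for p in focus_points)

# Hypothetical geometry: 8-microphone line array with 5 cm spacing at y = 0,
# and a 2 m wide stage at y = 3 m sampled at five focus points.
mics = [(-0.175 + 0.05 * i, 0.0) for i in range(8)]
stage_points = [(-1.0 + 0.5 * k, 3.0) for k in range(5)]
freq = 2000.0  # Hz

stage_edge = (1.0, 3.0)   # part of the surface source
noise = (-3.0, 1.0)       # off-stage noise source

single_edge = point_response(mics, (0.0, 3.0), stage_edge, freq)
surface_edge = surface_response(mics, stage_points, stage_edge, freq)
surface_noise = surface_response(mics, stage_points, noise, freq)
print(f"single beam at stage edge: {single_edge:.2f}")
print(f"surface beam at stage edge: {surface_edge:.2f}")
print(f"surface beam at noise: {surface_noise:.2f}")
```

A single beamformer aimed at the stage center attenuates sound from the stage edge, whereas the combined response stays close to one across the sampled surface and remains low for the off-stage source. The cost grows only linearly with the number of focus points, since each point beamformer is a weighted sum over the microphones.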
Unified framework of sound source localization, separation, and classification
In robot audition and auditory scene analysis, localization, separation, and classification are commonly chained in a cascade. The drawback of this design is that errors from each stage accumulate and degrade the performance of the overall system. To address this, we propose a method that integrates these functions end to end with deep learning.
User-friendly informed source separation
When separating musical acoustic signals, side information beyond the audio itself, such as a musical score, can be used, as reported in the literature. However, such information is often expensive to obtain because it must be prepared manually. We propose a method that improves separation performance simply by having the user input the onset timing of a subset of the notes.