IoT and Embedded Devices

IoT and embedded devices are promising applications of robot hearing technology. For embedded devices, we conducted research in collaboration with HRI-JP, aiming at applications in ICT and mobility. For example, HRI-JP has developed a tablet device with a microphone array and a car navigation system with a multi-person interaction function using an ultra-thin processor board from System In Frontier, Inc. In addition, a start-up company (Hyable, Inc.) provides a discussion analysis service that uses robot hearing technology as a cloud service.

 

In our laboratory, we are currently working on two main topics. One is the calibration of microphones or microphone arrays used as IoT devices, which is a form of sensor network research. In general, integrating multiple microphones or microphone arrays requires information such as the position and orientation of each array. This research aims to calibrate such information automatically, without manual measurement.
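To make the notion of array orientation concrete, the following is a minimal sketch under strong simplifying assumptions: a 2D setting, a known array position, and reference sound sources at known positions. It is not the laboratory's calibration method, which aims to avoid such manual measurements. The function name `estimate_orientation` and its parameters are hypothetical. The sketch recovers the array's heading as the circular mean of the differences between each source's global bearing and the direction of arrival measured in the array's own frame.

```python
import math


def estimate_orientation(array_xy, sources_xy, local_doas):
    """Estimate a 2D array's heading (radians) in the global frame.

    array_xy   : (x, y) position of the array, assumed known here
    sources_xy : list of (x, y) positions of reference sources, assumed known
    local_doas : direction of arrival of each source (radians),
                 measured in the array's own coordinate frame
    """
    sin_sum = 0.0
    cos_sum = 0.0
    for (sx, sy), doa in zip(sources_xy, local_doas):
        # Bearing of the source as seen from the array, in the global frame.
        bearing = math.atan2(sy - array_xy[1], sx - array_xy[0])
        # The heading is the offset between the global and local angles.
        diff = bearing - doa
        # Accumulate on the unit circle so angle wraparound is handled.
        sin_sum += math.sin(diff)
        cos_sum += math.cos(diff)
    return math.atan2(sin_sum, cos_sum)
```

Averaging on the unit circle rather than averaging raw angles keeps the estimate stable when individual differences straddle the ±π boundary.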

The other is a sound processing framework for MEC (Multi-access Edge Computing), designed with the spread of 5G in mind. Because MEC allows part of the processing in cloud services to be executed near IoT devices, processing latency can be expected to decrease. However, whether each process should run on the IoT device, the MEC device, or the cloud depends largely on the application. We are therefore aiming to build a framework that lets us freely select which device executes each function of robot audition according to the application, which provides a high degree of freedom in system design.
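The placement trade-off described above can be illustrated with a small sketch. It is not the framework itself: the stage names, the tier names, and every latency figure below are hypothetical placeholders chosen for illustration, not measurements. The sketch assigns each stage of a pipeline (localization, separation, recognition) to the IoT device, the MEC device, or the cloud, and searches for the assignment with the lowest end-to-end latency.

```python
from itertools import product

TIERS = ("device", "mec", "cloud")

# Hypothetical compute time (ms) of each stage on each tier.
# Placeholder values for illustration only, not measurements.
COMPUTE_MS = {
    "localization": {"device": 20.0, "mec": 8.0, "cloud": 4.0},
    "separation": {"device": 40.0, "mec": 15.0, "cloud": 6.0},
    "recognition": {"device": 200.0, "mec": 60.0, "cloud": 30.0},
}

# Hypothetical one-way transfer time (ms) between tiers.
LINK_MS = {
    ("device", "mec"): 5.0,
    ("mec", "cloud"): 40.0,
    ("device", "cloud"): 45.0,
}


def link_cost(src, dst):
    """Transfer time between two tiers (zero when they are the same)."""
    if src == dst:
        return 0.0
    if (src, dst) in LINK_MS:
        return LINK_MS[(src, dst)]
    return LINK_MS[(dst, src)]


def total_latency(placement):
    """End-to-end latency for a mapping of stage name to tier name."""
    latency = 0.0
    current = "device"  # audio is captured on the IoT device
    for stage in COMPUTE_MS:
        tier = placement[stage]
        latency += link_cost(current, tier) + COMPUTE_MS[stage][tier]
        current = tier
    # The result is returned to the IoT device at the end.
    return latency + link_cost(current, "device")


def best_placement(allowed=None):
    """Exhaustively search placements; `allowed` can restrict tiers per stage."""
    allowed = allowed or {}
    stages = list(COMPUTE_MS)
    options = [allowed.get(stage, TIERS) for stage in stages]
    candidates = (dict(zip(stages, combo)) for combo in product(*options))
    return min(candidates, key=total_latency)
```

With these placeholder numbers, `best_placement()` puts every stage on the MEC device, while `best_placement({"recognition": ("cloud",)})` models an application that requires recognition to run in the cloud. Exhaustive search is used only because three stages and three tiers give 27 candidates; a larger pipeline would need a smarter search.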

 

Flexible sound processing IoT framework for multi-access edge computing

Demonstration of a tablet with a microphone array (with HRI-JP)
The ultra-thin processor is embedded in the tablet case together with 8 microphones. With the help of HARK, it can estimate sound source directions and separate sound sources to improve automatic speech recognition. This can be used as a tool to support hearing-impaired people and multilingual communication.

 

Demonstration of an in-vehicle information system (with HRI-JP)

A typical car navigation system with voice recognition always requires the user to press a button or utter a wake-up word before giving a command. When the button is pressed or the wake-up word is uttered, the music is muted, the air conditioner airflow is reduced, and a beep sounds. If the user starts speaking before this beep, speech recognition will not work properly. With HARK, however, speech recognition is robust even with loud music and strong airflow. In addition to making the system button-less, HARK lets it distinguish and recognize speech from both the driver’s seat and the passenger’s seat.

Publications
