In this talk, I review current speech enhancement technologies based on higher-order statistics. First, I describe blind speech extraction (BSE) and its recent extensions. BSE estimates the original speech signal from the observed signals alone, without a priori information. I explain the basic principle of BSE, which is based on noise estimation by independent component analysis (ICA), and present recent extensions that combine ICA with nonlinear noise reduction, together with a demonstration of our recently developed real-time BSE hardware. Next, to mitigate the artifact known as "musical noise", which arises in nonlinear signal processing such as BSE, I introduce a new mathematical metric of musical noise generation based on higher-order statistics. This metric is motivated by new findings that the amount of musical noise is highly correlated with the change in the 4th-order statistics of the signal. Finally, I show some applications in which this metric has been used successfully in speech enhancement research.
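As a rough illustration of what "a change of the 4th-order statistics" can mean in practice, the sketch below compares the kurtosis of a noise signal before and after a toy nonlinear operation. It is not the metric presented in the talk: the function names, the synthetic data, and the spectral-subtraction-like processing are all assumptions chosen only to make the idea concrete.

```python
import numpy as np


def kurtosis(x):
    """Non-excess kurtosis: 4th central moment divided by squared variance."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    m4 = np.mean(centered ** 4)
    return m4 / m2 ** 2


def kurtosis_ratio(before, after):
    """Hypothetical helper: kurtosis after processing relative to before."""
    return kurtosis(after) / kurtosis(before)


rng = np.random.default_rng(0)

# Synthetic stand-in for noise power values in one frequency band
# (sum of two squared Gaussians; an assumption, not real speech data).
noise_power = rng.standard_normal(100_000) ** 2 + rng.standard_normal(100_000) ** 2

# Toy nonlinear processing: subtract a scaled mean noise estimate and
# floor the result at zero, loosely in the spirit of spectral subtraction.
processed = np.maximum(noise_power - 2.0 * noise_power.mean(), 0.0)

ratio = kurtosis_ratio(noise_power, processed)
print(f"kurtosis ratio after/before: {ratio:.2f}")
```

In this toy setup the ratio comes out above 1: flooring leaves a few isolated large components among many zeros, which makes the distribution heavier-tailed than before processing.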
Speaker: Hiroshi Saruwatari (Nara Institute of Science and Technology, Japan)
When and where?
Tuesday, Sep 21, 2010, Bochum (2.00 pm, ID 04/401, Fakultät für Elektrotechnik und Informationstechnik, Ruhr-Universität Bochum)