Audio samples from the paper "Neural Multi-Channel and Multi-Microphone Acoustic Echo Cancellation"

Authors: Chenggang Zhang, Jinjiang Liu, Hao Li and Xueliang Zhang

Multi-Channel Multi-Microphone (MCMM) AEC results at SER = -5 dB.
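
For readers unfamiliar with the SER figure above, the following is a minimal sketch of how a microphone mixture at a given signal-to-echo ratio could be constructed. It assumes the usual definition, SER = 10 log10(P_near / P_echo), measured at the microphone; the paper's exact mixing procedure is not given on this page. The function names and toy sinusoid signals are hypothetical and serve only as illustration.

```python
import math

def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def ser_db(near_end, echo):
    """Signal-to-echo ratio in dB: 10 * log10(P_near / P_echo)."""
    return 10.0 * math.log10(power(near_end) / power(echo))

def scale_echo_to_ser(near_end, echo, target_ser_db):
    """Return a scaled copy of `echo` so that the pair has the target SER."""
    target_ratio = 10.0 ** (target_ser_db / 10.0)
    gain = math.sqrt(power(near_end) / (power(echo) * target_ratio))
    return [gain * v for v in echo]

# Toy stand-ins for real audio (assumption: any two non-silent signals work).
near = [math.sin(0.05 * n) for n in range(1000)]
echo = [0.3 * math.sin(0.11 * n + 1.0) for n in range(1000)]

scaled_echo = scale_echo_to_ser(near, echo, -5.0)
# Microphone signal = near-end signal + echo at the requested SER.
mic = [s + d for s, d in zip(near, scaled_echo)]
```

At SER = -5 dB the echo carries roughly three times the power of the near-end signal, which is why the reference microphone samples are dominated by the far-end content.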

Simulated room scenario

Signals	Sample 1	Sample 2	Sample 3	Sample 4
Reference microphone signal


Target signal


Yang [1]


Cheng et al. [2]


Zhang [3]


ICRN-S


ICRN

Near-end signal is speech, while far-end signal is music

Signals	Sample 1	Sample 2	Sample 3	Sample 4
Reference microphone signal


Target signal


Yang [1]


Cheng et al. [2]


Zhang [3]


ICRN-S


ICRN

Near-end signal is music, while far-end signal is speech

Signals	Sample 1	Sample 2	Sample 3	Sample 4
Reference microphone signal


Target signal


Yang [1]


Cheng et al. [2]


Zhang [3]


ICRN-S


ICRN

Real stereo recording made with a laptop equipped with two loudspeakers and two microphones.
The recording covers single-talk (the reference signal is a stereo song), double-talk (four utterances by the near-end speaker), and echo-path-change scenarios.

Signals	Sample
Reference microphone signal


ICRN-S estimated signal

References

[1] Feiran Yang, Ming Wu, and Jun Yang. "Stereophonic acoustic echo suppression based on Wiener filter in the short-time Fourier transform domain." IEEE Signal Processing Letters 19.4 (2012): 227-230.
[2] Linjuan Cheng, et al. "Deep learning-based stereophonic acoustic echo suppression without decorrelation." The Journal of the Acoustical Society of America 150.2 (2021): 816-829.
[3] Hao Zhang and DeLiang Wang. "A Deep Learning Approach to Multi-Channel and Multi-Microphone Acoustic Echo Cancellation." Proc. Interspeech 2021 (2021): 1139-1143.
