Convolved Common Audio Signal Extraction

These sound examples accompany the submitted article with the above title.

The examples are a selection from the full set of multichannel mixtures. The goal is to separate the music and effects (mfx) contribution from the dialog contribution in movie soundtracks.
This problem can be seen as a common signal extraction task, where the common signal is the mfx contribution.
Our previous method, based on geometric common signal extraction, gave conclusive results under the hypothesis that the music and effects tracks are exactly the same in the different versions.
The method proposed in the present article addresses a more realistic case by handling mfx tracks that differ in equalization (that is, tracks that have been filtered differently).
We present here the results described in the article, for the convolved case only. The mfx track of each version was filtered individually before being mixed with the corresponding dialog track.
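The way such convolved mixtures are built can be illustrated with a minimal sketch. It is not the code used for the article: the function name `make_convolved_mixtures`, the short FIR filters and the random signals standing in for audio are all assumptions made for illustration. Each version's mixture is the common mfx passed through that version's own filter, plus that version's dialog.

```python
import numpy as np

def make_convolved_mixtures(mfx, dialogs, filters):
    """Return one mixture per version: filtered common mfx plus that version's dialog."""
    mixtures = []
    for dialog, h in zip(dialogs, filters):
        # Filter the shared mfx with this version's impulse response,
        # then truncate to the original length.
        mfx_n = np.convolve(mfx, h)[: len(mfx)]
        mixtures.append(mfx_n + dialog)
    return np.stack(mixtures)

rng = np.random.default_rng(0)
n_versions, n_samples = 3, 16000

# Random noise stands in for real audio (assumption for illustration only).
mfx = rng.standard_normal(n_samples)
dialogs = rng.standard_normal((n_versions, n_samples))

# Hypothetical short FIR filters, one per version; the first is the identity.
filters = [
    np.array([1.0]),
    np.array([0.8, 0.2]),
    np.array([0.6, 0.3, 0.1]),
]

mixes = make_convolved_mixtures(mfx, dialogs, filters)
print(mixes.shape)
```

With identical filters for every version, this reduces to the exact-copy hypothesis of the earlier method; with differing filters, it gives the convolved case illustrated by the examples on this page.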
We present the results for N=3 and N=5 versions. Note that the N-SP-SUB method estimates only a single common mfx track, whereas CCNMF estimates one mfx track per version.

N=5 channels

Original   mfx1        mfx2   mfx3   mfx4   mfx5   dialog1   dialog2   dialog3   dialog4   dialog5
Mixed      -           -      -      -      -      Mix 1     Mix 2     Mix 3     Mix 4     Mix 5
Results:
N-SP-SUB   common mfx                              dialog1   dialog2   dialog3   dialog4   dialog5
CCNMF      mfx1        mfx2   mfx3   mfx4   mfx5   dialog1   dialog2   dialog3   dialog4   dialog5

N=3 channels

Original   mfx1        mfx2   mfx3   dialog1   dialog2   dialog3
Mixed      -           -      -      Mix 1     Mix 2     Mix 3
Results:
N-SP-SUB   common mfx                dialog1   dialog2   dialog3
CCNMF      mfx1        mfx2   mfx3   dialog1   dialog2   dialog3
