Submitted to the ICASSP 2011 conference
This paper concerns the adaptation of spectrum dictionaries in audio source separation with supervised learning. Assuming that samples of the audio sources to be separated are available, a filter adaptation in the frequency domain is proposed in the context of Non-Negative Matrix Factorization (NMF) with the Itakura-Saito divergence. The algorithm retrieves the acoustical filter applied to the sources with good accuracy, and achieves significantly higher performance on separation tasks than the non-adaptive model.
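As a point of reference for the non-adaptive baseline mentioned above, the following is a minimal sketch of supervised NMF under the Itakura-Saito divergence: a fixed dictionary `W` (learned beforehand from training samples) and activations `H` estimated with the standard multiplicative update. It does not include the filter adaptation proposed in the paper. The function names, the iteration count and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def is_divergence(V, V_hat, eps=1e-12):
    """Itakura-Saito divergence summed over all time-frequency bins."""
    R = (V + eps) / (V_hat + eps)
    return float(np.sum(R - np.log(R) - 1.0))

def fit_activations(V, W, n_iter=100, seed=0, eps=1e-12):
    """Estimate H >= 0 such that V ~ W @ H, keeping the dictionary W fixed.

    Uses the standard multiplicative update for the IS divergence.
    This is the non-adaptive model only (no filter is estimated).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat) + eps)
    return H

# Synthetic power spectrogram (hypothetical sizes: 64 bins, 5 atoms, 40 frames).
rng = np.random.default_rng(1)
W = rng.random((64, 5)) + 0.1
H_true = rng.random((5, 40)) + 0.1
V = W @ H_true

H = fit_activations(V, W)
print("IS divergence after fitting:", is_divergence(V, W @ H))
```

In a separation setting, `W` would be the concatenation of the dictionaries of both sources, and each source would be reconstructed from its own block of `W` and `H`, typically through Wiener-like masking of the mixture spectrogram.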
Experiments and Results
We choose two different classes of instruments to test our approach: two polyphonic instruments, piano and guitar, and one monophonic instrument, bass. (Bass can be polyphonic, but we only address its monophonic usage here.)
The tracks come from real multi-track recordings, so the instruments are expected to play in synchrony and in harmony. The training signal for each source is built from samples of the RWC database [1], and consists of a concatenation of the whole range of notes of a single instrument per source.
The test data is taken from a commercial recording from which the separated tracks have been made available.
We generate a two-source 0 dB mono mixture which contains the source to separate and another available source (drums).
This leads to three different tests:
- Piano test. Source 1: piano, source 2: drums.
- Guitar test. Source 1: guitar, source 2: drums.
- Bass test. Source 1: bass, source 2: drums.
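The construction of the 0 dB mixtures used in these tests can be sketched as follows. This is a minimal illustration that assumes "0 dB" means the two sources have equal average power in the mixture; the function name `mix_at_snr` and the synthetic signals are hypothetical and are not taken from the paper.

```python
import numpy as np

def mix_at_snr(target, interferer, snr_db=0.0):
    """Mix two mono signals so that the target-to-interferer power ratio is snr_db.

    Returns the mixture together with the (possibly rescaled) source images.
    """
    n = min(len(target), len(interferer))
    s1 = np.asarray(target[:n], dtype=float)
    s2 = np.asarray(interferer[:n], dtype=float)
    p1 = np.mean(s1 ** 2)
    p2 = np.mean(s2 ** 2)
    # Gain applied to the interferer to reach the requested power ratio.
    gain = np.sqrt(p1 / (p2 * 10.0 ** (snr_db / 10.0)))
    s2_scaled = gain * s2
    return s1 + s2_scaled, s1, s2_scaled

# Stand-in signals: a sine for the instrument, noise for the drums (assumption).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
instrument = np.sin(2 * np.pi * 220.0 * t)
drums = 0.3 * rng.standard_normal(20000)

mixture, src1, src2 = mix_at_snr(instrument, drums, snr_db=0.0)
ratio_db = 10.0 * np.log10(np.mean(src1 ** 2) / np.mean(src2 ** 2))
print("power ratio in dB:", ratio_db)
```

With real recordings, `instrument` and `drums` would be the mono tracks of the multi-track recording, loaded at the same sampling rate.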
We compare the separation results obtained with and without filter adaptation. These examples are the ones with the higher source-to-distortion ratio (SDR), corresponding to Table 2 in the article.
| | Original Mixture | Extracted source w/ filter adaptation | Extracted source w/o filter adaptation |
| Piano Test | mix | with filter | without filter |
| Guitar Test | mix | with filter | without filter |
| Bass Test | mix | with filter | without filter |
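To give a rough idea of how the extracted sources can be compared numerically, the sketch below computes a simplified source-to-distortion ratio as the energy of the reference source over the energy of the estimation error. It assumes time-aligned signals of equal length, and it is not necessarily the SDR definition used for the table in the article, which may rely on a more elaborate decomposition of the error. The function name `simple_sdr` is hypothetical.

```python
import numpy as np

def simple_sdr(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: reference energy over error energy.

    Assumes both signals are time-aligned and have the same length.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = estimate - reference
    return 10.0 * np.log10((np.sum(reference ** 2) + eps) / (np.sum(error ** 2) + eps))

# Toy check: an estimate that is 10% too loud has an error 20 dB below the reference.
rng = np.random.default_rng(0)
ref = rng.standard_normal(8000)
est = 1.1 * ref
sdr_value = simple_sdr(ref, est)
print("simplified SDR in dB:", sdr_value)
```

A higher value indicates an estimate closer to the reference track, which is the sense in which the adaptive and non-adaptive outputs are compared.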
[1] M. Goto, H. Hashiguchi, T. Nishimura, and R. Oka, “RWC music database: Music genre database and musical instrument sound database,” in Proc. International Conference on Music Information Retrieval (ISMIR), Baltimore, USA, 2003.