



11 November 2018

M2 Internship (6 months) - Optimal signal representations for music source separation

Category: Intern

Optimal signal representations for music source separation

Team / Laboratory: SIMOB - IBISC (EA 4526) - Univ. Paris-Saclay (UEVE)
Contact: dominique.fourer@univ-evry.fr
Salary and perspectives: According to background and experience (a minimum of 577.50 euros/month). Possibility to continue with a 3-year funded PhD contract with international research partners.

Abstract: The success of artificial intelligence (AI) generally depends on the representation of the input data. This is probably because different representations can disentangle or hide the information useful for a given task. Here, we propose to target the source separation problem, which aims at recovering the original signals (or sources) that compose an observed mixture. State-of-the-art audio methods empirically compute a time-frequency representation of the mixture. However, experiments presented in [3] show that this is often not the best choice for efficiently segregating the sources present in the mixture. Hence, this internship focuses on optimizing the signal transformation to obtain the best separation results. The main motivation is to improve the results of a state-of-the-art source separation method and to better understand how to compute optimal data representations for a given task.
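To illustrate the time-frequency masking principle behind methods such as [6], the following sketch (assuming numpy and scipy are available; all signal and parameter choices here are illustrative, not the method studied in the internship) separates two synthetic tones from their mixture using an oracle binary mask. The representation parameters, here an STFT with a fixed window length, are exactly the kind of design choice whose impact on separation quality the internship proposes to study.

```python
import numpy as np
from scipy.signal import stft, istft

fs = 8000
t = np.arange(fs) / fs
# Two synthetic "sources": a low tone and a high tone (illustrative only)
s1 = np.sin(2 * np.pi * 440 * t)
s2 = np.sin(2 * np.pi * 1760 * t)
mix = s1 + s2

# Time-frequency representation of the mixture (STFT, window length 512)
_, _, X = stft(mix, fs=fs, nperseg=512)

# Oracle source spectrograms (available here because the sources are known)
_, _, S1 = stft(s1, fs=fs, nperseg=512)
_, _, S2 = stft(s2, fs=fs, nperseg=512)

# Ideal binary mask: assign each time-frequency bin to the dominant source
mask1 = np.abs(S1) >= np.abs(S2)
_, est1 = istft(X * mask1, fs=fs, nperseg=512)
_, est2 = istft(X * ~mask1, fs=fs, nperseg=512)

def snr_db(ref, est):
    """Simple reconstruction SNR in dB (a crude stand-in for the
    BSS-eval metrics of [5])."""
    n = min(len(ref), len(est))
    ref, est = ref[:n], est[:n]
    return 10 * np.log10(np.sum(ref**2) / np.sum((ref - est)**2))
```

With two well-separated pure tones the mask is nearly perfect and the reconstruction SNR is high; with real music, sources overlap in the time-frequency plane, and the quality of the separation depends directly on how well the chosen representation disentangles them.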

Goals: In the field of music processing, the sources often correspond to the different instrumental parts (i.e. voice, guitar, piano, etc.) used to create the mix. Obtaining the sources is of interest for many tasks, such as creating new artistic mixes, improving the audio quality, or applying effects such as karaoke. Starting from a state-of-the-art system that estimates the sources from an input mixture, this internship investigates the role of the input data representation in the resulting source separation quality.

Required profile:


[1] Yoshua Bengio, Aaron Courville, and Pascal Vincent. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), pp. 1798-1828, 2013.

[2] Rachel M. Bittner, Brian McFee, Justin Salamon, Peter Li, and Juan Pablo Bello. Deep salience representations for f0 estimation in polyphonic music. In Proc. ISMIR, pp. 63-70, 2017.

[3] Dominique Fourer and Geoffroy Peeters. Fast and adaptive blind audio source separation using recursive Levenberg-Marquardt synchrosqueezing. In Proc. IEEE ICASSP, Calgary, Canada, April 2018.

[4] Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew W. Senior, and Koray Kavukcuoglu. WaveNet: A generative model for raw audio. In Proc. SSW, 2016.

[5] E. Vincent, R. Gribonval, and C. Févotte. Performance measurement in blind audio source separation. IEEE Transactions on Audio, Speech, and Language Processing, 14(4), pp. 1462-1469, July 2006.

[6] O. Yilmaz and S. Rickard. Blind separation of speech mixtures via time-frequency masking. IEEE Transactions on Signal Processing, 52(7), pp. 1830-1847, 2004.


