Subject: Optimal transport for deep learning and deep learning for optimal transport
Supervision: Nicolas Courty and Rémi Flamary
Locations: Vannes and Nice, France
PhD position to be filled in September/October 2018
Key-words: machine learning, optimal transport, deep learning
The Wasserstein distance is a powerful tool, based on the theory of optimal transport, to compare data distributions, with wide applications in image processing, computer vision and machine learning [1]. In machine learning, it has recently found numerous applications, e.g. in domain adaptation [2] or word embedding [3]. In the context of deep learning, the Wasserstein distance has recently emerged as a powerful loss for generative models [4] and for multi-label classification [5]. Its strength stems from two properties: i) it operates on empirical data distributions in a non-parametric way; ii) the geometry of the underlying space can be leveraged to compare the distributions in a geometrically sound way. Yet, the deployment of Wasserstein distances in a wider class of applications is still limited, mainly because of a heavy computational burden. Even recent strategies based on entropic regularization [6] fail to handle large-scale datasets. Remarkably, the problem is amenable to stochastic programming thanks to its dual (and potentially regularized) formulation [7, 8]. These recent advances pave the way for a large number of applications in learning with deep networks, whenever the Wasserstein distance serves as a loss function.
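The entropic regularization mentioned above [6] admits a very compact implementation via the Sinkhorn iterations. The following is a minimal NumPy sketch (the function name, the toy histograms and the cost matrix are our own illustrative choices, not part of the project):

```python
import numpy as np

def sinkhorn(a, b, M, reg, n_iter=200):
    """Entropic optimal transport between histograms a and b.

    Alternately rescales the Gibbs kernel K = exp(-M/reg) so that the
    transport plan P = diag(u) K diag(v) matches both marginals a and b.
    """
    K = np.exp(-M / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)   # match target marginal b
        u = a / (K @ v)     # match source marginal a
    return u[:, None] * K * v[None, :]

# Toy example: two identical uniform histograms on 2 bins,
# with unit cost for moving mass between bins.
a = np.array([0.5, 0.5])
b = np.array([0.5, 0.5])
M = np.array([[0.0, 1.0], [1.0, 0.0]])
P = sinkhorn(a, b, M, reg=0.05)
cost = float((P * M).sum())  # regularized transport cost, close to 0 here
```

With a small regularization the plan concentrates on the diagonal and the cost vanishes, since no mass needs to move; the iteration cost grows quadratically with the number of bins, which is precisely the scalability issue discussed above.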
Scientific objectives and expected achievements The objective of the PhD will be to contribute in this direction, by examining simultaneously the two complementary directions of the title: optimal transport as a tool for deep learning, and deep learning as a tool to scale up optimal transport.
In the end, the contributions of the PhD student will target cutting-edge research in machine learning, and the expected outcomes will likely be published in top-tier machine learning conferences and journals. From an applied point of view, particular attention will be given to remote sensing and astronomical imaging datasets, in which the two teams specialize.
OATMIL project/supervision The PhD will be carried out in the context of the ANR OATMIL project (http://people.irisa.fr/Nicolas.Courty/OATMIL/), which provides the funding for this research. As such, the candidate will be expected to develop strong interactions with the other participants and to contribute to its success. The supervision will be ensured by Nicolas Courty and Rémi Flamary.
The research will take place in the context of a collaboration between IRISA and the Laboratoire Lagrange.
The PhD will take place both in Vannes, a beautiful medieval city of medium size close to the sea (2h30 by train from Paris), and in Nice. The final division of time between the two locations will be discussed and decided with the student during the PhD.
Technical aspects The applied part of the PhD will involve development in Python. The candidate will build upon the Python toolbox for optimal transport (POT: https://github.com/rflamary/POT), developed, among others, by members of the team. He/she will benefit from the expertise of the other members of the team, as well as from ongoing collaborations with other academic partners on this subject.
Candidate profile and application Applicants are expected to hold a degree in computer science, machine learning, signal & image processing, or applied mathematics/statistics, and to show an excellent academic record. In addition, good programming skills are expected. To apply, send a resume, along with the grades obtained during the last two years and, if possible, recommendation letters, to Nicolas Courty (firstname.lastname@example.org) and Rémi Flamary (email@example.com).
[1] G. Peyré and M. Cuturi, Computational Optimal Transport. To be published in Foundations and Trends in Machine Learning, 2018. [Online]. Available: https://optimaltransport.github.io
[2] N. Courty, R. Flamary, D. Tuia, and A. Rakotomamonjy, “Optimal transport for domain adaptation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
[3] G. Huang, C. Guo, M. Kusner, Y. Sun, F. Sha, and K. Weinberger, “Supervised word mover’s distance,” in Advances in Neural Information Processing Systems (NIPS), 2016, pp. 4862–4870.
[4] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein generative adversarial networks,” in Proceedings of the 34th International Conference on Machine Learning (ICML), vol. 70, Sydney, Australia, 06–11 Aug 2017, pp. 214–223.
[5] C. Frogner, C. Zhang, H. Mobahi, M. Araya, and T. Poggio, “Learning with a Wasserstein loss,” in Advances in Neural Information Processing Systems (NIPS), 2015.
[6] M. Cuturi, “Sinkhorn distances: Lightspeed computation of optimal transportation,” in Advances in Neural Information Processing Systems (NIPS), 2013, pp. 2292–2300.
[7] A. Genevay, M. Cuturi, G. Peyré, and F. Bach, “Stochastic optimization for large-scale optimal transport,” in Advances in Neural Information Processing Systems (NIPS), 2016, pp. 3432–3440.
[8] V. Seguy, B. Bhushan Damodaran, R. Flamary, N. Courty, A. Rolet, and M. Blondel, “Large-scale optimal transport and mapping estimation,” in International Conference on Learning Representations (ICLR), 2018.
[9] N. Courty, R. Flamary, and M. Ducoffe, “Learning Wasserstein embeddings,” in International Conference on Learning Representations (ICLR), 2018.