

Announcement

11 December 2019

Master 2 internship - Deep Reinforcement Learning for Autonomous Vehicle Control


Category: Intern


Desired start date: February/March 2020

Internship location: IBISC laboratory, Université d'Evry/Paris-Saclay

Contact: Dominique Fourer (dominique.fourer@univ-evry.fr) and Lydie Nouveliere (lydie.nouveliere@univ-evry.fr)

Salary: negotiable depending on background and experience (minimum 577.50 euros/month)

Required application documents: CV, cover letter, and M1 and M2 transcripts if available.

Subject in PDF format

 

Abstract: Deep learning [3] provides a new class of biologically inspired methods (artificial neural networks) that can outperform earlier state-of-the-art techniques, as shown by the ImageNet challenge [2], currently dominated by CNN-based methods. In parallel, algorithms have surpassed humans at complex games: chess with Deep Blue and, through reinforcement learning, Go with AlphaGo and more recently StarCraft with AlphaStar. Deep reinforcement learning [5] offers a promising framework for building strong models through a feedback loop in which the algorithm experiments and learns from its errors to optimize its decisions. Following this idea, our goal is to develop a deep reinforcement learning approach for controlling autonomous vehicles based on the information provided by their sensors.

Goals:
— Bibliographical work to identify the best state-of-the-art deep reinforcement learning solutions. A comparative study is expected, discussing the advantages, limitations, and prospects of each investigated architecture.
— Implementation of the selected approach (possibly new) in simulated and real-world scenarios.

Methodology:
Reinforcement learning is a powerful AI paradigm that requires a realistic training environment, in which an algorithm can be trained by interacting with real-world rules. The main idea is to learn from mistakes through an objective function that weighs benefits against losses and is maximized by taking the best decision in each observed state.
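One common way to formalize such an objective, given here only as an illustrative assumption since the announcement does not fix any notation, is the discounted return of a Markov decision process. The symbols below (reward r, discount factor γ in [0, 1), state s, action a, action-value function Q*) are standard textbook notation and not taken from the subject description:

```latex
G_t = \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1},
\qquad
Q^{*}(s,a) = \mathbb{E}\!\left[\, r_{t+1} + \gamma \max_{a'} Q^{*}(s_{t+1},a') \;\middle|\; s_t = s,\ a_t = a \right],
\qquad
\pi^{*}(s) = \arg\max_{a} Q^{*}(s,a)
```

Under this reading, benefits and losses enter as positive and negative rewards, and the best decision in a state is the action with the highest expected return.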
Developing an efficient interaction loop requires a suitable combination of an environment simulator, a machine learning model for decision making, and an objective function that leads the method to the optimal decision in each state. The expected contribution of this internship is the construction of a deep neural network architecture (e.g. a recurrent neural network [4]) whose input is a set of heterogeneous data from the sensors of a previously developed autonomous vehicle prototype [1].
An evaluation protocol will be proposed to objectively compare the new proposal with previous state-of-the-art methods in realistic scenarios.
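The interaction loop described above (simulator, decision model, objective function) can be sketched on a toy problem. What follows is a minimal illustration under stated assumptions, not the method expected from the internship: a hypothetical one-dimensional lane-keeping simulator (`step`), a tabular Q-learning agent standing in for the deep neural network, and an invented reward (+1 at the lane centre, -1 elsewhere, -10 for leaving the road). All names and parameter values are chosen for illustration only.

```python
import random

ACTIONS = (-1, 0, 1)   # steer left, keep straight, steer right
N_POSITIONS = 5        # discrete lateral offsets 0..4 (assumed toy road)
CENTRE = 2             # lane centre


def step(position, action_index):
    """Hypothetical simulator: returns (next_position, reward, off_road)."""
    next_position = position + ACTIONS[action_index]
    if next_position < 0 or next_position >= N_POSITIONS:
        return position, -10.0, True       # leaving the road: large loss, episode ends
    reward = 1.0 if next_position == CENTRE else -1.0
    return next_position, reward, False


def train(episodes=500, max_steps=20, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_POSITIONS)]
    for _episode in range(episodes):
        state = rng.choice([1, 2, 3])      # start somewhere on the road
        for _t in range(max_steps):
            if rng.random() < epsilon:     # explore: try a random action
                action = rng.randrange(len(ACTIONS))
            else:                          # exploit: best action known so far
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            next_state, reward, off_road = step(state, action)
            # Bootstrap from the next state unless the episode truly ended.
            target = reward if off_road else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            if off_road:
                break
            state = next_state
    return q


def greedy_policy(q):
    """Steering command with the highest learned value in each position."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda a: row[a])] for row in q]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned policy should steer back toward the lane centre and keep straight once there. A deep reinforcement learning approach would replace the table `q` with a neural network fed by sensor data, and the hand-written `step` function with a realistic driving simulator.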

Required profile:
— good knowledge of machine learning and signal processing
— mathematical understanding of the formal background
— excellent programming skills (Python, C++, CUDA)
— strong motivation, high productivity, and a methodical way of working
— an interest in AI and new technologies

References:
[1] S. Glaser, B. Vanholme, S. Mammar, D. Gruyer, and L. Nouveliere. Maneuver-based trajectory planning for highly autonomous vehicles on real road with traffic and driver interaction. IEEE Transactions on Intelligent Transportation Systems, 11(3):589–606, Sep. 2010.
[2] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
[3] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
[4] Larry R Medsker and LC Jain. Recurrent neural networks. Design and Applications, 5, 2001.
[5] Ahmad EL Sallab, Mohammed Abdou, Etienne Perot, and Senthil Yogamani. Deep reinforcement learning framework for autonomous driving. Electronic Imaging, 2017(19):70–76, 2017.


 


(c) GdR 720 ISIS - CNRS - 2011-2020.