Master II Internship offer at Institut Pascal (UMR 6602 CNRS/Université Clermont Auvergne), Clermont-Ferrand, France.
Keywords: Computer vision, deep learning, reinforcement learning, curiosity
Subject description: Object detection is a well-known computer vision task that can be learnt using supervised deep learning techniques when large datasets are available. However, this detection learning is usually separate from any knowledge of how the object can interact with the environment and with agents. In this internship, we will focus on learning to detect reachable objects through interaction, while learning to reach them.

One of the known mechanisms of learning through interaction is reinforcement learning (RL). In RL, an agent learns by interacting with its environment so as to maximize a reward signal. This type of learning incrementally builds associations between the state of the agent and its environment and the actions the agent can perform, so as to maximize the sum of rewards obtained over the long term. In recent years, the combination of reinforcement learning and deep neural networks (DRL) has led to impressive results in games and robotics when some priors are available. However, learning complex robotic tasks remains a challenge, since only very specific behaviors lead to any reward, and discovering these behaviors by exploring the consequences of random movements is extremely unlikely. This suggests that the learning process cannot rely on random exploration but must be structured intelligently.

We will consider an object-reaching task using a (simulated) robotic system consisting of a robotic arm and a binocular pan-tilt head. In our recent work, we have shown how to learn such an object-reaching task without any prior knowledge of the robot model, the camera parameters, or the object position, using DRL in a stage-wise manner: binocular fixation was learnt autonomously and then used to learn hand-eye coordination and object reaching. However, this approach required a sequential learning of each of these subtasks and a static environment.
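As a toy illustration of the RL loop sketched above (a hypothetical 1-D grid stands in for the robot, and names like `env_step` are made up for this example), here is a minimal tabular Q-learning sketch in which an agent learns to reach a goal cell from sparse reward:

```python
import random

# Minimal tabular Q-learning sketch (illustrative only, not the internship's
# DRL setup): an agent on a 1-D grid of N cells learns to reach the goal
# cell. Actions are "step left" (0) and "step right" (1); the reward is 1
# only when the goal is reached, 0 everywhere else.
N = 10          # number of positions
GOAL = N - 1    # rightmost cell plays the role of the object to reach
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

def env_step(state, action):
    """Transition: move one cell left or right (clipped), reward 1 at goal."""
    nxt = max(0, min(N - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def greedy(q, s):
    """Greedy action for state s, with random tie-breaking."""
    if q[s][0] == q[s][1]:
        return random.randrange(2)
    return 0 if q[s][0] > q[s][1] else 1

def train(episodes=500, max_steps=200, seed=0):
    random.seed(seed)
    q = [[0.0, 0.0] for _ in range(N)]   # Q-table: q[state][action]
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            # epsilon-greedy exploration
            a = random.randrange(2) if random.random() < EPS else greedy(q, s)
            s2, r, done = env_step(s, a)
            # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
            q[s][a] += ALPHA * (r + GAMMA * max(q[s2]) - q[s][a])
            s = s2
            if done:
                break
    return q

q = train()
# After training, the greedy policy steps right (toward the goal) everywhere.
policy = [greedy(q, s) for s in range(N - 1)]
print(policy)
```

The sparse reward here is benign because the state space is tiny; the point of the internship is precisely that in realistic robotic settings such random exploration does not scale, motivating curiosity-driven structuring of the search.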
The goal of this internship is to allow the system to continuously build a model of what constitutes a reachable object, while learning to reach it with its arm. The resulting detector should allow the robot to adapt its knowledge of the world as the environment changes.

The first step of the internship will consist of developing and analyzing a detector of reachable objects, trained from partial annotations corresponding to the robot's reaching attempts. Secondly, a “curiosity” mechanism will be proposed, which will determine from an image the “interesting” areas for the robot and which will also be updated according to the reaching experience of the agent. Finally, this module will be integrated into the learning of the robot's reaching and hand-eye coordination, for a simultaneous learning of an object detector and of object reaching by the robot.

The candidate is expected to have knowledge of computer vision and machine learning, as well as good programming skills in C/C++ or Python. Experience with deep learning libraries (TensorFlow/Keras/PyTorch) and/or robotics simulators (V-REP, Gazebo…) is highly desirable.
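The “curiosity” mechanism to be developed is left open above; as a hedged illustration of one common family of such signals, the sketch below (the `CuriosityModule` class and its scalar-state forward model are hypothetical, not the internship's method) rewards the agent in proportion to the prediction error of a learned forward model, so that poorly predicted, “interesting” experiences attract exploration:

```python
# Illustrative curiosity sketch: the intrinsic reward is the prediction
# error of a simple online forward model. Transitions the model predicts
# poorly (novel areas) yield a large bonus; repeated transitions yield a
# shrinking bonus as the model improves.
class CuriosityModule:
    def __init__(self, lr=0.5):
        self.model = {}   # (state, action) -> predicted next state (scalar here)
        self.lr = lr

    def intrinsic_reward(self, state, action, next_state):
        """Return the forward-model prediction error, then update the model."""
        pred = self.model.get((state, action), 0.0)
        error = abs(next_state - pred)
        # online update of the forward model toward the observed outcome
        self.model[(state, action)] = pred + self.lr * (next_state - pred)
        return error

cm = CuriosityModule(lr=0.5)
# Visiting the same transition repeatedly: the bonus decays as novelty fades.
bonuses = [cm.intrinsic_reward(0, 1, 1.0) for _ in range(3)]
print(bonuses)
```

Adding such a bonus to the task reward biases exploration toward novel regions, which is one way the detector and the reaching policy could shape each other's training signal.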
Laboratory: Institut Pascal / ISPR, Clermont-Ferrand, France
Gratification: up to 4500€ total for 5 to 6 months.
Contact: Céline Teulière, celine.teuliere@uca.fr
Bibliography:
Human-level control through deep reinforcement learning, V. Mnih et al., Nature, vol. 518, pp. 529–533, 26 Feb. 2015
Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine, P. Pastor, A. Krizhevsky, D. Quillen, arXiv 2016
Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control, F. Ebert, C. Finn, et al., arXiv 2018
Learning of Binocular Fixations using Anomaly Detection with Deep Reinforcement Learning, F. de La Bourdonnaye, C. Teulière, J. Triesch, T. Chateau, IJCNN 2017
Stage-Wise Learning of Reaching Using Little Prior Knowledge, F. de La Bourdonnaye, C. Teulière, J. Triesch, T. Chateau, Frontiers in Robotics and AI, 2018
Guest Editorial: Active Learning and Intrinsically Motivated Exploration in Robots: Advances and Challenges, M. Lopes, P.-Y. Oudeyer, et al., IEEE TAMD 2010
(c) GdR 720 ISIS - CNRS - 2011-2020.