Announcement


PhD thesis - Generating virtual environments for the training of autonomous vehicles

16 August 2023


Category: Doctoral student



Context

Artificial Intelligence (AI) is increasingly involved in the development of autonomous driving and is expected to be a major key to reaching automation levels 4 and 5. However, AI-based approaches, in particular deep neural networks, require very large amounts of data to build sufficiently reliable and generic models. For many tasks, the lack of data has become a major factor limiting progress. This shortage stems from the complexity of creating datasets that are sufficiently representative of the task to be learned (acquisition, diversity of situations, scenarios and weather conditions, data labeling, etc.). To address this issue, the aim of this thesis is to develop a method for automatically generating realistic virtual environments by processing and analyzing data from real sensors mounted on a vehicle.


Research topic

To perceive its environment, an autonomous vehicle is equipped with several sensors (LIDAR, radars, cameras, etc.). The main objective of the thesis project is to generate a highly realistic virtual replica of a real environment from the data acquired by the vehicle's various sensors. The process can be decomposed into several stages. The first step is to represent the real environment as a 3D point cloud generated from the vehicle's LIDAR acquisitions. The second step is to produce a 3D model composed of basic geometric shapes enriched with semantic information. The third step is to generate a representation containing textured 3D models of the static elements (roads, sidewalks, buildings, etc.) of the real scene. Finally, the last step produces a highly realistic representation by adding aesthetic details, refining the model geometry, and applying surface materials. This pipeline has already been used successfully by our research team to generate realistic animations in a small-scale urban context (https://www.youtube.com/watch?v=UXC4gnYwAXo). However, the transition between steps was carried out manually, so the process remains time-consuming and therefore unsuitable for large-scale virtual worlds. The aim of this thesis is to automate this pipeline as much as possible.
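To make the first stages concrete, here is a minimal sketch of going from an aggregated LiDAR point cloud to a triangle mesh. It uses the open-source Open3D library as one possible tool (the announcement does not prescribe any library), the file names are placeholders, and Poisson surface reconstruction stands in for the triangulation step:

```python
import numpy as np
import open3d as o3d

# Load an aggregated LiDAR scan (placeholder file name).
pcd = o3d.io.read_point_cloud("aggregated_lidar_scan.pcd")

# Poisson reconstruction requires oriented normals.
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.5, max_nn=30))

# Triangulate the cloud into a watertight mesh.
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=10)

# Drop poorly supported vertices, which tend to be hallucinated
# over regions the sensor never observed.
densities = np.asarray(densities)
mesh.remove_vertices_by_mask(densities < np.quantile(densities, 0.05))

o3d.io.write_triangle_mesh("static_scene.ply", mesh)
```

As the next paragraph explains, such direct triangulation degrades badly on occluded areas, which is one of the core problems the thesis addresses.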

One of the challenges is that LIDAR data are inevitably incomplete: because of the sensor's position, vegetation and non-permanent objects (e.g., moving and parked vehicles) create numerous occlusions, so the geometry cannot be reconstructed simply by triangulating the point cloud. One solution we intend to explore is to combine 3D LIDAR data with 2D data from the vehicle's on-board cameras. From the merged data, it may be possible to extrapolate building shapes, possibly with prior knowledge of the architectural styles of the acquired location, and to reconstruct them either procedurally or by adapting generative adversarial networks (GANs) to the task.

To simulate traffic, the system must know where the road is, how many lanes it has, and where the traffic signs are; to simulate pedestrians, it must also extract information about sidewalks, crosswalks, and so on. This could be achieved by projecting image-level semantic segmentation onto the point cloud, combined with 3D recognition. Non-specific elements (road signs, trees, streetlights, street furniture, etc.) can then be added to the virtual environment automatically, by instantiating appropriate 3D models from a generic database using the extracted position, orientation, volume, and semantic information.

Finally, to achieve a high level of visual fidelity, information about the materials present on each surface must be extracted (for example, the type of wall paint or sidewalk and its shade), from which an automated process could apply textures to the models. Finer details such as windows and doors can then be added without altering the geometry, using 3D graphics techniques, for example additional geometry superimposed on the models with textures that blend into the background.
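As an illustration of the segmentation-projection idea above, here is a minimal NumPy sketch that transfers per-pixel semantic labels onto LiDAR points using the camera calibration. The function name and the calibration conventions (a 3x3 intrinsic matrix and a 4x4 LiDAR-to-camera extrinsic transform) are assumptions made for the sake of the example:

```python
import numpy as np

def project_labels_to_cloud(points_lidar, label_map, K, T_cam_lidar):
    """Assign each 3D LiDAR point the semantic class of the pixel it
    projects to.

    points_lidar: (N, 3) points in the LiDAR frame
    label_map:    (H, W) integer class ids from an image segmentation network
    K:            (3, 3) camera intrinsic matrix
    T_cam_lidar:  (4, 4) LiDAR-to-camera extrinsic transform
    Returns an (N,) label array; -1 marks points outside the image
    or behind the camera.
    """
    n = points_lidar.shape[0]
    homo = np.hstack([points_lidar, np.ones((n, 1))])       # (N, 4)
    pts_cam = (T_cam_lidar @ homo.T).T[:, :3]               # camera frame

    labels = np.full(n, -1, dtype=np.int64)
    in_front = pts_cam[:, 2] > 0.1                          # points ahead of the camera

    uvw = (K @ pts_cam[in_front].T).T                       # perspective projection
    uv = np.round(uvw[:, :2] / uvw[:, 2:3]).astype(int)     # pixel coordinates

    h, w = label_map.shape
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < w) & \
            (uv[:, 1] >= 0) & (uv[:, 1] < h)

    idx = np.flatnonzero(in_front)[valid]
    labels[idx] = label_map[uv[valid, 1], uv[valid, 0]]
    return labels
```

A production version would additionally need per-pixel depth tests, so that points occluded in the image do not inherit foreground labels, and label voting across multiple frames to smooth out segmentation noise.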

The method will be evaluated, first, by measuring the degree of realism of the generated environments through opinion scores collected from human observers (experts and non-experts). It will also be evaluated by using the generated data to train different deep models (convolutional neural networks).
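One common way to implement the second criterion is a synthetic-to-real transfer test: train a model only on the generated data, then measure its performance on real data. The sketch below (PyTorch; the loader names and the image-classification setting are assumptions, as the announcement only mentions convolutional neural networks) returns the real-data accuracy of a synthetically trained model, which can be compared against the same model trained on real data:

```python
import torch
from torch import nn

def synthetic_to_real_accuracy(model: nn.Module,
                               synthetic_loader, real_loader,
                               epochs: int = 10, lr: float = 1e-3):
    """Train on generated (synthetic) batches, evaluate on real ones.
    Both loaders are assumed to yield (image, label) batches sharing
    the same label set."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()

    # Training phase: synthetic data only.
    model.train()
    for _ in range(epochs):
        for x, y in synthetic_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

    # Evaluation phase: real data only.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in real_loader:
            x, y = x.to(device), y.to(device)
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / total
```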

The thesis project relies on the MOBILITECH platform and aims, in addition to developing the method for automatically generating realistic virtual environments, to produce and share datasets for learning perception and navigation tasks for autonomous vehicles. The platform, which consists of autonomous vehicles equipped with various perception and localization sensors, will be used to acquire the data required for the automatic generation of realistic virtual environments.

Requirements

The candidate must be motivated to carry out world-class research and must hold a Master's-level degree (Bac+5) in computer science, with a specialization in computer vision, machine learning, robotics, or a related field.

They must have solid skills in the following areas:

- Python, C++, C# programming (GPU programming is a plus)
- Unity 3D (or equivalent)
- Libraries and tools dedicated to computer vision and deep learning
- Good knowledge of computer vision
- Knowledge of the ROS environment would be appreciated
- Good written and spoken English

 

Contact

Connaissance et Intelligence Artificielle Distribuées (CIAD) – http://www.ciad-lab.fr
Équipe Perception de l’Environnement et Navigation Autonome (PENA) – https://epan-utbm.github.io/
Jocelyn Buisson (jocelyn.buisson@utbm.fr)
Nathan Crombez (nathan.crombez@utbm.fr)
Yassine Ruichek (yassine.ruichek@utbm.fr)


Application Files

CV, cover letter, transcripts and any other relevant documents.
Deadline: September 30, 2023