Announcement

CIFRE thesis proposal: Leveraging geometric clues for accurate 3D modeling of outdoor scenes with neural rendering

28 November 2022


Category: PhD student


Huawei Technologies France, in collaboration with University Gustave Eiffel and the French Institut National de l'Information Géographique et Forestière (IGN), proposes the following three-year PhD research topic: "Leveraging geometric clues for accurate 3D modeling of outdoor scenes with neural rendering". The objective of this thesis is to push the boundaries of neural implicit reconstruction (NeRF-like methods) for accurate 3D geometry reconstruction in the autonomous vehicle domain.

 

Thesis Proposal:

Leveraging geometric clues for accurate 3D modeling of outdoor scenes with neural rendering

Background:

This PhD is a CIFRE fellowship between Huawei, the CoSys department of University Gustave Eiffel and the French Institut national de l'information géographique et forestière (IGN).

Huawei is working on key components of an L2-L3 autonomous driving platform and is progressively shifting its focus to the development of the breakthrough technologies required for L4-L5 levels. Tomorrow's self-driving cars, powered by AI, will combine edge and cloud computing with a vast number of sensors to safely drive customers and deliver merchandise. At Huawei, we develop realistic simulators created from crowd-sourced data to continuously improve the localization, perception and prediction algorithms of autonomous vehicles. We are seeking the best candidates for a CIFRE PhD with a background in computer vision, deep learning, simulation, computer graphics, mapping, perception, sensor fusion, cognition and other related areas, to work as part of the IoV team at the Paris Research Center (PRC). As a member of IoV PRC, you will work closely with multiple teams worldwide to grow your expertise and successfully transfer your research results into real products.

Created in 2020, the University Gustave Eiffel brings together a research institute, a university, a school of architecture and three engineering schools focusing on the study of urban areas. The Cosys department (Components & Systems) focuses on urban mobility, from the design to the evaluation of innovative systems likely to improve the urban experience. Within this department, the PICS-L lab has strong experience in vision, including ADAS and autonomous vehicle applications.

The French mapping agency IGN (National Institute for Geographic and Forest Information) is a public administrative establishment attached to the French Ministry of Ecological Transition; it is the national reference operator for mapping the French territory. LaSTIG (Laboratory in Sciences and Technologies of Geographic Information for the smart city and sustainable territories) is a joint research unit attached to Gustave Eiffel University, the IGN and the Paris Engineering School (EIVP). It is a unique research structure in France and even in Europe, bringing together around 80 researchers who cover the entire life cycle of geographic or spatial data, from acquisition to visualization, including modeling, integration and analysis; about thirty of them work in image analysis, computer vision, machine learning, photogrammetry and remote sensing.

Research topic:

Implicit 3D scene representation and neural rendering have shown outstanding results in novel view synthesis and realistic simulation in recent years [1]. Such approaches are based on a continuous representation of the 3D environment and thus need to describe the underlying geometric structure of the scene as closely as possible. In particular, proper modeling of the 3D geometry of the scene is crucial for rendering new data that are far from the training distribution used to optimize the rendering model [2]. This PhD thesis will focus on leveraging innovative methods to enforce geometric consistency in the modeling of an outdoor urban area. Ideally, the learned representation will be capable of rendering realistic extrapolated views from any position in the scene.
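To make the link between the rendering model and geometry concrete, here is a minimal sketch of the volume rendering quadrature used by NeRF-like methods [1], for a single ray. The function names and the sample values are illustrative assumptions, not part of the proposal; the expected depth is simply the weighted mean of the sample depths.

```python
import math

def render_weights(sigmas, deltas):
    """Per-sample compositing weights along one ray (NeRF-style quadrature).

    sigmas: non-negative volume densities at the samples
    deltas: lengths of the ray segments associated with each sample
    """
    weights = []
    transmittance = 1.0  # fraction of light that reaches the current sample
    for sigma, delta in zip(sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights

def expected_depth(sigmas, deltas, depths):
    """Weighted mean of sample depths: the geometry implied by the densities."""
    weights = render_weights(sigmas, deltas)
    return sum(w * t for w, t in zip(weights, depths))

# Hypothetical ray: empty space first, then a dense surface starting at depth 3.
sigmas = [0.0, 0.0, 50.0, 50.0]
deltas = [1.0, 1.0, 1.0, 1.0]
depths = [1.0, 2.0, 3.0, 4.0]
print(render_weights(sigmas, deltas))
print(expected_depth(sigmas, deltas, depths))
```

Because the same weights produce both the rendered color and the depth, a density field that explains the training images with the wrong geometry will tend to fail on views far from the training poses, which is the motivation for the geometric regularization discussed in this proposal.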

Classical approaches use regularization methods based on structural hypotheses about the 3D scene to recover better geometry from sparse input views [2, 4]. These self-supervised constraints have been shown to be effective in the low-data regime but fail to produce realistic structures for large outdoor scenes. The main research direction we want to explore in this PhD is to extend geometric regularization to large-scale outdoor scenes. Promising approaches involve:

  • Finding better regularization methods, such as in [3, 13]. In [3], the authors demonstrated that reasoning on surface normals instead of raw depths leads to better 3D structure estimation. However, the optimization proposed in [3] is not computationally tractable on large scenes and generalizes poorly to complex outdoor structures. Making such regularization scale to large outdoor scenes is therefore an important question to study.
  • The use of semantic priors extracted from 2D contents [12]. Injecting high-level semantic priors (such as “a road should be flat”, “building facades should be perpendicular to the road surface”, etc.) or estimated geometric guidance extracted from 2D images [11] can help to efficiently fit the outdoor scene. Even though these approaches have shown impressive results in indoor scene modeling, their use in automotive outdoor scenarios is still a challenge that needs to be tackled.
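As an illustration of how a prior such as “a road should be flat” could be turned into a regularization term, the sketch below penalizes road-labelled samples whose surface normal deviates from the vertical axis. Everything here is an assumption made for the example: the function name, the label strings, the choice of `z` as the up axis, and the availability of per-sample normals (in NeRF-like models these are often derived from the gradient of the density field). It is not the method of [3], [12] or [13].

```python
import math

UP = (0.0, 0.0, 1.0)  # assumed world "up" axis

def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

def flat_road_penalty(normals, labels, road_label="road", up=UP):
    """Mean of (1 - |n . up|) over the samples labelled as road.

    The penalty is zero when every road normal is parallel to `up`
    and grows as the normals tilt away from it. Samples with other
    labels are ignored.
    """
    up = normalize(up)
    terms = []
    for n, label in zip(normals, labels):
        if label != road_label:
            continue
        n = normalize(n)
        cos = abs(sum(a * b for a, b in zip(n, up)))
        terms.append(1.0 - cos)
    return sum(terms) / len(terms) if terms else 0.0

# Hypothetical samples: two flat road points and one facade point.
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (1.0, 0.0, 0.0)]
labels = ["road", "road", "building"]
print(flat_road_penalty(normals, labels))

# A road sample tilted by 45 degrees is penalized.
print(flat_road_penalty([(1.0, 0.0, 1.0)], ["road"]))
```

A facade prior could be written in the same way by penalizing `|n . up|` for building-labelled samples, so that facade normals stay perpendicular to the vertical axis.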

In addition to the aforementioned directions, other approaches could be considered to enforce the 3D consistency of a reconstructed scene. In [5], the authors propose to use unstructured 3D information, such as point clouds [6], as a support for modeling the environment. Recent works explore other structured implicit representations, such as the Signed Distance Function [7] or meshes [8], to guide the optimization process and improve the geometry estimation. Even though these approaches have shown promising results, their use is still limited to bounded indoor scenes with no illumination variance or dynamic distractors. Finally, the fusion of multi-modal data may help to enforce the geometric consistency of the reconstructed scene. Seminal works have leveraged sparse [9, 10] and dense [11] 3D information to improve the overall quality of the outputs. The use of LiDAR data [10] seems promising for NeRF-based outdoor reconstruction, as such data are usually available from the sensors of autonomous vehicles.
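The idea of supervising geometry with sparse 3D measurements can be sketched as a masked depth term added to the photometric loss. This is a deliberately simplified example, not the exact losses used in [9] or [10]: the function names, the use of `None` for rays without a LiDAR return, and the `depth_weight` value are all assumptions chosen for illustration.

```python
def lidar_depth_loss(rendered_depths, lidar_depths):
    """Mean squared error between rendered and LiDAR depths.

    `lidar_depths` holds None for rays without a LiDAR return, so only
    rays with an actual measurement contribute (sparse supervision).
    """
    errors = [
        (pred - meas) ** 2
        for pred, meas in zip(rendered_depths, lidar_depths)
        if meas is not None
    ]
    return sum(errors) / len(errors) if errors else 0.0

def total_loss(photometric, rendered_depths, lidar_depths, depth_weight=0.1):
    """Photometric loss plus a weighted depth term (weight is hypothetical)."""
    return photometric + depth_weight * lidar_depth_loss(rendered_depths, lidar_depths)

# Hypothetical batch of three rays; the second ray has no LiDAR return.
rendered = [10.0, 5.0, 7.5]
lidar = [9.0, None, 7.5]
print(lidar_depth_loss(rendered, lidar))
print(total_loss(0.2, rendered, lidar))
```

The rendered depths would typically come from the same compositing weights as the rendered colors, so the depth term constrains the density field directly on the rays where a measurement exists.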

[1] Mildenhall, Ben, et al. "NeRF: Representing scenes as neural radiance fields for view synthesis." Communications of the ACM 65.1 (2021): 99-106.

[2] Niemeyer, Michael, et al. "RegNeRF: Regularizing neural radiance fields for view synthesis from sparse inputs." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

[3] Ehret, Thibaud, Roger Marí, and Gabriele Facciolo. "NeRF, meet differential geometry!" arXiv preprint arXiv:2206.14938 (2022).

[4] Kim, Mijeong, Seonguk Seo, and Bohyung Han. "InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

[5] Vasu, Subeesh, et al. "HybridSDF: Combining Free Form Shapes and Geometric Primitives for effective Shape Manipulation." arXiv preprint arXiv:2109.10767 (2021).

[6] Xu, Qiangeng, et al. "Point-NeRF: Point-based neural radiance fields." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

[7] Yu, Zehao, et al. "MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction." arXiv preprint arXiv:2206.00665 (2022).

[8] Chen, Zhiqin, et al. "MobileNeRF: Exploiting the polygon rasterization pipeline for efficient neural field rendering on mobile architectures." arXiv preprint arXiv:2208.00277 (2022).

[9] Deng, Kangle, et al. "Depth-supervised NeRF: Fewer views and faster training for free." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

[10] Rematas, Konstantinos, et al. "Urban radiance fields." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

[11] Roessle, Barbara, et al. "Dense depth priors for neural radiance fields from sparse input views." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

[12] Zhi, Shuaifeng, et al. "In-place scene labelling and understanding with implicit scene representation." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.

[13] Chen, Zheng, et al. "StructNeRF: Neural Radiance Fields for Indoor Scenes with Structural Hints." arXiv preprint arXiv:2209.05277 (2022).

 

Description of research activities:

·Study the state of the art on 3D reconstruction with implicit neural representation using geometric regularization, especially in outdoor environments with both LiDAR and images.

·Identify the bottlenecks in using implicit neural representation in an outdoor environment.

·Propose new solutions for challenging outdoor scenes by extending geometric regularization to large-scale outdoor scenes.

·Research and develop algorithms based on the proposed solutions.

·Apply the proposed algorithms to the domain of self-driving cars using existing or specifically collected datasets.

·Publish research results in top journals and conferences and participate in scientific seminars.

Supervision:

This PhD will be jointly supervised by Huawei Technologies France, the LaSTIG laboratory of IGN (Paris area) and the Cosys department (PICS-L lab) of the Université Gustave Eiffel.

 
Prerequisites:

The candidate should be motivated to carry out world-class research and should have a Master's degree in Computer Science, with a focus on Vision and/or Robotics. He/She should have solid skills in the following domains:

·Implementing code in Python and C++ (CUDA is a plus)

·Applying or using existing deep learning libraries in project-related tasks (PyTorch is a plus)

·Good knowledge of Computer Vision, Computer Graphics, 3D reconstruction and robotics

·Good knowledge of Git, ROS, OpenCV, Boost, multi-threading, CMake, Make and Linux systems

·Code and algorithm documentation

·Project reporting and planning

·Writing of scientific publications and participation in conferences

·Fluency in spoken and written English; French and/or Chinese is a plus

·Intercultural and coordination skills, hands-on and can-do attitude

·Interpersonal skills, team spirit and independent working style

Contact:

Nathan Piasco (Huawei) – nathan.piasco@huawei.com
Roland Brémond (thesis advisor, Univ. GE) – roland.bremond@univ-eiffel.fr
Laurent Caraffa (IGN) – laurent.caraffa@ign.fr
 

Application deadline:

31/01/2023
 

Application Files:

To be sent as a single PDF file: CV, motivation letter, transcripts of records for academic years 2021-2022 and 2022-2023, and any other relevant documents.
