[Master II Internship] Sensor-Robust Dynamic Neural Radiance Fields: Synthesizing New Views From 100 Years of Archival Aerial Images
30 November 2021
Category: Intern
We propose to adapt recent neural radiance field-based methods to image time series spanning over a hundred years. This poses the challenge of dealing with vastly different sensors and acquisition conditions, and with complex dynamics such as time of day, day of the year, and slow urban change. The end result would be the ability to generate high-definition, colored free-viewpoint renders representing the evolution of the French territory.
- Laboratory: LASTIG, Univ Gustave Eiffel, IGN-ENSG (STRUDEL and GeoVIS teams)
- Location: IGN, Saint Mandé, France
- Supervision: Loic Landrieu, PhD; Mathieu Brédif, PhD
- Remuneration: 513 euros / month
- Starting Date: April 2022, for up to 6 months
- Keywords: Deep Learning, Neural Radiance Fields, Inverse Problems, New View Synthesis, Digital Heritage
- Development Environment: Linux, Python, PyTorch.
IGN is the public institution in charge of the production and distribution of geographical information in France. LASTIG, the research lab associated with IGN, has privileged access to over a hundred years of archival aerial data. These images have been taken across the French territory with various sensors as photography technology evolved: analog, digital, black and white, RGB and infrared, etc. The images have already been geographically aligned and are accessible through an open-data platform; examples of aligned images are available at https://remonterletemps.ign.fr/telecharger. Some regions have been photographed nearly every year for the last century, and landmarks are visible from different angles and with varying image quality.
Neural Radiance Fields (NeRFs) are a recent development in view synthesis and can generate novel views of objects from a set of images taken at various poses, without any 3D supervision. This work has been adapted to reconstruct views of city scenes from off-nadir satellite images by explicitly taking sun illumination into account. However, all these works make two assumptions that do not hold in our case: (i) the images come from identical sensors, and (ii) the scenes are static, except for transient objects such as cars.
Recent work has proposed to add a temporal dimension to NeRFs [3,4], allowing them to encode dynamic scenes. However, their formulation is ill-suited to capturing the multiple temporal dynamics in century-long image time series (daylight, seasonal, transient objects, urban change). Adapting NeRFs to a dynamic, multi-sensor setting would allow us to recreate the changing landscape of the French territory across the last century as free-viewpoint videos.
This internship aims to adapt the current NeRF-based methods to temporal sequences of aligned aerial images taken with various sensors and across the last century. This poses several challenges:
Sensor Robustness as an Inverse Problem: The last century saw dramatic technological advances in photography. As a consequence, the radiometric quality varies widely across images. We propose to select a given acquisition as the pivot modality (ideally RGB, high resolution, low distortion, consistent illumination) and to learn transformations from this pivot modality to the different acquisition dates. This can entail lowering the resolution, converting to monochrome, adding radial distortion, and so on. The pixelwise radiometric supervision of the NeRF will then take place through this transformation (i.e., an inverse problem setting).
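As an illustration of the inverse problem setting described above, here is a minimal PyTorch sketch, not the method the internship will necessarily adopt. The names `SensorModel` and `archival_loss` are hypothetical. The sketch assumes the archival image is monochrome and lower resolution than the pivot render, and it only models luminance conversion, average-pool downsampling, and a learnable affine radiometric correction; radial distortion is left out. The random tensor stands in for a NeRF render.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SensorModel(nn.Module):
    """Hypothetical forward operator: pivot RGB render -> simulated archival image."""

    def __init__(self, scale: int = 2):
        super().__init__()
        self.scale = scale  # assumed integer resolution ratio pivot/archival
        # Learnable affine radiometric correction (assumption: one per sensor).
        self.gain = nn.Parameter(torch.ones(1))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        # rgb: (B, 3, H, W), values in [0, 1].
        # Convert to monochrome with ITU-R BT.601 luma weights.
        luma = rgb.new_tensor([0.299, 0.587, 0.114]).view(1, 3, 1, 1)
        gray = (rgb * luma).sum(dim=1, keepdim=True)
        # Lower the resolution by average pooling.
        low_res = F.avg_pool2d(gray, kernel_size=self.scale)
        return self.gain * low_res + self.bias


def archival_loss(sensor: nn.Module, rendered_rgb: torch.Tensor,
                  archival_image: torch.Tensor) -> torch.Tensor:
    """Pixelwise loss computed in the archival sensor's image space."""
    return F.mse_loss(sensor(rendered_rgb), archival_image)


# Stand-in for a NeRF render in the pivot modality (random values here).
rendered = torch.rand(1, 3, 8, 8, requires_grad=True)
archival = torch.rand(1, 1, 4, 4)  # monochrome, half resolution
sensor = SensorModel(scale=2)
loss = archival_loss(sensor, rendered, archival)
loss.backward()  # gradients reach both the render and the sensor parameters
```

Because every step is differentiable, the loss on the degraded image can update both the scene representation and the sensor parameters, which is the property the inverse problem formulation relies on.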
Multi-Scale Dynamics: Sequences of images spanning a century are subject to several temporal dynamics. (i) The time of day and the day of the year strongly affect the images, through illumination conditions and through seasonal changes (vegetation, snow). The proposed approach must be able to model this influence. (ii) Transient objects such as cars or pedestrians can appear in some images. These objects should not be rendered. (iii) The urban landscape evolves slowly: greening, densification, and so on. A carefully designed dynamic NeRF would be able to capture these changes.
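One possible way to expose these time scales to a network is to encode each acquisition date as separate daily, seasonal, and long-term features. The sketch below is only an illustration under stated assumptions: the function name `time_features`, the 1920-2020 normalization range, and the choice of sine/cosine phases are hypothetical, and transient objects are not handled here.

```python
import math
from datetime import datetime


def time_features(t: datetime, start_year: int = 1920,
                  end_year: int = 2020) -> list[float]:
    """Hypothetical multi-scale encoding of an acquisition date.

    Returns [sin(day), cos(day), sin(year), cos(year), trend], where the
    first two terms follow the daily cycle, the next two the seasonal
    cycle, and `trend` grows linearly with slow, long-term change.
    """
    # Daily cycle: phase of the time of day (illumination).
    hour = t.hour + t.minute / 60.0
    day_phase = 2.0 * math.pi * hour / 24.0
    # Seasonal cycle: phase of the day of the year (vegetation, snow).
    day_of_year = t.timetuple().tm_yday
    year_phase = 2.0 * math.pi * (day_of_year - 1) / 365.25
    # Slow trend: fractional year, normalized over the assumed range.
    fractional_year = t.year + (day_of_year - 1) / 365.25
    trend = (fractional_year - start_year) / (end_year - start_year)
    return [math.sin(day_phase), math.cos(day_phase),
            math.sin(year_phase), math.cos(year_phase), trend]


# Example: a midday acquisition on 1 January 1970.
features = time_features(datetime(1970, 1, 1, 12, 0))
```

Using sine and cosine pairs keeps 23:59 close to 00:00 and 31 December close to 1 January in feature space, while the linear trend lets a model separate slow urban change from the periodic effects.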
Results: Provided that all challenges can be solved, we would be able to train dynamic NeRFs from sequences of historical images, allowing us to generate high-quality free-viewpoint RGB videos retracing the evolution of urban landscapes across the century.
If the internship is successful, we will write an article on the subject and release both code and datasets in open access.
The tentative planning of the internship is as follows:
- Months 1-2. Bibliography on NeRFs; curation of a set of stable scenes; tuning of transformation operators; training sensor-robust NeRFs with the inverse problem formulation.
- Months 3-4. Curation of a set of dynamic scenes; designing dynamic NeRFs able to capture urban dynamics.
- Months 5-6. Rendering of complex scenes; curation of a larger dataset; extensive comparison with baseline methods (e.g., comparison with ground truth height from LiDAR); writing the article.
Opportunity. The intern will have a privileged opportunity to apply for LASTIG's Ph.D. offers.
- Master 2 student in computer science, applied mathematics, or remote sensing.
- Familiarity with computer vision, machine learning, and deep learning.
- Mastery of Python, familiarity with PyTorch.
- Curiosity, rigor, motivation.
- (Optional) Familiarity with differentiable rendering.
- (Optional) Experience with aerial/satellite images or image time series.
Send a CV and a short statement of purpose (~20 lines max) explaining your interest in this internship and the relevance of your experience to loic.landrieu AT ign.fr and mathieu.bredif AT ign.fr.