Announcement

M2 Internship - Non-stationary and robust Reinforcement Learning methodologies for drone detection

16 November 2023


Category: Intern


Detailed information about this internship can be found here:

https://l2s.centralesupelec.fr/wp-content/uploads/fortunati-stefano/Internship_proposal_IPSA.pdf

Contacts:

Leila Gharsalli, EC IPSA, leila.gharsalli@ipsa.fr .

Stefano Fortunati, EC IPSA/L2S, stefano.fortunati@l2s.centralesupelec.fr .

Reinforcement Learning (RL) methodologies are currently adopted in various contexts that require sequential decision-making under uncertainty. The RL paradigm is based on the perception-action cycle, characterized by an agent that senses and explores the unknown environment, tracks the evolution of the system state, and intelligently adapts its behavior in order to fulfill a specific mission. This is accomplished through a sequence of actions aimed at optimizing a pre-assigned performance metric (the reward). Despite their wide applicability, classical RL algorithms rely on a restrictive assumption: the stationarity of the environment, i.e., the statistical and physical characterization of the scenario is assumed to be time-invariant. This assumption is clearly violated in surveillance applications, where the position and the number of targets, along with the statistical characterization of the disturbance, may change over time. To overcome this limitation and include non-stationarity in the RL framework, both theoretical and application-oriented non-stationary approaches have recently been proposed in the RL literature, and the application of this line of research to robust radar detection problems has recently been investigated.

The aim of this internship is to support and complete the ongoing research activity by testing and validating non-stationary RL algorithms on several realistic scenarios in which the radar acts as an agent that continuously senses the unknown environment (i.e., targets and disturbance) and consequently optimizes the transmitted waveforms in order to maximize the probability of detection (PD) by focusing the energy in specific range-angle cells. Due to their crucial strategic interest, particular attention will be devoted to scenarios containing drones.
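
To give a rough feel for why stationarity matters, the following is a minimal toy sketch in Python and not the method studied in the internship. It assumes a heavily simplified setting: a hypothetical radar picks one of a few range-angle cells per step, receives a reward of 1 on detection, and the target jumps to another cell halfway through. All function names, detection probabilities, and parameters are illustrative assumptions. The sketch contrasts a sample-average estimator, which implicitly assumes a stationary environment, with a constant step-size estimator, which weights recent rewards more heavily.

```python
import random


def update_sample_average(q, n, reward):
    """Stationary-style estimate: running mean over all rewards seen so far."""
    n += 1
    return q + (reward - q) / n, n


def update_constant_step(q, reward, alpha=0.1):
    """Recency-weighted estimate: old rewards are forgotten geometrically."""
    return q + alpha * (reward - q)


def simulate(use_constant_step, steps=2000, switch=1000,
             n_cells=5, eps=0.1, alpha=0.1, seed=0):
    """Epsilon-greedy agent on a toy detection task (all values are assumptions).

    Returns the fraction of steps with a detection after the target moves.
    """
    rng = random.Random(seed)
    q = [0.0] * n_cells        # estimated detection rate per cell
    counts = [0] * n_cells     # number of times each cell was illuminated
    target = 0                 # cell containing the (hypothetical) target
    detections = 0.0

    for t in range(steps):
        if t == switch:
            target = n_cells - 1  # abrupt change: the environment is non-stationary

        # Explore with probability eps, otherwise focus on the best-looking cell.
        if rng.random() < eps:
            cell = rng.randrange(n_cells)
        else:
            cell = max(range(n_cells), key=q.__getitem__)

        # Illustrative detection probabilities, not taken from any radar model.
        p_detect = 0.9 if cell == target else 0.05
        reward = 1.0 if rng.random() < p_detect else 0.0

        if use_constant_step:
            q[cell] = update_constant_step(q[cell], reward, alpha)
        else:
            q[cell], counts[cell] = update_sample_average(q[cell], counts[cell], reward)

        if t >= switch:
            detections += reward

    return detections / (steps - switch)
```

Calling `simulate(True)` and `simulate(False)` with the same seed gives the post-change detection rate of each estimator. In this toy setup the sample-average agent can be expected to adapt more slowly, because the rewards collected before the change keep weighing on its estimates.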