Human action recognition using fusion of spatio-temporal data and scene interpretation

August 31, 2023

Category: Intern

Host laboratory: Connaissance et Intelligence Artificielle Distribuées (CIAD) – Environment Perception and Autonomous Navigation Team (EPAN)

Keywords: Human action recognition, classification, video data, deep learning, scene interpretation, robots, autonomous vehicles.

Contacts: Abderrazak Chahi, Yassine Ruichek


Description of the internship topic:

Human action recognition in video sequences is a particularly difficult problem because of variations in the visual appearance and motion of people and actions, changing camera viewpoints, moving backgrounds, occlusions, noise, and the enormous amount of video data. Detecting and understanding human activity or motion in video is essential for a variety of applications, such as video surveillance and anomaly detection in crowded scenes, and safe, cooperative interaction between humans and robots in shared workspaces. Action and motion recognition can also be used in intelligent and/or autonomous vehicles to detect driver behavior and improve road safety. Over the past decade, significant progress has been made in action and motion recognition using spatiotemporal representations of video sequences, optical flow information, and the fusion of the two.

The objective of this project is to develop new machine learning approaches that fuse spatiotemporal information with scene understanding models to produce a state-adaptive representation of the scene. The scene state understanding model will extract situation data (interpretation, context, circumstances, etc.) related to the different states and conditions in the scene. The proposed approach is to combine an intermediate recognition of the scene with one or more scene understanding models. The intermediate recognition can be performed using classical image processing/classification methods or advanced techniques such as deep learning approaches.

The experiments and analysis of the results will be carried out on video datasets widely used by the scientific community in this field. We also plan to apply the developed methods on at least one of the laboratory's experimental platforms (automated vehicles and robots equipped with perception and localization sensors and communication interfaces).


References:

  • Huillcen Baca, Herwin Alayn, Juan Carlos Gutierrez Caceres, and Flor de Luz Palomino Valdivia. "Efficiency in Human Actions Recognition in Video Surveillance Using 3D CNN and DenseNet." Future of Information and Communication Conference. Springer, Cham, 2022.
  • Ullah, Waseem, et al. "Artificial Intelligence of Things-assisted two-stream neural network for anomaly detection in surveillance Big Video Data." Future Generation Computer Systems 129 (2022).
  • Sun, Zehua, et al. "Human action recognition from various data modalities: A review." IEEE Transactions on Pattern Analysis and Machine Intelligence (2022).
  • Mazzia, Vittorio, et al. "Action Transformer: A self-attention model for short-time pose-based human action recognition." Pattern Recognition 124 (2022).
  • Xiao, Weichu, et al. "Attention-based deep neural network for driver behavior recognition." Future Generation Computer Systems 132 (2022).

Candidate profile:

  • Holds, or is currently preparing, a Master’s degree in computer science, computer vision, machine learning, robotics, or a related field.
  • Advanced knowledge of and practice in object-oriented programming (C++, Python) and machine learning tools (deep learning platforms: PyTorch, TensorFlow, MATLAB) are required.
  • Knowledge of the ROS framework will be appreciated.
  • An advanced level of written and spoken English is required.

Application (CV, grade transcripts, reference letters, …) to be sent to Abderrazak Chahi and Yassine Ruichek. Deadline: October 31, 2023

Starting date: February 2024