
M2 internship: Development of spatiotemporal attention mechanisms for enhanced motion segmentation in video sequences

2 November 2023

Category: Intern

Context and motivation:

Deep learning models proposed for motion segmentation often fail to exploit spatial and temporal information jointly [1-4]. Static image-based models may fail to account for temporal dependencies, while purely temporal models may disregard essential spatial cues. This limits segmentation accuracy, especially in scenes involving occlusions, object motion, and complex interactions.


Proposed Solution:

The proposed internship project aims to bridge this gap by developing a spatiotemporal attention model that jointly considers spatial and temporal information. The model will be integrated into deep foreground segmentation models to improve the quality and robustness of their results. The attention mechanism should adaptively weight the importance of spatial regions and frames within a video sequence, highlighting those most likely to belong to the foreground.
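As a rough illustration of what "adaptively weighting spatial regions and frames" could mean, the following minimal NumPy sketch applies a softmax attention over spatial locations within each frame, then a softmax attention over frames. It is only one possible formulation, not the design the internship will adopt. The function name `spatiotemporal_attention`, the projection vectors `w_spatial` and `w_temporal`, and the (T, H, W, C) feature layout are all assumptions made for this example; in practice these would be trainable parameters of a PyTorch or TensorFlow module.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def spatiotemporal_attention(features, w_spatial, w_temporal):
    """Reweight video features of shape (T, H, W, C).

    w_spatial and w_temporal are hypothetical projection vectors of
    shape (C,); they stand in for parameters that would be learned.
    Returns the reweighted features, the spatial attention maps
    (T, H, W), and the temporal attention weights (T,).
    """
    T, H, W, C = features.shape

    # Spatial attention: one score per location, normalized per frame.
    spatial_scores = features @ w_spatial                      # (T, H, W)
    spatial_attn = softmax(spatial_scores.reshape(T, -1), axis=1)
    spatial_attn = spatial_attn.reshape(T, H, W)

    # Temporal attention: score each frame from its attention-pooled
    # descriptor, normalized across the T frames.
    frame_desc = (features * spatial_attn[..., None]).sum(axis=(1, 2))  # (T, C)
    temporal_attn = softmax(frame_desc @ w_temporal, axis=0)            # (T,)

    # Combine both weightings by broadcasting over the feature tensor.
    weighted = (
        features
        * spatial_attn[..., None]
        * temporal_attn[:, None, None, None]
    )
    return weighted, spatial_attn, temporal_attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 6, 8, 16))  # 4 frames, 6x8 grid, 16 channels
    out, s_attn, t_attn = spatiotemporal_attention(
        feats, rng.normal(size=16), rng.normal(size=16)
    )
    print(out.shape, s_attn.shape, t_attn.shape)
```

Each spatial map sums to one within its frame and the temporal weights sum to one across frames, so the two sets of weights can be read as relative importances of regions and frames respectively.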


Training, testing, and evaluation:

  • Collect or utilize existing datasets with annotated foreground and background information in video sequences.
  • Define appropriate evaluation metrics to quantify the improvements in foreground segmentation achieved by the Spatiotemporal Attention Models.
  • Assess the models' ability to handle occlusions, object motion, and complex scenes.
  • Explore practical applications of the enhanced foreground segmentation, such as object tracking, action recognition, and surveillance systems.
  • Investigate the potential of the models to improve the accuracy and reliability of these applications.
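The posting does not prescribe specific evaluation metrics. As an assumption for illustration, the sketch below computes pixel-level precision, recall, F-measure, and intersection over union (IoU) between a predicted binary foreground mask and its ground truth, which are common choices for this kind of task. The function name `foreground_metrics` is hypothetical.

```python
import numpy as np


def foreground_metrics(pred, gt):
    """Pixel-level metrics for binary foreground masks of equal shape.

    Nonzero values are treated as foreground. When a denominator is
    zero (for example, both masks empty), the metric is reported as
    0.0; this convention is a choice made for this sketch.
    """
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)

    tp = int(np.sum(pred & gt))    # foreground predicted as foreground
    fp = int(np.sum(pred & ~gt))   # background predicted as foreground
    fn = int(np.sum(~pred & gt))   # foreground predicted as background

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "iou": iou,
    }


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    gt = [[1, 0], [1, 0]]
    print(foreground_metrics(pred, gt))
```

Comparing these scores with and without the attention module on the same sequences would give a direct measure of the improvement it brings.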



Prospective interns should have a strong background in computer vision and deep learning, along with experience in deep learning frameworks such as PyTorch or TensorFlow. Experience with video data and video processing is a plus. Strong programming skills in Python are essential.


Practical information:

  • Location: CIAD laboratory, Montbéliard, France.
  • This internship is remunerated.



Send a curriculum vitae, referees' contact details, and grades for the last two years before December 15, 2023 to:



[1] Zheng, W., Wang, K., & Wang, F. Y. (2020). A novel background subtraction algorithm based on parallel vision and Bayesian GANs. Neurocomputing, 394, 178-200.

[2] Tezcan, O., Ishwar, P., & Konrad, J. (2020). BSUV-Net: A fully-convolutional neural network for background subtraction of unseen videos. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 2774-2783).

[3] Mandal, M., Dhar, V., Mishra, A., Vipparthi, S. K., & Abdel-Mottaleb, M. (2020). 3DCD: Scene independent end-to-end spatiotemporal feature learning framework for change detection in unseen videos. IEEE Transactions on Image Processing, 30, 546-558.

[4] Kajo, I., Kas, M., Ruichek, Y., & Kamel, N. (2023). Tensor based completion meets adversarial learning: A win–win solution for change detection on unseen videos. Computer Vision and Image Understanding, 226, 103584.