Discovery and modelling of interactions between users and transport infrastructure through the analysis of video surveillance streams
Supervisors: Sebastien Ambellouis (Researcher - email@example.com) and Stephane Lecoeuche (Pr. - firstname.lastname@example.org)
The objective is to discover patterns of activities and interactions by automatically and progressively analysing the audio and video streams.
From an operational point of view, new algorithms will be evaluated on audio/video databases for the interpretation of streams acquired on board railway vehicles and at level crossings for security applications, and in urban areas for the safety of vulnerable road users.
Keywords: Incremental learning, Bayesian inference, Data mining, Dictionary learning, Image processing, Railway user security, VRU security, Activity and interaction discovery and modelling
Modelling the behaviour of individuals and their interactions through the analysis of audio/video streams acquired from surveillance systems is a very active research topic in the scientific community. It is directly related to the "Big Data" problem, which is common to many application areas such as health monitoring of elderly people and security/safety problems. In the transport field, the objective is to improve the security, safety, operation and comfort of individual or collective transport networks.
Whether monitoring a subway station, the inside of a passenger train, the emergency department of a hospital or a shopping centre, the operational objective is to recognize a particular activity, identify trends or detect abnormal events. To achieve this, the literature offers two approaches: supervised and unsupervised. The first category requires a learning step during which a model of each behaviour to be recognized is estimated. This learning relies on extracting a set of (spatial or spatio-temporal) features that define the space in which each activity is observed, followed by a segmentation of the activities in that same space. Several modelling tools are then used, either to build a probabilistic generative model of each behaviour (GMM, HMM) or to discriminate between behaviours (SVM). More recently, deep learning techniques have been used to build hierarchical models in which each layer represents information at an increasing level of complexity. These deep learning methods have been applied in both supervised and unsupervised contexts.
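As a rough illustration of the supervised route described above, the sketch below fits one diagonal-covariance Gaussian per activity (a single-component stand-in for the GMMs mentioned above) and labels a new feature vector with the most likely activity. The feature values, the activity names and the function names are invented for the example and do not come from the project.

```python
import math

def fit_gaussian(samples):
    """Estimate a per-dimension mean and variance from a list of feature vectors."""
    n = len(samples)
    dim = len(samples[0])
    mean = [sum(s[d] for s in samples) / n for d in range(dim)]
    # A small floor keeps every variance strictly positive.
    var = [max(sum((s[d] - mean[d]) ** 2 for s in samples) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def log_likelihood(x, model):
    """Log-density of x under a diagonal-covariance Gaussian."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def classify(x, models):
    """Return the activity label whose model gives x the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(x, models[label]))

# Hypothetical labelled 2-D features: (mean speed, mean motion direction in radians).
training = {
    "walking": [[1.0, 0.1], [1.2, 0.0], [0.9, -0.1], [1.1, 0.05]],
    "running": [[3.0, 0.0], [3.4, 0.1], [2.8, -0.05], [3.2, 0.02]],
}
models = {label: fit_gaussian(samples) for label, samples in training.items()}
prediction = classify([3.1, 0.0], models)
print(prediction)
```

The labelled `training` dictionary is exactly what the supervised category needs and what the unsupervised category tries to do without.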
In the second category, methods discover patterns of activities and interactions by automatically and progressively analysing the streams. They build a dictionary of "words" from a set of features and descriptors extracted from the streams, and then exploit the way in which these words are associated, which reflects the behaviours and interactions present in the scene. Some techniques (LDA - Latent Dirichlet Allocation, PLSA - Probabilistic Latent Semantic Analysis, PLSM - Probabilistic Latent Sequential Motifs) perform a "semantic" analysis of the content and produce "topic models" that can remove many of the ambiguities introduced by the dictionary.
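To make the "topic model" idea more concrete, here is a minimal sketch of PLSA, the simplest of the models listed above, fitted by expectation-maximisation on a clip-by-word count matrix. It assumes the visual words have already been extracted and counted; the count values, the number of topics and the function names are illustrative assumptions, not part of the project.

```python
import random

def normalize(values):
    """Scale a list of positive numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM. counts[d][w] = occurrences of visual word w in clip d."""
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialisation of P(word | topic) and P(topic | clip).
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    for _ in range(n_iter):
        # Accumulators start at a tiny value so no probability becomes exactly 0.
        acc_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        acc_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: posterior over topics for this (clip, word) pair.
                post = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                # M-step accumulation, weighted by the observed count.
                for z in range(n_topics):
                    acc_w_z[z][w] += counts[d][w] * post[z]
                    acc_z_d[d][z] += counts[d][w] * post[z]
        p_w_z = [normalize(row) for row in acc_w_z]
        p_z_d = [normalize(row) for row in acc_z_d]
    return p_w_z, p_z_d

# Hypothetical counts: 4 video clips described over a 4-word dictionary.
counts = [
    [8, 6, 0, 0],
    [7, 9, 0, 0],
    [0, 0, 5, 7],
    [0, 0, 9, 6],
]
topics, mixtures = plsa(counts, n_topics=2)
for d, mix in enumerate(mixtures):
    print("clip", d, [round(p, 3) for p in mix])
```

Each row of `topics` is a distribution over visual words and each row of `mixtures` is a distribution over topics for one clip. Words that tend to occur together in the same clips are expected to end up in the same topic, which is how such models can reduce the ambiguity of individual dictionary words.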
This subject belongs to the second category. The selection of features (SIFT, SURF, regions, optical flow, etc.) and descriptors is important, since they must yield the most relevant dictionary for the behaviours and interactions to be discovered, including new ones. The first part of the work will therefore focus on methods for learning and re-learning sparse (parsimonious) dictionaries able to describe the observations with a "reduced" number of atoms. The second part of the work will be to segment the stream while taking into account the simultaneous presence of multiple objects, whether they are interacting closely or simply moving through the scene. The challenge will be to integrate the multi-temporal and multi-scale properties of what is happening in the monitored scene, each object having its own life, more or less correlated with that of one or more of its neighbours. The last phase of the work will focus on formalizing what an "abnormal" event is, in order to ensure its detection and localization.
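The idea of describing an observation with a reduced number of atoms can be sketched with matching pursuit, a standard greedy sparse coding scheme. The example below assumes a small, hand-written dictionary of unit-norm atoms (learning the dictionary itself is not shown) and uses the relative reconstruction error as one possible abnormality score. That score, the atoms and the descriptors are assumptions made for illustration, not the formalization the project will adopt.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(signal, atoms, max_atoms):
    """Greedy sparse code of `signal` over unit-norm `atoms`, in at most `max_atoms` steps."""
    residual = list(signal)
    code = [0.0] * len(atoms)
    for _ in range(max_atoms):
        # Pick the atom most correlated with what is still unexplained.
        correlations = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(correlations[k]))
        if abs(correlations[best]) < 1e-12:
            break  # no atom explains the remaining residual
        code[best] += correlations[best]
        residual = [r - correlations[best] * a
                    for r, a in zip(residual, atoms[best])]
    return code, residual

def abnormality_score(signal, atoms, max_atoms=2):
    """Relative reconstruction error: close to 0 when few atoms explain the signal."""
    _, residual = matching_pursuit(signal, atoms, max_atoms)
    return math.sqrt(dot(residual, residual)) / math.sqrt(dot(signal, signal))

# Hypothetical dictionary over a 4-bin motion-direction histogram
# (right, up, left, down); no atom covers the "down" direction.
atoms = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
usual = [0.8, 0.6, 0.0, 0.0]    # motion the dictionary can represent
unusual = [0.1, 0.0, 0.0, 0.9]  # mostly motion the dictionary has never seen
score_usual = abnormality_score(usual, atoms)
score_unusual = abnormality_score(unusual, atoms)
print(score_usual, score_unusual)
```

An observation that cannot be reconstructed from a few atoms gets a score close to 1. In an incremental setting such an observation could either be flagged as abnormal or be used to extend the dictionary, which is where the re-learning question raised above comes in.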
The candidate will benefit from the previous work of the two supervising laboratories, in particular the results obtained with "topic models" for the detection of dangerous situations at level crossings [CUI13]. The work will also build on incremental learning techniques for non-stationary data [AMA06] and on the identification of hybrid dynamic systems applied to the segmentation of the semantic content of cinematic sequences [BOU11]. From an operational point of view, new algorithms will be evaluated on audio/video databases for the interpretation of streams acquired on board railway vehicles (for security applications), in urban areas (for the safety of vulnerable road users) and at level crossings.
[AMA06] H. Amadou Boubacar, Classification dynamique de données non-stationnaires - apprentissage et suivi de classes évolutives, PhD, Ecole des Mines de Douai, 2006
[BOU11] K. Boukharouba, Modélisation et classification de comportements dynamiques des systèmes hybrides, PhD, Ecole des Mines de Douai, 2011
[CUI13] Y. Cui, Découverte automatique d'activités dans les séquences d'images par topic model, Master's thesis, Ifsttar, 2013
(c) GdR 720 ISIS - CNRS - 2011-2015.