

5 July 2017

Detection and analysis of audio-visual non-verbal behavioral events "in the wild"

Category: PhD student

Position overview

3-year PhD position in computer vision at Jean Monnet University, Saint-Etienne, in collaboration with INRIA Grenoble Rhône-Alpes: Detection and analysis of audio-visual non-verbal behavioral events "in the wild".

Starting date: Autumn 2017

Application deadline: August 31st 2017

Decision announcement date: September 15th 2017


Non-verbal behavioral events include facial expressions, body postures and gestures, vocal outbursts and social signals, such as mutual gaze and interpersonal distance [V09]. While there is a non-negligible amount of literature on social signal processing in laboratory conditions and/or single-person communicative situations, the understanding of these events in multi-party natural interplays using a robotic platform is a largely unexplored field of research.

This project will be led by the computer vision team of the Hubert Curien laboratory, UMR-CNRS 5516, Jean Monnet University, Saint-Etienne, in collaboration with the Perception team of INRIA Grenoble Rhône-Alpes. It will be funded by the ANR project muDialbot (MUlti-party situated DIALog for roBOT-patient interaction), whose main objective is to study the interaction of a robot (Pepper, from SoftBank Robotics) with humans in the waiting room of a day-care hospital.

PhD thesis subject

The Pepper robot is equipped with multiple sensors: RGB video, depth video and audio. From these data, the objective of this PhD thesis is to develop multi-modal fusion strategies able to extract and analyse non-verbal behavioral events when the robot is interacting with multiple persons at a distance ranging from one to four meters. We will emphasize the development of strategies able to exploit the temporal dynamics of the social interplay to robustly extract such cues. The methodological starting point will be at the crossroads of deep neural architectures [1] and dynamic probabilistic models such as Gaussian Process Dynamical Models (GPDM). Such a combination will benefit from the representational power of deep neural architectures and from the robustness and flexibility of probabilistic models, making it well suited to multi-party interaction analysis in real-world scenarios.


We are looking for a motivated student holding a Master's degree (by the 1st of October 2017) in computer science (or computer vision), with strong skills in applied mathematics.

A good background in software development (C++, Matlab or Python) and good English skills are required. Knowledge of image processing, video processing, signal processing and machine learning would also be appreciated.


Net salary: around 1400 euros without teaching activities and around 1650 euros with teaching activities (64 hours per year).

Application process

Your application should include the following documents:



[1] “GitHub awesome deep vision,” https://github.com/kjw0612/awesome-deep-vision.

[V09] A. Vinciarelli, M. Pantic, and H. Bourlard, “Social signal processing: Survey of an emerging domain,” Image and Vision Computing, vol. 27, pp. 1743–1759, 2009.



(c) GdR 720 ISIS - CNRS - 2011-2015.