Announcement


Taking non-verbal events into account to improve robot behavior - End-of-study Engineer or 2nd year Master internship offer

3 October 2023


Category: Intern



Title: Taking non-verbal events into account to improve robot behavior

Keywords: social robotics, non-verbal events, emotion recognition, context awareness, deep learning, reinforcement learning

Context: During this five- or six-month internship, you will take part in the ANR project "MUlti-party perceptually-active situated DIALog for human-roBOT interaction", or simply "muDialBot". The muDialBot project brings together five partners: the company ERM Automatismes Industriels, the Laboratoire d'Informatique d'Avignon (LIA), the Inria Grenoble Research Center, the AP-HP hospital center, and the Laboratoire Hubert Curien in Saint-Etienne. This consortium is currently developing a software system for the Pepper robot that aims to improve human-robot interaction by taking into account a key aspect of human behavior: non-verbal communication. Non-verbal cues can be extracted from audio recordings, through the way a person speaks (voice volume, intonation, speech rate, pronunciation, etc.) and the sounds a person makes (laughter, sniffing, etc.), or from images (posture, facial expressions, actions, etc.). A machine capable of extracting and exploiting this information from audiovisual data could therefore adapt to the emotional state of its interlocutor and make conversation more natural. Two situations are classically defined with respect to the robot's position:

  • The case where the robot is relatively far away from the person(s) (from 2 to 5 m, or even more), known as "far-range". Here, contextual information is available: the subject's posture, actions, the surrounding scene, objects, and so on. In the state of the art, this setting is known as "Context-Aware Emotion Recognition" [1]–[6]; a minimal illustrative sketch follows this list.
  • The case where the robot is close to a person, known as "close-range". Image information is then mainly limited to facial expression [7].
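
To make the context-aware idea of [1]–[6] concrete, the sketch below shows the generic two-stream recipe in PyTorch: one branch encodes a face or body crop (the dominant close-range cue), another encodes the full scene (the far-range context), and the two feature vectors are fused before emotion classification. This is only an illustration under assumed choices (the class name, the ResNet-18 backbones, and the seven-class output are invented for the example); it is not the BENet code of [8] nor the muDialBot system.

# Illustrative two-stream sketch of context-aware emotion recognition.
# All names and architectural choices here are assumptions, not the
# project's actual design.
import torch
import torch.nn as nn
from torchvision import models

class TwoStreamEmotionNet(nn.Module):
    def __init__(self, num_emotions: int = 7):
        super().__init__()
        # Branch 1: crop around the detected face/body (close-range cue)
        self.face_branch = models.resnet18(weights=None)
        self.face_branch.fc = nn.Identity()   # keep the 512-d features
        # Branch 2: the full frame (far-range context: scene, objects, posture)
        self.context_branch = models.resnet18(weights=None)
        self.context_branch.fc = nn.Identity()
        # Late fusion of the two 512-d feature vectors by concatenation
        self.classifier = nn.Linear(512 + 512, num_emotions)

    def forward(self, face_crop: torch.Tensor, full_image: torch.Tensor):
        f = self.face_branch(face_crop)       # (B, 512)
        c = self.context_branch(full_image)   # (B, 512)
        return self.classifier(torch.cat([f, c], dim=1))

# Usage: a batch of 2 face crops with their matching full frames
model = TwoStreamEmotionNet()
logits = model(torch.randn(2, 3, 224, 224), torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 7])

Concatenation is the simplest possible fusion; the models in [3]–[6] use richer fusion schemes, but the split between a person-centric stream and a context stream is the common thread.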

Objectives: During the internship, your overall aim will be to improve the robot's behavior, mainly by making better use of visual information. Depending on progress, an "audio" part could be added. To this end, you will work on several tasks:

  • Integrate into the system the latest work carried out at the Laboratoire Hubert Curien by a final-year PhD student on the analysis of human emotions from images, taking context into account [8]. This task also requires a clear understanding of how the system currently being developed by the LIA, in collaboration with ERM and Inria, works.
  • Participate in evaluating the contribution of non-verbal visual information to the current system in well-defined exchange situations with the robot, which can be linked to the "far-range" and "close-range" situations above.
  • The current system is based on deterministic rules. We would like to test a system capable of learning such rules through reinforcement learning [9], which would allow more latitude in taking non-verbal events into account (see the sketch after this list). This part will be carried out in collaboration with the LIA.
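
As a point of comparison with the current deterministic rules, the sketch below shows what "learning the rules" could look like with tabular Q-learning [9] on a toy version of the problem: states combine a recognized emotion with the far-range/close-range distinction, actions are candidate robot behaviors, and the reward is a random placeholder. Every state, action, and reward definition here is an assumption made for illustration; in practice the reward would come from the dialogue outcome, and the actual muDialBot design may differ entirely.

# Toy tabular Q-learning over non-verbal states; all states, actions
# and rewards below are invented for illustration.
import random
from collections import defaultdict

EMOTIONS = ["happy", "neutral", "annoyed"]      # e.g. from the vision module
RANGES = ["close-range", "far-range"]           # robot-to-person distance
ACTIONS = ["greet", "wait", "move_closer", "clarify"]

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2
Q = defaultdict(float)  # Q[(state, action)] -> estimated return

def choose_action(state):
    """Epsilon-greedy choice among the robot's candidate behaviors."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """Standard one-step Q-learning update."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# Toy interaction loop: a real reward would reflect the dialogue outcome
# (e.g. the person keeps engaging); here it is a random placeholder.
state = (random.choice(EMOTIONS), random.choice(RANGES))
for _ in range(1000):
    action = choose_action(state)
    reward = random.uniform(-1, 1)
    next_state = (random.choice(EMOTIONS), random.choice(RANGES))
    update(state, action, reward, next_state)
    state = next_state

print(max(ACTIONS, key=lambda a: Q[(("happy", "close-range"), a)]))

Once trained, the Q-table can be read back as a set of rules ("if the person looks annoyed at close range, clarify"), which is what makes this formulation a natural drop-in replacement for hand-written deterministic rules.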

Host laboratory: Laboratoire Hubert Curien UMR CNRS 5516, Saint-Etienne, France https://laboratoirehubertcurien.univ-st-etienne.fr/en/index.html

Candidate profile: Master's degree or engineering degree in computer science, preferably with knowledge of neural networks and deep learning. Experience in Python programming would also be highly appreciated.

Application: Send the following information to olivier.alata@univ-st-etienne.fr before 1 December 2023:

  • Curriculum Vitae,
  • Cover letter,
  • Grades for the last two years,
  • Email address(es) of one or two contact persons.

Internship allowance: €567 per 20 working days (about one month).

Bibliography

  1. L. F. Barrett, B. Mesquita, and M. Gendron, "Context in Emotion Perception," Current Directions in Psychological Science, vol. 20, no. 5, pp. 286–290, Oct. 2011, doi: 10.1177/0963721411422522.
  2. C. Chen, Z. Wu, and Y.-G. Jiang, "Emotion in Context: Deep Semantic Feature Fusion for Video Emotion Recognition," in Proceedings of the 24th ACM International Conference on Multimedia, New York, NY, USA, Oct. 2016, pp. 127–131, doi: 10.1145/2964284.2967196.
  3. J. Lee, S. Kim, S. Kim, J. Park, and K. Sohn, "Context-Aware Emotion Recognition Networks," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 10143–10152. https://openaccess.thecvf.com/content_ICCV_2019/html/Lee_Context-Aware_Emotion_Recognition_Networks_ICCV_2019_paper.html
  4. R. Kosti, J. M. Alvarez, A. Recasens, and A. Lapedriza, "Context Based Emotion Recognition Using EMOTIC Dataset," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 11, pp. 2755–2766, Nov. 2020, doi: 10.1109/TPAMI.2019.2916866.
  5. T. Mittal, P. Guhan, U. Bhattacharya, R. Chandra, A. Bera, and D. Manocha, "EmotiCon: Context-Aware Multimodal Emotion Recognition Using Frege's Principle," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 14234–14243. https://openaccess.thecvf.com/content_CVPR_2020/html/Mittal_EmotiCon_Context-Aware_Multimodal_Emotion_Recognition_Using_Freges_Principle_CVPR_2020_paper.html
  6. M.-H. Hoang, S.-H. Kim, H.-J. Yang, and G.-S. Lee, "Context-Aware Emotion Recognition Based on Visual Relationship Detection," IEEE Access, vol. 9, pp. 90465–90474, 2021, doi: 10.1109/ACCESS.2021.3091169.
  7. A. Toisoul, J. Kossaifi, A. Bulat, G. Tzimiropoulos, and M. Pantic, "Estimation of continuous valence and arousal levels from faces in naturalistic conditions," Nature Machine Intelligence, vol. 3, pp. 42–50, Jan. 2021. https://github.com/face-analysis/emonet
  8. T. Cladière, O. Alata, C. Ducottet, H. Konik, and A.-C. Legrand, "BENet: A Lightweight Bottom-Up Framework for Context-Aware Emotion Recognition," in Advanced Concepts for Intelligent Vision Systems (ACIVS), Kumamoto, Japan, 2023. https://github.com/TristanCladiere/BENet.git
  9. R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998. https://inst.eecs.berkeley.edu/~cs188/sp20/assets/files/SuttonBartoIPRLBook2ndEd.pdf