
Cifre PhD position IDEMIA+ENSEA: Federated learning with non-IID data

12 April 2023

Category: PhD student

Two Cifre PhD positions in federated learning (FL) are open in a newly formed joint IDEMIA and ENSEA research team working on machine learning and computer vision. We are seeking highly motivated candidates to develop robust FL algorithms that tackle data heterogeneity and noisy labels. The successful candidates will work towards making FL a more practical and efficient solution for real-world applications, with a particular focus on face recognition and related areas. As a PhD candidate in our group, you will take part in cutting-edge research and contribute to novel algorithms that can have a significant impact on society. You will benefit from the expertise of our team and from access to state-of-the-art resources and facilities.



With the growing volume of data and the need to preserve privacy, indiscriminately transmitting and aggregating data is no longer viable: bandwidth is costly and the risk of privacy breaches is high. As a solution, federated learning (FL) [1,3] has emerged to replace the traditional centralized learning paradigm while preserving data privacy. It has been successfully applied to real-world tasks such as health care [2] and smart cities [3].
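
As a rough illustration of the server-side step in FL, the sketch below averages client parameter vectors weighted by local dataset size, in the spirit of federated averaging [1]. The function name `fedavg` and the flat-list representation of model parameters are assumptions made for this example; only parameters, never raw data, are passed to the server.

```python
from typing import List, Sequence


def fedavg(client_params: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample counts."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_params[0])
    global_params = [0.0] * dim
    for params, n in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all clients must share the same parameter shape")
        for i, p in enumerate(params):
            # Each client contributes in proportion to its share of the data.
            global_params[i] += (n / total) * p
    return global_params


# Two hypothetical clients holding 1 and 3 samples respectively.
print(fedavg([[0.0, 4.0], [4.0, 0.0]], [1, 3]))  # prints [3.0, 1.0]
```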

One of the remaining challenges in FL is statistical heterogeneity, i.e., the data held by clients are not independent and identically distributed (non-IID). Since each client collects data according to its own preferences and constraints, data distributions can vary significantly across clients. Stochastic gradient descent (SGD)-based algorithms rely on IID sampling of the training data to ensure that the stochastic gradient is an unbiased estimate of the full gradient. In non-IID FL, this assumption breaks down, and minimizing the local empirical loss conflicts with minimizing the global loss. As a result, non-IID data often leads to divergent local models, degraded performance, and slow convergence of the global model. It is therefore critical to adopt appropriate fusion and learning techniques that combine clients’ models effectively, which is especially important for a robust implementation of FL.
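
To make the non-IID setting concrete, here is a minimal sketch of one common way to simulate label skew in experiments: the samples of each class are split across clients using proportions drawn from a Dirichlet distribution. The concentration parameter `alpha`, the function name, and the toy labels are assumptions for illustration; smaller values of `alpha` give more skewed clients.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


def dirichlet_partition(labels: Sequence[int], num_clients: int,
                        alpha: float, seed: int = 0) -> List[List[int]]:
    """Split sample indices across clients with Dirichlet-distributed label skew."""
    if num_clients <= 0 or alpha <= 0:
        raise ValueError("num_clients and alpha must be positive")
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    clients: List[List[int]] = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        indices = by_class[y]
        rng.shuffle(indices)
        # A Dirichlet draw is a vector of Gamma(alpha, 1) samples, normalised.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(gammas)
        if total > 0:
            shares = [g / total for g in gammas]
        else:  # numerical underflow for very small alpha: fall back to uniform
            shares = [1.0 / num_clients] * num_clients
        start, cumulative = 0, 0.0
        for c, share in enumerate(shares):
            cumulative += share
            if c == num_clients - 1:
                end = len(indices)  # last client takes the remainder
            else:
                end = min(len(indices),
                          max(start, round(cumulative * len(indices))))
            clients[c].extend(indices[start:end])
            start = end
    return clients


# Toy data: 200 samples over 4 classes, split across 5 hypothetical clients.
toy_labels = [i % 4 for i in range(200)]
for client_id, part in enumerate(dirichlet_partition(toy_labels, 5, alpha=0.1)):
    print(client_id, dict(Counter(toy_labels[i] for i in part)))
```

Printing the per-client class counts shows that, for small `alpha`, most clients see only one or two classes, which is the situation in which local and global objectives diverge.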

In this project, we will explore and employ advanced aggregation methods and learning techniques to enhance the performance of the global model in non-IID settings. By leveraging strategies such as model distillation, meta-knowledge condensation, transfer learning, and meta-learning, we aim to overcome the challenges posed by non-IID data and develop a more robust and reliable FL system.

Existing model aggregation methods often rely on constraining the direction of local model updates to align the local and global optimal points [4]. While these methods aim to make the local models consistent with the global model, their aggregation techniques are often simplistic. Knowledge distillation (KD) has emerged as an effective solution to improve model aggregation efficiency. Originally designed for model compression, KD [5] uses a teacher-student paradigm to learn a lightweight student model using knowledge distilled from one or more powerful teachers. When applied in FL to tackle client heterogeneity [6], KD techniques treat each client model as a teacher and distill its information into the student (global) model to improve its generalization performance. However, these methods overlook the incompatibility of local knowledge and may cause the forgetting of knowledge in the global model [7, 8].
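
The teacher-student idea can be sketched as follows: client (teacher) logits are softened with a temperature, averaged into an ensemble target, and compared with the global (student) prediction through a KL divergence. The plain averaging of teachers, the temperature value, and all function names are assumptions for illustration; this is not the specific procedure of [5] or [6].

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_distillation_loss(teacher_logits: Sequence[Sequence[float]],
                               student_logits: Sequence[float],
                               temperature: float = 2.0) -> float:
    """KL(ensemble of teachers || student) on softened class probabilities."""
    teacher_probs = [softmax(t, temperature) for t in teacher_logits]
    num_teachers = len(teacher_probs)
    # Assumed aggregation rule: uniform average of the teachers' soft labels.
    target = [sum(p[c] for p in teacher_probs) / num_teachers
              for c in range(len(student_logits))]
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(target, student) if t > 0.0)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return temperature ** 2 * kl


# Two hypothetical client models that favour class 0, and a student that does not.
print(ensemble_distillation_loss([[3.0, 0.0, 0.0], [2.5, 0.5, 0.0]],
                                 [0.0, 0.0, 3.0]))
```

The loss is zero when the student reproduces the averaged teacher distribution and grows as the two disagree. A uniform average ignores how compatible the teachers are with each other, which is exactly the kind of limitation noted above.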

To address these limitations, this project proposes a multi-faceted approach. Firstly, we will develop algorithms that dynamically correct the local drift of the gradient to reduce its impact on the global objective, leading to faster convergence and better performance of the global model. Secondly, we will consider enhancing KD-based model aggregation methods by extracting more comprehensive knowledge from clients, including hard data samples, gradient distributions, and feature prototypes (as in our recent work [16]). In addition, we intend to use dataset and meta-knowledge condensation techniques [10, 11, 12, 13] to further enhance the aggregation process. Thirdly, to align heterogeneous clients, we will take into account the differences in knowledge distribution among clients, which can provide a valuable cue. To this end, various similarity measures, such as earth mover's distance (EMD), centered kernel alignment (CKA), and Procrustes distance-based measures [9], can be used to compare different representations and distributions. Furthermore, we will investigate the effectiveness of the recent re-basin technique [14] for merging and aligning different models.
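
Of the similarity measures named above, linear CKA is simple enough to sketch in a few lines. The example assumes that two client models are fed the same probe inputs and that their features are stored as row-major matrices (one row per input); the function names and the toy features are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def _center_columns(x: Matrix) -> List[List[float]]:
    """Subtract the mean of each feature (column) across samples (rows)."""
    n = len(x)
    means = [sum(col) / n for col in zip(*x)]
    return [[v - m for v, m in zip(row, means)] for row in x]


def _cross_sq_norm(a: Matrix, b: Matrix) -> float:
    """Squared Frobenius norm of a^T b for row-major matrices a and b."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(u * v for u, v in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(x: Matrix, y: Matrix) -> float:
    """Linear CKA between two feature matrices computed on the same inputs."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of rows")
    xc, yc = _center_columns(x), _center_columns(y)
    denom = math.sqrt(_cross_sq_norm(xc, xc) * _cross_sq_norm(yc, yc))
    if denom == 0.0:
        raise ValueError("features must not be constant across samples")
    return _cross_sq_norm(yc, xc) / denom


# Features of client A, and the same features rotated by 90 degrees (client B).
feats_a = [[1.0, 2.0], [3.0, 1.0], [0.0, 5.0], [4.0, 4.0]]
feats_b = [[-2.0, 1.0], [-1.0, 3.0], [-5.0, 0.0], [-4.0, 4.0]]
print(linear_cka(feats_a, feats_b))
```

Because linear CKA is invariant to orthogonal transformations and isotropic scaling of the features, the rotated copy scores approximately 1 (up to floating-point error), whereas unrelated representations score closer to 0.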

In real-life FL, another scenario related to non-IID data is federated continual learning [15]. Here, local clients continuously collect new data containing new classes while having limited memory to store old classes, and new clients with unseen classes may join FL training. In addition, the training and test data do not always come from the same distribution; the resulting domain shift leads to catastrophic forgetting at both the local clients and the global model. To address these challenges, we will build on our recent work [17], which addressed forgetting in the centralized (local) setting. Our goal is to develop new techniques that handle both local and global forgetting caused by non-IID class imbalance across clients, while improving the performance and robustness of the global model. Specifically, we will investigate how to mitigate the impact of domain shift by developing robust knowledge distillation and feature extraction techniques across clients with different data distributions. We will also explore model adaptation and transfer learning to improve the performance of the global model.
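
The memory constraint on clients can be illustrated with a fixed-size rehearsal buffer filled by reservoir sampling, a standard baseline in continual learning. This is only a sketch of the constraint, not the method of [15] or [17]; the class name `ReservoirBuffer` and its interface are hypothetical.

```python
import random
from typing import Generic, List, TypeVar

T = TypeVar("T")


class ReservoirBuffer(Generic[T]):
    """Fixed-capacity memory holding a uniform random sample of a data stream."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.items: List[T] = []
        self.seen = 0  # number of stream items observed so far
        self._rng = random.Random(seed)

    def add(self, item: T) -> None:
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
            return
        # Keep the new item with probability capacity / seen,
        # overwriting a uniformly chosen slot.
        j = self._rng.randrange(self.seen)
        if j < self.capacity:
            self.items[j] = item


# A client sees 100 samples but can only store 5 of them.
memory: ReservoirBuffer[int] = ReservoirBuffer(capacity=5)
for sample_id in range(100):
    memory.add(sample_id)
print(memory.items)
```

Every sample seen so far has the same probability of being in the buffer, so old classes remain represented as new ones arrive, although with fewer and fewer examples per class.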



  • Strong background in computer science, mathematics, or a related field.
  • Candidates with experience in machine learning, computer vision, and/or signal processing are especially encouraged to apply.
  • Proficient programming skills (Python, TensorFlow, PyTorch).


Further information

  • If candidates wish, an internship or a fixed-term contract (CDD) can be considered while waiting for the PhD position to start in September/October 2023.
  • The position will be based at the Cergy (ENSEA) and/or Courbevoie (IDEMIA) sites.
  • Competitive salary
  • Advisors: Ngoc-Son Vu, Damien Monet, Stephane Gentric, Aymeric Histace
  • How to apply: please send your CV, motivation letter, two reference letters, and grades (in English or French) to Ngoc-Son Vu (, and/or via

IDEMIA leads global identity and security, empowering you to assert your identity in a safe and simple future. Our world-class products serve finance, telecom, retail, government, and more. We use cutting-edge technology to deliver top-quality services to agencies and tech companies, impacting citizens and nations worldwide.

ETIS is a joint research department of ENSEA, CY Cergy Paris University, and CNRS. Represented by ENSEA, a top French electrical engineering and computing science graduate school, ETIS employs over 150 researchers and contributes to numerous EU and French-funded projects in AI and machine learning.



[1] Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, et al. “Communication-efficient learning of deep networks from decentralized data”, AISTATS 2017

[2] Quande Liu, Cheng Chen, Jing Qin, Qi Dou, and Pheng-Ann Heng. “Feddg: Federated domain generalization on medical image segmentation via episodic learning in continuous frequency space”, CVPR 2021.


[4] L. Zhang, L. Shen, L. Ding, D. Tao, L. Duan, “Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning”, CVPR 2022

[5] G. Hinton, O. Vinyals, J. Dean, “Distilling the knowledge in a neural network” arXiv:1503.02531, 2015

[6] D. Li, J. Wang, “FedMD: Heterogeneous Federated Learning via Model Distillation”, NeurIPS Workshop 2019

[7] S. Karimireddy, S. Kale, M. Mohri, S. Reddi, S. Stich, and A. T. Suresh. “Scaffold: Stochastic controlled averaging for federated learning”. In ICML 2020.

[8] Q. Li, B. He, and D. Song, “Model contrastive federated learning”. In CVPR 2021

[9] F. Ding, J.-S. Denain, and J. Steinhardt, “Grounding representation similarity through statistical testing”, NeurIPS 2021

[10] T. Dong, B. Zhao, L. Lyu, “Privacy for Free: How does Dataset Condensation Help Privacy?”, ICML 2022.

[11] Yuanhao Xiong, Ruochen Wang, Minhao Cheng, Felix Yu, Cho-Jui Hsieh, “FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning”, CVPR 2023

[12] Ping Liu, Xin Yu, Joey Tianyi Zhou, “Meta Knowledge Condensation for Federated Learning”, ICLR 2023


[14] S. Ainsworth, J. Hayase, S. Srinivasa, “Git Re-Basin: Merging Models modulo Permutation Symmetries”, ICLR 2023

[15] J. Dong et al. “Federated Class-Incremental Learning”, CVPR 2022


Some references of the group

[16] L. Jezequel, N.-S. Vu, J. Beaudet, A. Histace, “Anomaly Detection via Multi-Scale Contrasted Memory”. preprint 2023

[17] J. Pourcel, N.-S. Vu, R.M. French, “Online Task-free Continual Learning with Dynamic Sparse Distributed Memory”, ECCV 2022

[18] J.-R. Conti, N. Noiry, S. Clémençon, V. Despiegel, S. Gentric, “Mitigating Gender Bias in Face Recognition using the von Mises-Fisher Mixture Model”, ICML 2022

[19] L. Jezequel, N.-S. Vu, J. Beaudet, A. Histace, “Hyperbolic Adversarial Learnable Tasks for Anomaly Detection via Multi-Scale Contrasted Memory”. preprint 2023

[20] R. Marriott, S. Romdhani, L. Chen, “A 3D GAN for improved large-pose facial recognition”, CVPR 2021

[21] J.-R. Conti, S. Clémençon, “Assessing Performance and Fairness Metrics in Face Recognition - Bootstrap Methods”, TSRML, NeurIPS 2022

[22] N. Larue, N.-S. Vu, V. Struc, P. Peer, V. Christophides, “SeeABLE: Soft Discrepancies and Bounded Contrastive Learning for Exposing Deepfakes”, preprint 2023

[23] L. Jezequel, N.-S. Vu, J. Beaudet, A. Histace, “Efficient anomaly detection using self-supervised multi-cue tasks”. IEEE Trans. Image Processing 2023