Topic description
Context
Federated learning (FL) enables models to learn from datasets distributed across diverse clients (e.g., edge devices, hospitals, or industrial sites) while preserving privacy [1]. A major challenge in supervised FL is statistical heterogeneity, which causes client distribution shifts, i.e., differences in data distributions across clients [2]. Additionally, many clients hold unlabeled data that remain unused due to annotation costs, inconsistent labeling protocols, and model updates. As a result, FL with unlabeled data has gained attention, with approaches that use clustering techniques to correct client shifts [3,4].
Since domain shifts in Domain Adaptation (DA) are analogous to client shifts in FL, federated domain adaptation has naturally emerged as a way to transfer methodological advances from DA to the problem of distribution shifts [5]. DA seeks to learn a domain-invariant feature space between labeled source and target domains, whereas unsupervised domain adaptation handles the more challenging case where target data are unlabeled [6]. Recent studies have incorporated DA techniques into FL, primarily in centralized settings (e.g., at the server), to address evolving distribution shifts [7]. Among these methods, Source-Free Domain Adaptation (SFDA) is a promising direction: a client shares only a trained model with another party, and the recipient performs unsupervised adaptation using the shared model and its own data, without access to the source data [8].
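As an illustration of what unsupervised adaptation on the recipient's side can look like, the sketch below computes an information-maximization objective, one loss commonly used in the SFDA literature. It is not necessarily the objective used in the framework mentioned in this offer; the function names and the toy probabilities are assumptions made for the example.

```python
import math


def entropy(probs, eps=1e-12):
    """Shannon entropy of one probability vector (eps avoids log(0))."""
    return -sum(p * math.log(p + eps) for p in probs)


def information_maximization_loss(batch_probs):
    """Mean per-sample entropy minus entropy of the batch-averaged prediction.

    batch_probs: list of softmax outputs of the shared model on unlabeled
    target samples (hypothetical input for this sketch).
    Lower is better: predictions should be confident individually
    (low per-sample entropy) and varied across the batch
    (high entropy of the mean prediction).
    """
    n = len(batch_probs)
    k = len(batch_probs[0])
    mean_entropy = sum(entropy(p) for p in batch_probs) / n
    marginal = [sum(p[j] for p in batch_probs) / n for j in range(k)]
    return mean_entropy - entropy(marginal)


# Toy comparison: confident and diverse predictions vs. collapsed predictions.
diverse = [[1.0, 0.0], [0.0, 1.0]]
collapsed = [[1.0, 0.0], [1.0, 0.0]]
loss_diverse = information_maximization_loss(diverse)
loss_collapsed = information_maximization_loss(collapsed)
```

Note that the second term is largest when the averaged prediction is uniform over classes, so an objective of this kind implicitly favours balanced predictions. This is one reason class imbalance in the unlabeled target data deserves specific attention.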
Finally, performance degradation and poor generalization are further exacerbated by class imbalance in unlabeled data, where majority classes dominate while minority classes are underrepresented. In supervised FL, this issue can be mitigated by estimating clients’ class distributions and applying loss-based adjustments [9]. However, such strategies are not directly applicable in unsupervised settings, where class labels are unavailable.
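To make the supervised-case adjustment concrete, here is a minimal sketch of inverse-frequency class weighting applied to a cross-entropy loss. The class counts are hypothetical, and this weighting scheme is one common choice rather than the specific method of [9].

```python
import math


def inverse_frequency_weights(class_counts):
    """Per-class weights proportional to 1 / class frequency.

    class_counts: estimated number of samples per class on a client
    (hypothetical values in this sketch). Empty classes get weight 0.
    """
    total = sum(class_counts)
    n_classes = len(class_counts)
    return [total / (n_classes * c) if c > 0 else 0.0 for c in class_counts]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def weighted_cross_entropy(logits, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    probs = softmax(logits)
    return -weights[label] * math.log(probs[label])


# A client with 90 majority-class and 10 minority-class samples.
weights = inverse_frequency_weights([90, 10])
loss_minority = weighted_cross_entropy([0.0, 0.0], 1, weights)
loss_majority = weighted_cross_entropy([0.0, 0.0], 0, weights)
```

Because the weight is selected by the true label of each sample, the same computation cannot be applied as-is when labels are unavailable.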
Objectives
The main objectives of this internship are as follows. First, to become familiar with an existing federated learning framework that includes the privacy-preserving Source-Free Domain Adaptation (SFDA) approach. Second, to study the effect of class imbalance on performance in federated learning. Third, to investigate methods used in supervised domain adaptation and federated learning to address class imbalance [1,2]. Finally, based on the insights gained from the previous objectives, the candidate will propose new strategies.
The experiments will be conducted on publicly available time-series datasets, including sensor data from accelerometers, gyroscopes, and body sensors collected from multiple subjects (e.g., Human Activity Recognition and EEG signal datasets). Additionally, experiments may be performed on well-known image classification datasets to further evaluate the proposed approach.
Starting date
Funding category
Public / private mixed funding
Funding further details
Leveraging Domain Adaptation in Unsupervised Federated Learning • Rouen, Normandie, FR