Countering Attacks in Federated Learning

Federated Learning (FL) is a method for training AI models without pooling the training data in one place. Rather than sending sensitive data to a central server, FL keeps data local and shares only model updates. This approach enhances privacy and lets training run closer to where the data is generated.

That said, the distribution of computation and data across multiple devices introduces new security vulnerabilities. Malicious actors can infiltrate the training process and subtly manipulate it, resulting in reduced accuracy, biased outcomes, or the introduction of hidden backdoors in the model.

In this initiative, we aimed to explore how to detect and counter such attacks in FL. To achieve this, we developed a multi-node simulator that allows researchers and industry professionals to replicate attacks and assess defenses more effectively.

Importance of This Research

  • An Everyday Example: Imagine a communal recipe book where chefs from various restaurants contribute. Each chef tweaks a few recipes based on their expertise. A dishonest chef might intentionally insert incorrect ingredients to ruin a dish or subtly add a secret flavor only they can rectify. Without careful scrutiny of the recipes, diners in all restaurants could end up with compromised meals.
  • A Technical Illustration: This parallels how data poisoning (manipulating training examples) and model poisoning (altering weight updates) manifest in FL. Such attacks are particularly harmful when the federation has non-IID data distributions, unbalanced data partitions, or late-joining clients. Current defenses such as Multi-KRUM, Trimmed Mean, and Divide and Conquer may still fall short in certain contexts.

Creating the Multi-Node FL Attack Simulator

To assess the resilience of federated learning against real-world threats, we constructed a multi-node attack simulator based on the Scaleout Systems FEDn framework. This simulator allows for the reproduction of attacks, testing of defenses, and scaling of experiments with hundreds or thousands of clients in a controlled setting.

Key Features:

  • Adaptable Deployment: Executes distributed FL jobs using Kubernetes, Helm, and Docker.
  • Realistic Data Environments: Supports IID/non-IID label distributions, unbalanced data partitions, and late-joining clients.
  • Attack Injection: Implements common poisoning attacks (Label Flipping, Little is Enough) and allows for easy definition of new attacks.
  • Defense Assessment: Integrates existing aggregation techniques (FedAvg, Trimmed Mean, Multi-KRUM, Divide and Conquer) and supports experimentation with other defensive strategies and aggregation rules.
  • Scalable Experimentation: Simulation parameters, such as the number of clients, malicious share, and participation patterns, can be adjusted from a single configuration file.
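To make the non-IID setting concrete, here is a minimal sketch of one common way to create label-skewed client partitions: drawing per-class client shares from a Dirichlet distribution. This is an illustrative assumption, not necessarily how the simulator partitions data; the function name `dirichlet_partition` and the parameters `alpha` and `seed` are hypothetical.

```python
import numpy as np


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with label skew controlled by alpha.

    Hypothetical helper: a small alpha gives strongly non-IID partitions,
    a large alpha approaches an IID split.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        # Share of this class that each client receives.
        shares = rng.dirichlet(alpha * np.ones(num_clients))
        cut_points = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cut_points)):
            client_indices[client_id].extend(part.tolist())
    return client_indices


# Toy dataset: 10 classes with 100 samples each, split across 5 clients.
labels = np.repeat(np.arange(10), 100)
parts = dirichlet_partition(labels, num_clients=5, alpha=0.3)
for client_id, part in enumerate(parts):
    print(client_id, len(part), np.bincount(labels[part], minlength=10))
```

Because the shares are random, the partition sizes also differ between clients, which gives a rough picture of unbalanced data volumes at the same time.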

By utilizing the FEDn architecture, simulations benefit from robust training orchestration and client management, along with visual monitoring through the Studio web interface.

Additionally, it’s worth noting that the FEDn framework supports Server Functions, enabling the implementation of new aggregation strategies to be evaluated using the attack simulator.

To get started with the initial example project using FEDn, refer to this quickstart guide.

The FEDn framework is available free of charge for academic and research projects, and for industrial testing and trials.

The attack simulator is accessible and ready to use as open-source software.

Types of Attacks Investigated

  • Label Flipping (Data Poisoning) – Malicious clients alter labels in their local datasets, such as changing “cat” to “dog” to lower model accuracy.
  • Little is Enough (Model Poisoning) – Attackers make minor yet targeted adjustments to their model updates, steering the global model output toward their objectives. In this study, we implemented the Little is Enough attack every third round.
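The two attacks above can be sketched in a few lines of NumPy. This is a simplified illustration rather than the simulator's code: `flip_labels` and `little_is_enough_update` are hypothetical names, the class ids are made up, and the Little is Enough variant shown is the commonly described form in which colluding clients shift each coordinate of their mean update by a small multiple `z` of the standard deviation. Here `z` is simply a parameter.

```python
import numpy as np


def flip_labels(labels, source, target):
    """Data poisoning: return a copy where every `source` label becomes `target`."""
    labels = np.asarray(labels).copy()
    labels[labels == source] = target
    return labels


def little_is_enough_update(colluding_updates, z):
    """Model poisoning: craft one update as mean - z * std, per coordinate.

    `colluding_updates` are flat parameter vectors computed honestly by the
    colluding clients; `z` controls how far the crafted update drifts
    (assumed to be chosen by the attacker).
    """
    stacked = np.stack(colluding_updates)
    return stacked.mean(axis=0) - z * stacked.std(axis=0)


CAT, DOG = 3, 5  # hypothetical class ids
print(flip_labels([3, 5, 3, 1], CAT, DOG))  # [5 5 5 1]

updates = [np.array([1.0, 2.0]), np.array([3.0, 2.0])]
print(little_is_enough_update(updates, z=1.0))  # [1. 2.]
```

A small `z` keeps the crafted update inside the spread of honest updates, which is what makes this kind of attack hard for outlier-based defenses to detect.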

Exploring Unintentional Impacts Beyond Attacks

While this analysis primarily focuses on intentional attacks, it also sheds light on the repercussions of unintentional contributions stemming from misconfigurations or device failures in large-scale federations.

Using the recipe analogy, an honest chef may accidentally use the wrong ingredient due to a malfunctioning oven or inaccurate scale. Though the mistake is unintentional, it may still adversely affect the shared recipe if repeated by multiple contributors.

In cross-device or fleet learning environments where thousands or millions of diverse devices contribute to a communal model, issues like faulty sensors, outdated configurations, or unstable connections can hinder model performance similarly to malicious attacks. Examining attack resilience also highlights how to enhance aggregation rules against unintentional noise.

Explaining Mitigation Strategies

In FL, aggregation rules determine how to consolidate model updates from clients. Robust aggregation strategies aim to lessen the impact of outliers, whether resulting from malicious attacks or device errors. Here are the strategies we’ve evaluated:

  • FedAvg (baseline) – Averages all updates without any filtering. Highly susceptible to attacks.
  • Trimmed Mean (TrMean) – Sorts each parameter across clients, discarding the highest and lowest values before averaging. Reduces the influence of extreme outliers but may overlook subtle attacks.
  • Multi-KRUM – Scores each update by its distance to its nearest neighbors in parameter space and retains the k updates with the smallest total distance. Highly sensitive to the number of updates selected (k).
  • EE-Trimmed Mean (EE-TrMean, newly developed) – A dynamic version of TrMean that employs epsilon-greedy scheduling to determine when to evaluate different client subsets. More resilient to changing client behaviors, late arrivals, and non-IID distributions.
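As a rough illustration of the first three rules, the sketch below implements them for flat parameter vectors in NumPy. It follows the textbook definitions and is not the simulator's implementation; the function names and the parameters `trim_count` and `num_malicious` are assumptions made for this example.

```python
import numpy as np


def fedavg(updates, weights=None):
    """Plain (optionally weighted) average of all client updates."""
    return np.average(np.stack(updates), axis=0, weights=weights)


def trimmed_mean(updates, trim_count):
    """Per coordinate, drop the trim_count largest and smallest values, then average."""
    stacked = np.sort(np.stack(updates), axis=0)
    n = stacked.shape[0]
    if 2 * trim_count >= n:
        raise ValueError("trim_count removes every client")
    return stacked[trim_count:n - trim_count].mean(axis=0)


def multi_krum(updates, num_malicious, k):
    """Average the k updates with the smallest summed distance to their neighbors."""
    stacked = np.stack(updates)
    n = stacked.shape[0]
    num_neighbors = n - num_malicious - 2
    if num_neighbors < 1:
        raise ValueError("too few clients for the assumed number of attackers")
    diffs = stacked[:, None, :] - stacked[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=-1)
    # After sorting each row, column 0 is the zero distance of an update to itself.
    scores = np.sort(sq_dists, axis=1)[:, 1:num_neighbors + 1].sum(axis=1)
    selected = np.argsort(scores)[:k]
    return stacked[selected].mean(axis=0), selected


# Four similar benign updates and one obvious outlier (index 4).
updates = [np.array(u) for u in
           ([1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [10.0, -10.0])]
print("FedAvg:      ", fedavg(updates))
print("Trimmed Mean:", trimmed_mean(updates, trim_count=1))
print("Multi-KRUM:  ", multi_krum(updates, num_malicious=1, k=3))
```

In this toy case the outlier drags FedAvg far from the benign updates, while Trimmed Mean and Multi-KRUM both stay close to them. Note that Trimmed Mean discards individual coordinate values, whereas Multi-KRUM accepts or rejects whole client updates.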

The tables and plots presented in this post were initially designed by the Scaleout team.

Experimental Results

Through 180 experiments, we assessed various aggregation strategies under different attack types, malicious client ratios, and data distributions. For comprehensive insights, please refer to the full thesis here.

The table above illustrates one series of experiments using a label-flipping attack with non-IID, partially imbalanced data distributions. It displays Test Accuracy and Test Loss AUC, calculated across all participating clients. Results for each aggregation strategy are presented in two rows, corresponding to two late-joining policies (either benign clients joining from the 5th round or malicious clients joining from the 5th round). Results are categorized by three levels of malicious participation, yielding six experimental setups per aggregation strategy. The top results in each setup are highlighted in bold.
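For readers unfamiliar with an AUC-style summary of a training curve, one plausible reading is the area under the per-round metric curve, which condenses a whole run into a single number. The sketch below assumes that interpretation with unit spacing between rounds and the trapezoidal rule; the exact definition used in the experiments may differ, and `curve_auc` is a hypothetical helper.

```python
def curve_auc(values):
    """Area under a per-round metric curve, assuming unit spacing between rounds."""
    values = [float(v) for v in values]
    return sum((a + b) / 2.0 for a, b in zip(values, values[1:]))


test_loss_per_round = [2.0, 1.0, 0.5]  # made-up numbers for illustration
print(curve_auc(test_loss_per_round))  # 2.25
```

Under this reading, a lower loss AUC means the loss dropped earlier and stayed lower across rounds, not just that the final value was good.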

Even though the table suggests a uniform response across all defense strategies, individual plots reveal a different narrative. In FL, while a federation might achieve a specific accuracy level, it’s crucial to scrutinize client participation—specifically, which clients contributed effectively to training and which were identified as malicious. The following plots demonstrate client participation under various defense strategies.

Fig-1: TrMean – Label Flipping – non-IID Partially Imbalanced – 20% Malicious activity

With 20% malicious clients involved in a label-flipping attack on non-IID, partially imbalanced data, Trimmed Mean (Fig-1) maintained overall accuracy but did not fully exclude any client from contributing. While coordinate trimming diminished the effect of malicious updates, it filtered parameters individually instead of excluding entire clients, enabling both benign and malicious contributors to remain in the aggregation throughout the training process.

In a scenario featuring 30% late-joining malicious clients and non-IID, imbalanced data, Multi-KRUM (Fig-2) accepted a malicious update starting from round 5. Significant data heterogeneity made the benign updates look less similar to one another, which let the malicious update rank among the most central ones. With k=3, it made up one-third of the updates aggregated into the global model for the remainder of the training period.

Fig-2: Multi-KRUM – Label Flipping Attack – non-IID Imbalanced – 30% Malicious Activity (k=3)

The Need for Adaptive Aggregation Strategies

Existing robust aggregation strategies predominantly rely on static thresholds to determine which client updates to incorporate into the new global model. This limitation makes them vulnerable to late-joining clients, non-IID data distributions, or disparities in data volume among clients. These insights prompted the development of EE-Trimmed Mean (EE-TrMean).

EE-TrMean: An Epsilon-Greedy Aggregation Strategy

EE-TrMean builds upon the traditional Trimmed Mean but introduces an exploration vs. exploitation layer based on an epsilon-greedy approach for client selection.

  • Exploration Phase: All clients are permitted to contribute, and a standard Trimmed Mean aggregation is conducted.
  • Exploitation Phase: Only the clients that have been trimmed the least contribute. They are selected with a scoring system based on their participation in previous rounds.
  • Transitioning between the two phases is governed by the epsilon-greedy policy with a decaying epsilon and an alpha ramp.

Every client earns a score based on whether its parameters survive the trimming process in each round. Over time, the algorithm will increasingly favor the highest-scoring clients, while intermittently exploring others to monitor behavioral shifts. This adaptive methodology allows EE-TrMean to enhance resilience in instances of high data heterogeneity and malicious activity.
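The scoring and phase switching described above can be sketched as follows. This is a simplified interpretation, not the published algorithm. It assumes that every client sends a flat update each round in a fixed order, that a client's score grows by the fraction of its coordinates that survive trimming, that epsilon decays multiplicatively, and that the exploitation phase simply averages the top-scoring clients. The alpha ramp is not specified in this post, so it is left out. All names are hypothetical.

```python
import numpy as np


def trimmed_mean_with_survival(updates, trim_count):
    """Coordinate-wise trimmed mean plus each client's share of surviving coordinates."""
    stacked = np.stack(updates)
    n = stacked.shape[0]
    order = np.argsort(stacked, axis=0)        # client ranking per coordinate
    kept = order[trim_count:n - trim_count]    # client ids that survive trimming
    aggregate = np.take_along_axis(stacked, kept, axis=0).mean(axis=0)
    survival = np.array([(kept == c).any(axis=0).mean() for c in range(n)])
    return aggregate, survival


class EpsilonGreedyTrimmedMean:
    """Sketch of an explore/exploit wrapper around Trimmed Mean (assumed design)."""

    def __init__(self, num_clients, trim_count, keep, epsilon=1.0, decay=0.9, seed=0):
        self.scores = np.zeros(num_clients)
        self.trim_count, self.keep = trim_count, keep
        self.epsilon, self.decay = epsilon, decay
        self.rng = np.random.default_rng(seed)

    def aggregate(self, updates):
        explore = self.rng.random() < self.epsilon
        self.epsilon *= self.decay             # explore less often over time
        if explore:
            # Exploration: everyone contributes and scores are refreshed.
            aggregate, survival = trimmed_mean_with_survival(updates, self.trim_count)
            self.scores += survival
            return aggregate, "explore"
        # Exploitation: only the highest-scoring clients contribute.
        top = np.argsort(self.scores)[::-1][:self.keep]
        return np.mean([updates[i] for i in top], axis=0), "exploit"


updates = [np.array(u) for u in
           ([1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.2], [10.0, -10.0])]
aggregator = EpsilonGreedyTrimmedMean(num_clients=5, trim_count=1, keep=3)
for round_id in range(5):
    result, phase = aggregator.aggregate(updates)
    print(round_id, phase, result, aggregator.scores)
```

In this toy run the outlier client (index 4) is trimmed in every coordinate, so its score stays at zero and it is never picked during exploitation. A benign client with unusual data can also lose score, which mirrors the occasional exclusion of benign clients under high heterogeneity.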

Fig-3: EE-TrMean – Label Flipping – non-IID Partially Imbalanced – 20% Malicious activity

In a label-flipping scenario involving 20% malicious clients and late benign joiners on non-IID, partially imbalanced data, EE-TrMean (Fig-3) alternated between exploration and exploitation phases, initially allowing all clients to contribute and then selectively blocking those with low scores. It occasionally excluded a benign client because of data heterogeneity, but it still outperformed the well-known strategies and effectively minimized the influence of malicious clients throughout training. This simple addition to Trimmed Mean steers aggregation toward the clients whose updates consistently survive trimming. The literature indicates that, provided the majority of clients remain honest, the model’s accuracy can be sustained.




Alex Parker
