# Advanced FL Strategies
This page covers Byzantine-robust aggregation (defending against malicious or faulty clients) and specialized training objectives (AUC, fairness, tail-risk) for federated learning.
## Byzantine Robustness
Byzantine-robust FL protects training against adversarial, faulty, or poisoned client updates. With standard FedAvg, a single malicious client can dominate the aggregated update by sending gradients with 100x the magnitude of honest updates.
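A toy example (plain NumPy, hypothetical numbers) makes this concrete: a single scaled, sign-flipped update flips the sign of the FedAvg aggregate for nine honest clients.

```python
import numpy as np

# Nine honest clients send small updates; one Byzantine client
# sends the honest direction sign-flipped and scaled by 100x.
honest = [np.array([0.1, -0.2]) for _ in range(9)]
byzantine = np.array([0.1, -0.2]) * -100

updates = honest + [byzantine]
fedavg = np.mean(updates, axis=0)

print(fedavg)  # [-0.91  1.82] -- opposite sign of the honest mean [0.1, -0.2]
```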
### Threat model
Byzantine clients deviate from the expected protocol due to malicious intent (crafted updates to degrade the model), data poisoning (corrupted local data), model poisoning (directly manipulated gradients or backdoor injection), or hardware/software faults (garbage updates from memory errors or numerical overflow).
Common attacks include gradient scaling, sign flipping, backdoor injection, label flipping, free-riding, and Sybil attacks.
### Robust aggregation strategies
Octomil provides four Byzantine-robust aggregation strategies that replace FedAvg's naive averaging.
Krum selects the single client update closest to its neighbors in parameter space. Tolerates up to f Byzantine clients when the total client count n exceeds 2f + 2. Works best with 10-100 clients.
Multi-Krum extends Krum by selecting the k closest updates and averaging them. Set k (the `num_selected` parameter) to n - f to keep every client except the suspected Byzantine ones.
FedMedian computes the coordinate-wise median of all client updates. No need to specify `num_byzantine` in advance, and robust as long as fewer than half of the values in each coordinate are corrupt.
FedTrimmedAvg sorts updates along each coordinate, removes the top and bottom `trim_ratio` fraction, and averages the rest. A natural middle ground between FedAvg (no trimming) and FedMedian (maximal trimming).
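The selection and aggregation rules above can be sketched in a few lines of NumPy. This is an illustration only, not Octomil's internal implementation:

```python
import numpy as np

def krum_select(updates, num_byzantine):
    """Krum: return the index of the update closest to its n - f - 2 nearest neighbors."""
    n = len(updates)
    k = n - num_byzantine - 2
    dists = np.array([[np.sum((u - v) ** 2) for v in updates] for u in updates])
    scores = [np.sort(row)[1:k + 1].sum() for row in dists]  # [1:] skips the self-distance
    return int(np.argmin(scores))

def trimmed_mean(updates, trim_ratio):
    """FedTrimmedAvg: drop the top/bottom trim_ratio fraction per coordinate, then average."""
    n = updates.shape[0]
    k = int(n * trim_ratio)
    return np.sort(updates, axis=0)[k:n - k].mean(axis=0)

updates = np.array([
    [0.1, -0.2],
    [0.12, -0.18],
    [0.09, -0.22],
    [100.0, 100.0],   # poisoned update
])

print(krum_select(updates, num_byzantine=1))   # index of an honest update
print(np.median(updates, axis=0))              # FedMedian: coordinate-wise median
print(trimmed_mean(updates, trim_ratio=0.25))  # poisoned row trimmed away
```

All three ignore the poisoned row, whereas a plain mean would be pulled far off the honest cluster.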
### Configuring robust aggregation
```python
from octomil import Federation

federation = Federation(api_key="edg_...", name="my-robust-model")

result = federation.train(
    model="my-robust-model",
    algorithm="krum",  # or "multi_krum", "fedmedian", "fedtrimmedavg"
    rounds=500,
    min_updates=20,
)
```
Configure strategy parameters via the REST API:
```bash
# Krum
curl -X PUT https://api.octomil.com/api/v1/federations/my-robust-model/strategy \
  -H "Authorization: Bearer edg_..." \
  -H "Content-Type: application/json" \
  -d '{"algorithm": "krum", "num_byzantine": 3, "learning_rate": 0.01, "local_epochs": 5}'

# Multi-Krum
curl -X PUT https://api.octomil.com/api/v1/federations/my-robust-model/strategy \
  -H "Authorization: Bearer edg_..." \
  -H "Content-Type: application/json" \
  -d '{"algorithm": "multi_krum", "num_byzantine": 3, "num_selected": 15, "learning_rate": 0.01}'

# FedMedian
curl -X PUT https://api.octomil.com/api/v1/federations/my-robust-model/strategy \
  -H "Authorization: Bearer edg_..." \
  -H "Content-Type: application/json" \
  -d '{"algorithm": "fedmedian", "learning_rate": 0.01, "local_epochs": 5}'

# FedTrimmedAvg
curl -X PUT https://api.octomil.com/api/v1/federations/my-robust-model/strategy \
  -H "Authorization: Bearer edg_..." \
  -H "Content-Type: application/json" \
  -d '{"algorithm": "fedtrimmedavg", "trim_ratio": 0.1, "learning_rate": 0.01, "local_epochs": 5}'
```
```python
import requests

# Krum example (same pattern for multi_krum, fedmedian, fedtrimmedavg)
response = requests.put(
    "https://api.octomil.com/api/v1/federations/my-robust-model/strategy",
    headers={"Authorization": "Bearer edg_..."},
    json={
        "algorithm": "krum",
        "num_byzantine": 3,
        "learning_rate": 0.01,
        "local_epochs": 5,
    },
)
print(response.json())
```
### Combining defenses
Robust aggregation alone is not sufficient for sophisticated attacks. Layer multiple defenses:
**Update clipping** -- Bound the L2 norm of each client's update before aggregation to prevent gradient-scaling attacks:
```json
{"algorithm": "krum", "gradient_clip_norm": 10.0, "num_byzantine": 3}
```
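Conceptually, bounding an update's L2 norm looks like this (illustrative NumPy sketch, not Octomil's implementation):

```python
import numpy as np

def clip_update(update, max_norm):
    """Scale the update down so its L2 norm is at most max_norm."""
    norm = np.linalg.norm(update)
    if norm > max_norm:
        return update * (max_norm / norm)
    return update

scaled_attack = np.array([300.0, 400.0])   # L2 norm 500
clipped = clip_update(scaled_attack, max_norm=10.0)
print(np.linalg.norm(clipped))  # 10.0 -- direction preserved, magnitude bounded
```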
**Anomaly detection** -- Monitor per-client update statistics: cosine similarity between each client's update and the aggregate, update-norm spikes, and consistent exclusion by Krum or trimming.
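A cosine-similarity check of this kind might look like the following sketch (the alert threshold is hypothetical):

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

aggregate = np.array([0.1, -0.2, 0.05])
client_update = np.array([-0.1, 0.2, -0.05])  # exactly sign-flipped

sim = cosine_similarity(client_update, aggregate)
if sim < -0.5:  # hypothetical alert threshold
    print(f"suspicious client: cosine similarity {sim:.2f}")
```

A sign-flipping client scores -1.0 here, far below anything an honest client under normal data drift would produce.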
**Trust scoring** -- Assign trust scores based on historical behavior. Clients consistently selected by Krum receive higher trust; consistently trimmed clients receive lower trust. Use Device Groups to segment by trust tier.
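One simple way to maintain such scores is an exponential moving average over selection outcomes (a hypothetical scheme, not an Octomil API):

```python
def update_trust(trust, selected, decay=0.9):
    """EMA trust update: selected clients drift toward 1, trimmed clients toward 0."""
    return decay * trust + (1 - decay) * (1.0 if selected else 0.0)

trust = 0.5  # neutral starting score for a new client
for selected in [True, True, False, True]:
    trust = update_trust(trust, selected)
print(round(trust, 3))  # 0.582 -- mostly selected, so trust drifts upward
```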
### Choosing a strategy
| Scenario | Strategy | Rationale |
|---|---|---|
| Unknown threat, moderate client count | Krum | Strongest guarantee, minimal tuning |
| Robustness with less information loss | Multi-Krum | Averages top-k, better convergence |
| High client count, unknown attacker count | FedMedian | No `num_byzantine` needed, scales well |
| Known attacker fraction, best convergence | FedTrimmedAvg | Retains most information, tunable |
| Low-stakes, mostly trusted fleet | FedAvg + clipping | Clipping alone handles faults |
### Performance impact

| Strategy | Server Compute | Extra Communication | Convergence |
|---|---|---|---|
| FedAvg | Baseline | None | Fastest (no attackers) |
| Krum | Higher (pairwise distances) | None | Slower (uses a single update) |
| Multi-Krum | Higher (pairwise distances) | None | Moderate |
| FedMedian | Moderate | None | Moderate |
| FedTrimmedAvg | Moderate | None | Near FedAvg |
## Specialized Objectives
Some federated applications need objectives beyond cross-entropy -- AUC optimization, fairness-aware training, or tail-risk minimization.
### Why standard losses fail
Cross-entropy optimizes for average-case accuracy, which fails when classes are imbalanced (99.5% benign vs 0.5% fraud) or error costs are asymmetric. In FL, class imbalance is compounded: each client may see an even more extreme skew, and some clients have zero minority-class examples.
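A trivial baseline makes the failure concrete: on a 99.5% benign dataset (illustrative numbers), always predicting "benign" scores 99.5% accuracy while detecting zero fraud.

```python
import numpy as np

# 1000 examples, 0.5% fraud
labels = np.zeros(1000)
labels[:5] = 1  # 5 fraud cases

always_benign = np.zeros(1000)  # predict class 0 everywhere
accuracy = float((always_benign == labels).mean())
print(accuracy)  # 0.995 -- looks great, catches no fraud (AUC would be 0.5)
```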
### AUC optimization
AUC measures ranking ability independent of classification threshold. Octomil supports pairwise surrogate loss and compositional AUC optimization:
```bash
curl -X PUT https://api.octomil.com/api/v1/federations/auc-model/strategy \
  -H "Authorization: Bearer edg_..." \
  -H "Content-Type: application/json" \
  -d '{
    "algorithm": "fedavg",
    "objective": "auc_surrogate",
    "objective_config": {"margin": 1.0, "pos_weight": 10.0}
  }'
```
```python
import requests

response = requests.put(
    "https://api.octomil.com/api/v1/federations/auc-model/strategy",
    headers={"Authorization": "Bearer edg_..."},
    json={
        "algorithm": "fedavg",
        "objective": "auc_surrogate",
        "objective_config": {"margin": 1.0, "pos_weight": 10.0},
    },
)
```
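Conceptually, a pairwise AUC surrogate penalizes every positive-negative pair whose score gap falls below the configured `margin`. A minimal sketch (illustrative, not Octomil's implementation):

```python
import numpy as np

def pairwise_auc_hinge(pos_scores, neg_scores, margin=1.0):
    """Penalize every (positive, negative) pair where the positive
    does not outscore the negative by at least `margin`."""
    diffs = pos_scores[:, None] - neg_scores[None, :]  # all pairwise score gaps
    return float(np.maximum(0.0, margin - diffs).mean())

pos = np.array([2.0, 1.5])   # model scores for positive examples
neg = np.array([0.1, 0.4])   # model scores for negative examples
print(pairwise_auc_hinge(pos, neg))  # 0.0 -- every pair separated by >= margin
```

Driving this loss to zero means every positive outranks every negative, i.e. AUC = 1, regardless of how rare the positive class is.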
### Fairness-aware objectives
Minimax fairness minimizes the worst-case loss across clients:
```json
{"objective": "minimax", "objective_config": {"lambda": 0.5, "ema_decay": 0.9}}
```
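One way to realize minimax weighting is to interpolate between uniform averaging and putting all weight on the worst-off client, controlled by `lambda` (illustrative sketch; Octomil's exact scheme may differ):

```python
import numpy as np

def minimax_weights(client_losses, lam=0.5):
    """Interpolate between uniform weights (lam=0) and all weight
    on the worst-off client (lam=1)."""
    n = len(client_losses)
    uniform = np.full(n, 1.0 / n)
    worst = np.zeros(n)
    worst[np.argmax(client_losses)] = 1.0
    return (1 - lam) * uniform + lam * worst

losses = np.array([0.3, 0.9, 0.4])  # client 1 is worst off
print(minimax_weights(losses))      # client 1 gets the largest aggregation weight
```

In practice the per-client losses would be smoothed (e.g. with `ema_decay`) so a single noisy round does not dominate the weighting.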
Per-group accuracy constraints define minimum accuracy targets per device group:
```json
{
  "algorithm": "fedavg",
  "fairness_constraints": [
    {"device_group": "region-eu", "min_accuracy": 0.85},
    {"device_group": "region-apac", "min_accuracy": 0.85}
  ]
}
```
### Class-weighted aggregation
Clients with more minority-class examples receive higher aggregation weight:
```json
{
  "client_weighting": "class_balanced",
  "weighting_config": {
    "target_distribution": {"class_0": 0.5, "class_1": 0.5},
    "smoothing": 0.1
  }
}
```
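Conceptually, class-balanced weighting upweights clients that hold more of the rare class, with `smoothing` keeping zero-minority clients from being dropped entirely (illustrative sketch, not Octomil's formula):

```python
import numpy as np

def class_balanced_weights(minority_counts, smoothing=0.1):
    """Weight clients by their minority-class example counts, with
    smoothing so zero-minority clients keep a small nonzero weight."""
    counts = np.asarray(minority_counts, dtype=float)
    weights = counts + smoothing * counts.mean()
    return weights / weights.sum()

# Client 2 holds most of the rare class and gets most of the weight
print(class_balanced_weights([0, 2, 18]))
```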
### Tail-risk objectives
For safety-critical applications (medical imaging, autonomous systems, financial models), minimize worst-case loss instead of average loss:
- CVaR (Conditional Value at Risk): optimizes the average loss on the worst `alpha` fraction of examples
- DRO (Distributionally Robust Optimization): finds a model robust to distribution shift
```json
{"objective": "cvar", "objective_config": {"alpha": 0.1, "dual_step_size": 0.01}}
```
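CVaR at level `alpha` is just the mean loss over the worst `alpha` fraction of examples, which a short sketch makes concrete (illustrative only):

```python
import numpy as np

def cvar_loss(losses, alpha=0.1):
    """Average loss over the worst alpha fraction of examples."""
    losses = np.sort(np.asarray(losses, dtype=float))[::-1]  # worst first
    k = max(1, int(np.ceil(alpha * len(losses))))
    return float(losses[:k].mean())

losses = [0.1] * 9 + [5.0]               # one catastrophic example
print(cvar_loss(losses, alpha=0.1))      # 5.0 -- the tail dominates (mean is only 0.59)
```

Minimizing CVaR pushes the model to fix its worst failures rather than its average-case behavior.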
### Choosing the right objective
| Application | Objective | Rationale |
|---|---|---|
| Balanced classification | Cross-entropy (default) | Standard, converges reliably |
| Rare-event detection | AUC surrogate + class weighting | Threshold-independent, handles imbalance |
| Fairness-critical | Minimax + per-group constraints | No subpopulation underserved |
| Safety-critical | CVaR or DRO | Minimizes worst-case failures |
| Multi-domain deployment | Ditto personalization | Per-client adaptation with shared knowledge |
## Best Practices
- Set `num_byzantine` conservatively. Overestimating is safer than underestimating: if you expect 2 bad clients, set it to 4-5.
- Use FedTrimmedAvg as a production default for robust aggregation. Start with `trim_ratio=0.1`.
- Stage strategy changes via rollouts. Use Model Rollouts to canary robust aggregation on a subset of devices.
- Start with class weighting before switching objectives. Only move to AUC surrogate if class weighting is insufficient.
- Track per-class and per-group metrics separately. Overall accuracy hides fairness problems.
- Combine specialized objectives with robust aggregation. Imbalanced datasets are more susceptible to poisoning.
- Require minimum client counts. Ensure `min_devices_per_round` is at least 3x your `num_byzantine` estimate.