Variance Reduction: The Secret to Fast FL Convergence

10 min read

Why does federated learning take so many communication rounds to converge?

A typical FL training job might require:

  • Standard SGD: 1,000+ rounds to converge
  • With variance reduction: 100-200 rounds to converge
  • Result: 5-10× fewer rounds, and a similar wall-clock speedup when the per-round cost is comparable

Variance reduction is the algorithmic technique that makes this possible. It's the difference between federated learning being a research curiosity and a production-viable technology.

This post dives into variance reduction methods—MARINA, PAGE, SAGA, and their variants—and explains why they're fundamental to efficient federated learning.
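To make the idea concrete before getting to the federated setting, here is a minimal single-machine sketch of a SAGA-style gradient estimator on a toy least-squares problem. It is an illustration under stated assumptions, not the MARINA or PAGE algorithms and not a federated implementation: the function name `run`, the problem sizes, the noise level, and the step size `1 / (3 * L_max)` are all choices made for this example. The only difference between the two modes is the update direction: plain SGD uses the sampled gradient as is, while the SAGA-style estimator corrects it with a stored copy of that sample's last gradient plus the mean of all stored gradients.

```python
import numpy as np


def run(estimator, steps=20000, seed=0):
    """Return the distance to the least-squares solution after `steps` updates.

    estimator: "sgd" for plain stochastic gradients, "saga" for the
    variance-reduced estimator. All sizes and constants are illustrative.
    """
    rng = np.random.default_rng(seed)
    n, d = 50, 5
    A = rng.normal(size=(n, d))
    # Noisy targets, so per-sample gradients do not vanish at the optimum.
    b = A @ rng.normal(size=d) + 0.5 * rng.normal(size=n)
    x_star = np.linalg.lstsq(A, b, rcond=None)[0]

    # Assumed step size: 1 / (3 * L_max), with L_max the largest ||a_i||^2.
    lr = 1.0 / (3.0 * np.max(np.sum(A**2, axis=1)))

    x = np.zeros(d)
    # One stored gradient per sample, initialised at the starting point.
    table = A * (A @ x - b)[:, None]
    table_mean = table.mean(axis=0)

    for _ in range(steps):
        j = rng.integers(n)
        g = A[j] * (A[j] @ x - b[j])  # gradient of 0.5 * (a_j . x - b_j)^2
        if estimator == "saga":
            # Fresh gradient, minus its stale copy, plus the stored average.
            v = g - table[j] + table_mean
            table_mean = table_mean + (g - table[j]) / n
            table[j] = g
        else:
            v = g
        x = x - lr * v

    return np.linalg.norm(x - x_star)


if __name__ == "__main__":
    print("sgd  error:", run("sgd"))
    print("saga error:", run("saga"))
```

With the same constant step size, the plain SGD iterate keeps bouncing around the optimum because its gradient noise never shrinks, while the corrected estimator's variance decays as the stored gradients approach their values at the optimum. That is the mechanism that lets variance-reduced methods reach a target accuracy in fewer steps.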