
Federated LLMs: Prompting, Cascading, and Fine-Tuning at Scale

· 11 min read

Large Language Models have changed everything, and federated learning is no exception.

The old FL paradigm: Train a small model (~100M parameters) from scratch across devices.

The new FL paradigm: Adapt a massive pre-trained model (7B-70B parameters) using federated techniques.

But LLMs bring unique challenges to federated learning:

  • Size: 7B parameters = 28 GB at full (fp32) precision, which won't fit on most devices
  • Compute: Full fine-tuning requires massive GPU memory
  • Inference cost: Running LLM inference on-device drains battery
  • Privacy: LLM memorization can leak training data

This post explores cutting-edge techniques for federated LLMs, from Virginia Smith's research group and beyond, showing how to make federated learning work in the foundation model era.