training

Aggregate device training updates into a new model version using the specified FL strategy. Loads the base version's PyTorch artifact from storage, runs the chosen strategy (FedAvg, FedProx, etc.), saves PyTorch and ONNX artifacts, and creates a new ModelVersion row. If a completed SecAgg session exists for this model, the secure-aggregated result is used directly instead of loading raw device updates.
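
As a rough illustration of the averaging step, the following is a minimal FedAvg sketch in plain Python. The `fedavg` helper and the dict-of-lists weight layout are assumptions made for illustration; the server works on PyTorch artifacts, and its actual implementation is not shown in this reference.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def fedavg(updates: List[Tuple[Weights, int]]) -> Weights:
    """Sample-weighted average of per-device weights (the FedAvg rule)."""
    if not updates:
        raise ValueError("no updates to aggregate")
    total = sum(num_samples for _, num_samples in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name, reference in updates[0][0].items():
        acc = [0.0] * len(reference)
        for weights, num_samples in updates:
            for i, value in enumerate(weights[name]):
                # Each device contributes in proportion to its sample count.
                acc[i] += value * num_samples / total
        averaged[name] = acc
    return averaged


# Two devices: one with 1 sample, one with 3 samples.
result = fedavg([({"w": [1.0, 2.0]}, 1), ({"w": [3.0, 4.0]}, 3)])
print(result)
```

Devices with more samples pull the average towards their weights, which is why the sample totals reported alongside updates matter for aggregation.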

Cancel a running or pending round. Selected devices stop local training on their next sync; any in-flight aggregation is discarded.

Create a new federated training round. Round starts in 'pending' until devices are selected (training.select_devices) and the round is explicitly started (training.start_round).

List federated training rounds for the caller's organization or federations.

Mark a training job as complete with final metrics and optional adapter artifact reference. Transitions the job to a terminal state on the server.

Create a new on-device training job for a specific device. Called by the device agent when the device accepts a round offer or initiates personalization. Returns the job with its initial state.

Device reports a state change or progress update for an active training job. Used for streaming progress (e.g. epoch completion) and intermediate state transitions during training.

Fetch a training round by id.

Get a device's personalized model state for a specific model + federation pair. Returns both the state record (null if no state exists yet) and a boolean indicating whether a personalized weights artifact is stored.

Create or update a device's personalized model state. Upserts the record keyed on (device_id, model_id, federation_id). Returns HTTP 200 on both create and update.
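
The upsert semantics can be pictured with a small in-memory sketch. `PersonalizedStateStore` and the `metrics` field are hypothetical names used only for illustration; the composite key is the one named above.

```python
from typing import Any, Dict, Tuple

# Composite key from the description: (device_id, model_id, federation_id).
Key = Tuple[str, str, str]


class PersonalizedStateStore:
    """In-memory stand-in for the server-side table (an assumption)."""

    def __init__(self) -> None:
        self.rows: Dict[Key, Dict[str, Any]] = {}

    def upsert(
        self, device_id: str, model_id: str, federation_id: str, fields: Dict[str, Any]
    ) -> Tuple[Dict[str, Any], bool]:
        key = (device_id, model_id, federation_id)
        created = key not in self.rows
        # Insert an empty row on first sight of the key, then merge the fields.
        row = self.rows.setdefault(key, {})
        row.update(fields)
        return row, created


store = PersonalizedStateStore()
_, created_first = store.upsert("dev-1", "model-a", "fed-x", {"metrics": {"loss": 0.9}})
row, created_second = store.upsert("dev-1", "model-a", "fed-x", {"metrics": {"loss": 0.4}})
print(created_first, created_second, row)
```

Because the endpoint returns HTTP 200 in both cases, a caller cannot tell create from update by status code alone; the `created` flag in this sketch is only for demonstration.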

List all client personalization states for a federation. Returns the most recent per-device personalized model metadata including weights storage key, metrics, and the personalization strategy in use.

Get a device's personalization metrics (local accuracy, loss, etc.) for a specific model and federation. Returns only the metrics subset; use training.personalized.get for the full state, including weights_key.

Get a lightweight progress snapshot of a training round. Returns counts of selected and received devices, status, and deadline without the full device list. Prefer this over training.get_round for polling during an active round.
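
A polling loop built on this endpoint might look like the sketch below. The `fetch_status` callable stands in for whatever client call retrieves the snapshot, and the field and status names (`status`, `received`, `selected`, `completed`, `cancelled`, `failed`) are assumptions, not documented values.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical terminal status names; check the actual API for the real ones.
TERMINAL_STATUSES = {"completed", "cancelled", "failed"}


def wait_for_round(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 5.0,
    max_polls: int = 120,
) -> Dict[str, Any]:
    """Poll the lightweight snapshot until the round reaches a terminal status."""
    for _ in range(max_polls):
        snapshot = fetch_status()
        if snapshot.get("status") in TERMINAL_STATUSES:
            return snapshot
        time.sleep(interval_s)
    raise TimeoutError("round did not finish within the polling budget")


# Fake snapshots standing in for successive API responses.
_responses = iter(
    [
        {"status": "running", "selected": 3, "received": 1},
        {"status": "running", "selected": 3, "received": 3},
        {"status": "completed", "selected": 3, "received": 3},
    ]
)
final = wait_for_round(lambda: next(_responses), interval_s=0.0)
print(final)
```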

Select eligible devices for a training round based on optional criteria filters. The round must be in 'pending' state. After device selection, call training.start_round to begin collecting updates.

SecAgg+ Stage 2. Client submits masked and quantized model parameter vectors. Server stores vectors. When all active clients have submitted, the server advances to Stage 3 (unmask). Returns HTTP 202 Accepted.
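
To show why the server can store masked vectors without learning individual updates, here is a simplified sketch of pairwise masking over quantized integers. It is not the full SecAgg+ protocol: the seeds are hard-coded rather than derived from the ECDH key agreement, self-masks are omitted, `random.Random` is a stand-in for a cryptographic PRG, and all function names are hypothetical.

```python
import random
from typing import Dict, List, Tuple

MOD = 2**32  # assumed modulus for quantized parameters


def pairwise_mask(seed: int, length: int) -> List[int]:
    """Expand a shared seed into a mask (non-cryptographic stand-in PRG)."""
    rng = random.Random(seed)
    return [rng.randrange(MOD) for _ in range(length)]


def mask_vector(client_id: int, vector: List[int], pair_seeds: Dict[int, int]) -> List[int]:
    """Add the mask for higher-numbered peers and subtract it for lower-numbered ones."""
    masked = [v % MOD for v in vector]
    for other_id, seed in pair_seeds.items():
        mask = pairwise_mask(seed, len(vector))
        sign = 1 if client_id < other_id else -1
        masked = [(m + sign * k) % MOD for m, k in zip(masked, mask)]
    return masked


vectors = {0: [1, 2, 3], 1: [10, 20, 30], 2: [100, 200, 300]}
seeds: Dict[Tuple[int, int], int] = {(0, 1): 11, (0, 2): 22, (1, 2): 33}


def seeds_for(client_id: int) -> Dict[int, int]:
    """Map each peer of client_id to the seed they share."""
    return {
        (b if a == client_id else a): s
        for (a, b), s in seeds.items()
        if client_id in (a, b)
    }


masked = {cid: mask_vector(cid, vec, seeds_for(cid)) for cid, vec in vectors.items()}
# The pairwise masks cancel in the sum, leaving only the true aggregate.
total = [sum(column) % MOD for column in zip(*masked.values())]
print(total)
```

Each individual masked vector looks random to the server, yet the element-wise sum over all clients equals the sum of the original vectors modulo `MOD`.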

SecAgg+ Stage 0. Client submits ECDH public keys. Server initializes the protocol session on first call and returns the session configuration and the client's neighbour graph. All participating clients must call this before the round can advance to Stage 1 (share-keys).

SecAgg+ Stage 1. Client submits encrypted Shamir share bundles (one per neighbour). Server stores submissions and forwards shares destined for this client. When all active clients have submitted, the server advances the session to Stage 2.

Query the current SecAgg+ protocol stage and participant counts for a training round. Returns live state from the in-memory protocol session when available; falls back to the persisted SecAggSession DB row otherwise.

SecAgg+ Stage 3. Surviving clients submit Shamir shares for dropout recovery. Once the threshold number of shares has been received, the server unmasks and aggregates the weight vectors. Returns 'waiting' until the threshold is reached, then 'completed' or 'failed'.
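
The threshold behaviour rests on Shamir secret sharing: any `t` of `n` shares recover the secret, so the round survives dropouts as long as `t` clients remain. The sketch below is a generic textbook version; the field prime, function names, and parameters are assumptions and do not reflect this service's wire format.

```python
import secrets
from typing import List, Tuple

PRIME = 2**61 - 1  # a Mersenne prime, assumed here as the field modulus

Share = Tuple[int, int]  # (x, y) point on the polynomial


def split(secret: int, n: int, t: int) -> List[Share]:
    """Split secret into n shares such that any t of them reconstruct it."""
    if not 1 <= t <= n:
        raise ValueError("require 1 <= t <= n")
    # Random polynomial of degree t - 1 with the secret as constant term.
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares: List[Share]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * (-xm)) % PRIME
                den = (den * (xj - xm)) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split(123456789, n=5, t=3)
# Two clients drop out; any three surviving shares still recover the secret.
recovered = reconstruct(shares[2:])
print(recovered)
```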

Transition a round from 'pending' or 'selected' to 'running'. Selected devices receive the round assignment on their next devices.sync, then run local training and POST their updates via training.weights.
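
The round lifecycle described across these endpoints can be summarised as a small state machine. The sketch below covers only the transitions this page mentions; the `cancelled` status name is an assumption, whether a 'selected' round can be cancelled is not stated and is left out, and any completion states are omitted.

```python
from typing import Dict, Set

# Transitions taken from the endpoint descriptions; 'cancelled' is an assumed name.
ALLOWED_TRANSITIONS: Dict[str, Set[str]] = {
    "pending": {"selected", "running", "cancelled"},
    "selected": {"running"},
    "running": {"cancelled"},
    "cancelled": set(),
}


class InvalidTransition(Exception):
    """Raised when a round is asked to move to a state it cannot reach."""


def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"cannot move round from {current!r} to {target!r}")
    return target


state = "pending"                        # after training.create_round
state = transition(state, "selected")    # after training.select_devices
state = transition(state, "running")     # after training.start_round
print(state)
```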

Return per-device training update counts and cumulative sample totals for a model version. Groups updates by device and returns the devices with the most recent activity first.
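
The grouping this endpoint performs could be sketched as follows. The record fields `device_id` and `num_samples` are assumptions (only `received_at` appears in this reference), and timestamps are assumed to be ISO 8601 UTC strings in one format so that they compare correctly as text.

```python
from typing import Any, Dict, List


def summarize_by_device(updates: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Group updates by device and order devices by most recent activity."""
    per_device: Dict[str, Dict[str, Any]] = {}
    for update in updates:
        row = per_device.setdefault(
            update["device_id"],
            {
                "device_id": update["device_id"],
                "update_count": 0,
                "total_samples": 0,
                "last_received_at": update["received_at"],
            },
        )
        row["update_count"] += 1
        row["total_samples"] += update["num_samples"]
        row["last_received_at"] = max(row["last_received_at"], update["received_at"])
    return sorted(per_device.values(), key=lambda r: r["last_received_at"], reverse=True)


summary = summarize_by_device(
    [
        {"device_id": "dev-1", "num_samples": 100, "received_at": "2024-01-01T10:00:00Z"},
        {"device_id": "dev-2", "num_samples": 50, "received_at": "2024-01-01T12:00:00Z"},
        {"device_id": "dev-1", "num_samples": 40, "received_at": "2024-01-01T11:00:00Z"},
    ]
)
print(summary)
```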

List the most recent training update records for a model version, sorted by received_at descending. Useful for inspecting recent device submissions before triggering aggregation.

Get aggregate training update counts and sample totals for a specific model version. Returns total update count, total sample count, and the timestamp of the most recent update. Use this before triggering aggregation to check if enough updates exist.

Upload a device's local training update (gradient or full weight delta) for later aggregation. The payload is validated for NaN/Inf before storage. If the model belongs to a federation, the device must be an active federation member.
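
A device can run the same kind of check locally before uploading, to avoid a rejected request. The sketch below assumes the update has been flattened to a list of floats; `find_non_finite` and `validate_update` are hypothetical helper names, not part of this API.

```python
import math
from typing import List, Sequence


def find_non_finite(values: Sequence[float]) -> List[int]:
    """Return the indices of NaN or +/-Inf entries."""
    return [i for i, v in enumerate(values) if not math.isfinite(v)]


def validate_update(values: Sequence[float]) -> None:
    """Raise ValueError if the update contains any NaN or Inf value."""
    bad = find_non_finite(values)
    if bad:
        raise ValueError(f"update has {len(bad)} non-finite value(s), first at index {bad[0]}")


validate_update([0.1, -0.2, 0.3])  # passes silently
bad_indices = find_non_finite([0.1, float("nan"), float("inf"), 0.4])
print(bad_indices)
```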