Training Rounds API

Endpoints for starting training rounds, checking status, and submitting device updates.

Retrieve current round

GET /api/v1/training/rounds/current

Returns the most recent training round regardless of status. Returns 404 if no rounds have been created yet.

Response

{
  "id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "round_number": 18,
  "status": "active",
  "target_devices": 10,
  "min_devices": 8,
  "participated_devices": 6,
  "completed_devices": 4,
  "failed_devices": 1,
  "global_model_path": "s3://octomil-models/global/round_17.pt",
  "aggregated_model_path": null,
  "started_at": "2025-07-01T12:15:00Z",
  "completed_at": null,
  "timeout_at": "2025-07-01T12:20:00Z",
  "created_at": "2025-07-01T12:15:00Z"
}
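
As a rough illustration, this call could be wrapped in Python using only the standard library. The helper names are hypothetical, the host is taken from the curl example in this reference, and the sketch assumes this endpoint accepts the same Bearer token as the POST endpoints, which the reference does not state explicitly.

```python
import json
import urllib.error
import urllib.request

# Host taken from the curl example in this reference.
BASE_URL = "https://api.octomil.com"


def build_current_round_request(api_key: str) -> urllib.request.Request:
    """Build the GET request for the most recent training round."""
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/training/rounds/current",
        # Assumption: Bearer auth applies to this endpoint as well.
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_current_round(api_key: str, timeout: float = 10.0):
    """Return the round as a dict, or None when the server answers 404."""
    request = build_current_round_request(api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # no rounds have been created yet
        raise  # any other HTTP error is left to the caller
```

Returning `None` for 404 keeps the "no rounds yet" case separate from real failures, which still surface as exceptions.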

Retrieve round status

GET /api/v1/training/rounds/{round_id}/status

Returns the full status of a training round including participant progress and timing.

Response Fields

| Field | Type | Description |
|---|---|---|
| id | string (uuid) | Round UUID |
| round_number | integer | Sequential counter |
| status | string | pending, active, aggregating, completed, or failed |
| target_devices | integer | Devices selected for this round |
| min_devices | integer | Minimum updates required |
| participated_devices | integer | Devices that have begun local training |
| completed_devices | integer | Devices that have submitted updates |
| failed_devices | integer | Devices that failed during training |
| global_model_path | string \| null | S3 path to the distributed global model |
| aggregated_model_path | string \| null | S3 path to the aggregated result (null until completed) |
| started_at | string \| null | When the round became active |
| completed_at | string \| null | When aggregation finished |
| timeout_at | string \| null | When the round will auto-fail |
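
A client polling this endpoint often wants a few derived values rather than the raw counters. The following is a minimal sketch, not part of the API: `parse_timestamp` and `summarize_round` are hypothetical helpers, timestamps are assumed to be ISO 8601 in UTC with a trailing `Z` (as in the sample response), and treating `completed` and `failed` as the only terminal statuses is an inference from the status list.

```python
from datetime import datetime, timezone

# Assumption: these two statuses mean the round will not change any further.
TERMINAL_STATUSES = {"completed", "failed"}


def parse_timestamp(value):
    """Parse a timestamp such as "2025-07-01T12:20:00Z"; pass None through."""
    if value is None:
        return None
    # Swap the trailing "Z" so datetime.fromisoformat accepts it before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_round(status: dict, now: datetime) -> dict:
    """Derive convenience values from a round status payload."""
    timeout_at = parse_timestamp(status.get("timeout_at"))
    seconds_left = None
    if timeout_at is not None:
        seconds_left = (timeout_at - now).total_seconds()
    return {
        # Updates still missing before the min_devices threshold is met.
        "updates_needed": max(0, status["min_devices"] - status["completed_devices"]),
        # Negative once timeout_at has passed; None when no timeout is set.
        "seconds_until_timeout": seconds_left,
        "is_terminal": status["status"] in TERMINAL_STATUSES,
    }
```

A polling loop could call `summarize_round(payload, datetime.now(timezone.utc))` and stop once `is_terminal` is true.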

Start a round

POST /api/v1/training/rounds/start

Initializes a new federated learning training round. The server selects devices, sets the round to active, and distributes the current global model. Returns 409 Conflict if a round is already in progress.

Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| target_devices | integer | Yes | Number of devices to select |
| min_devices | integer | Yes | Minimum updates required before aggregation |
| timeout_seconds | integer | No | Seconds before timeout. Default: 300. Range: 60-3600 |
| global_model_id | string (uuid) | No | Model to distribute. Default: latest aggregated |

Request

curl -X POST https://api.octomil.com/api/v1/training/rounds/start \
  -H "Authorization: Bearer $OCTOMIL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"target_devices": 10, "min_devices": 8, "timeout_seconds": 300}'
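
The same request can be sketched in Python with the standard library. `build_start_round_request` is a hypothetical helper. It checks the documented `timeout_seconds` range on the client before sending, and it leaves `global_model_id` out of the body when not provided so that the server default applies.

```python
import json
import urllib.request

# Host taken from the curl example in this reference.
BASE_URL = "https://api.octomil.com"


def build_start_round_request(api_key, target_devices, min_devices,
                              timeout_seconds=300, global_model_id=None):
    """Build the POST request that starts a new training round."""
    # Documented range for timeout_seconds is 60-3600.
    if not 60 <= timeout_seconds <= 3600:
        raise ValueError("timeout_seconds must be between 60 and 3600")
    payload = {
        "target_devices": target_devices,
        "min_devices": min_devices,
        "timeout_seconds": timeout_seconds,
    }
    if global_model_id is not None:
        payload["global_model_id"] = global_model_id
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/training/rounds/start",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. A 409 response surfaces as `urllib.error.HTTPError` with `code == 409`.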

Submit an update

POST /api/v1/training/rounds/{round_id}/updates

Uploads locally trained model weights for the given round. Returns 409 Conflict if the round is no longer accepting updates or if this device has already submitted an update.

Body (multipart/form-data)

| Field | Type | Required | Description |
|---|---|---|---|
| file | binary | Yes | Serialized model weights file |
| num_samples | integer | Yes | Number of local training samples used |
| local_loss | number | No | Final local training loss |
| local_accuracy | number | No | Final local accuracy (0.0-1.0) |
| training_time_seconds | integer | No | Wall-clock training duration in seconds |

Response (201 Created)

{
  "id": "c3d4e5f6-a7b8-9012-cdef-345678901234",
  "round_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "device_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "model_path": "s3://octomil-models/updates/round_18/device_a1b2c3d4.pt",
  "model_size_bytes": 4521984,
  "num_samples": 5000,
  "local_loss": 0.3421,
  "local_accuracy": 0.8915,
  "training_time_seconds": 47,
  "created_at": "2025-07-01T12:18:30Z"
}
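
One way to assemble the multipart/form-data body without third-party libraries is sketched below. The helper names, the `weights.bin` filename, and the `application/octet-stream` part type are assumptions made for illustration. The reference lists no device identifier among the form fields, so the sketch sends none and assumes the server derives the device from the credentials.

```python
import urllib.request
import uuid

# Host taken from the curl example in this reference.
BASE_URL = "https://api.octomil.com"


def encode_multipart(fields, filename, file_bytes, boundary=None):
    """Encode text fields plus one binary "file" part as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing delimiter
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_update_request(api_key, round_id, weights, num_samples,
                         local_loss=None, local_accuracy=None,
                         training_time_seconds=None, boundary=None):
    """Build the POST request that submits one device update."""
    optional = {
        "local_loss": local_loss,
        "local_accuracy": local_accuracy,
        "training_time_seconds": training_time_seconds,
    }
    fields = {"num_samples": num_samples}
    # Optional metrics are sent only when the caller supplies them.
    fields.update({k: v for k, v in optional.items() if v is not None})
    # "weights.bin" is a placeholder filename, not one the API requires.
    body, content_type = encode_multipart(fields, "weights.bin", weights, boundary)
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/training/rounds/{round_id}/updates",
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )
```

The `boundary` argument exists mainly to make the output deterministic in tests. In normal use it can be left unset so that a random one is generated.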

Errors

| Status | Error | Description |
|---|---|---|
| 400 | bad_request | Invalid or missing request fields |
| 401 | unauthorized | Missing or invalid API key |
| 404 | not_found | Resource does not exist |
| 409 | conflict | Resource already exists or state conflict |
| 429 | rate_limited | Too many requests; check Retry-After |
| 500 | internal_error | Unexpected server error |
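
A client might turn these status codes into a retry decision along the following lines. The policy is an assumption rather than documented behavior: 429 waits for the number of seconds in `Retry-After` when that header is numeric, 500 is retried with exponential backoff, and the other listed errors are treated as non-retryable. `retry_delay` and its parameters are hypothetical.

```python
def retry_delay(status, retry_after=None, attempt=0,
                base_delay=1.0, max_delay=60.0):
    """Return seconds to wait before retrying, or None if a retry is pointless."""
    # Exponential backoff capped at max_delay: 1s, 2s, 4s, ...
    backoff = min(max_delay, base_delay * 2 ** attempt)
    if status == 429:
        try:
            # Retry-After given as a number of seconds.
            return max(0.0, float(retry_after))
        except (TypeError, ValueError):
            # Header missing or given as an HTTP-date: fall back to backoff.
            return backoff
    if status == 500:
        # Assumption: unexpected server errors may be transient.
        return backoff
    # 400, 401, 404 and 409 need a changed request or state, not a retry.
    return None
```

A caller would sleep for the returned number of seconds and resend, stopping when the function returns `None` or after a fixed number of attempts.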