Octomil API
Welcome to the Octomil API. Use the switcher below to choose the right surface for your use case.
- Local Inference (OpenAI-compatible)
- Control Plane (Models/Devices/Rollouts)
- Base URL:
http://localhost:8080/v1 - Use for: prompt/response inference on local or edge runtimes
- Primary endpoint: Inference
Start locally:
octomil serve phi-4-mini
- Base URL:
https://api.octomil.com/api/v1 - Use for: model lifecycle, deployments, device management, rollouts, federated ops
- Start here: API Reference
Authentication:
curl https://api.octomil.com/api/v1/status \
-H "Authorization: Bearer sk_live_your_api_key_here"
Get Started
If you're just getting started, follow the quickstart.
Libraries
Octomil local inference is compatible with OpenAI client libraries.
- Python
- JavaScript
pip install openai
import openai
client = openai.OpenAI(
base_url="http://localhost:8080/v1",
api_key="not-needed",
)
npm install openai
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "http://localhost:8080/v1",
apiKey: "not-needed",
});
Control Plane Details
Authentication
All control-plane endpoints (except device registration) require Authorization: Bearer <token>.
Store your API key securely
Device API keys are returned once during registration and cannot be retrieved later.
Rate Limiting
Authenticated control-plane endpoints are rate-limited to 100 requests per minute per device.
Error Shape
{
"error": "bad_request",
"message": "Validation failed: 'device_id' is required.",
"status_code": 400
}
| Code | Error | Description |
|---|---|---|
400 | bad_request | Invalid or missing request fields |
401 | unauthorized | Missing or invalid API key |
404 | not_found | Resource does not exist |
409 | conflict | Resource already exists or state conflict |
429 | rate_limited | Too many requests; check Retry-After |
500 | internal_error | Unexpected server error |