
Octomil
Octomil is the fastest way to run AI models on phones, laptops, and browsers. One command for local inference. One SDK to ship to devices. One dashboard to monitor quality.
Quickstart
Run your first model locally or deploy one to a phone.
Download
Download Octomil for macOS, Windows, or Linux.
Cloud
A dashboard for your device fleet, inference metrics, and model versions.
API Reference
View Octomil's API reference.
SDKs
Python SDK
Model registry, rollouts, routing, and fleet observability.
iOS SDK
On-device inference and training with Core ML.
Android SDK
On-device inference and training with TensorFlow Lite (TFLite).
Browser SDK
Run models in the browser with WebGPU and WASM.