FAQ

General

What is Octomil? Octomil is the control plane for on-device AI. It helps teams deploy models to devices, route traffic between local and cloud paths, roll changes out safely, and monitor fleet behavior in production.

How is Octomil different from Flower? Flower is a research framework for federated learning. Octomil is a production platform for deploying and operating AI across device fleets -- with a dashboard, mobile-first SDKs, staged rollouts, smart routing, A/B testing, and compliance presets.

What model formats are supported? PyTorch, ONNX, CoreML, and TFLite. Octomil handles conversion: PyTorch -> ONNX -> CoreML (iOS) / TFLite (Android). See Supported Models.

Setup & Configuration

How do I get started? Follow the Quickstart. You will have a model running locally or on a device in under 10 minutes.

Do I need a cloud account to use Octomil? No. octomil serve <model> runs inference locally with no account needed. Cloud features (device fleet management, rollouts, routing) require an Octomil account.

How do I configure compliance presets? Run octomil init "Your Org" --compliance hipaa --region us during setup, or apply a preset later via the dashboard. See Compliance.

Deployment & Routing

How do rollouts work? Octomil supports staged rollouts with configurable cohort percentages, automatic health checks, and one-click rollback. You can canary a new model version to 5% of devices before ramping to full fleet.
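A canary ramp like this is commonly implemented as deterministic cohort assignment: hash each device ID into a fixed bucket range and include the device whenever its bucket falls under the current rollout percentage. The sketch below is illustrative only, not Octomil's actual implementation; the function and field names are hypothetical.

```python
import hashlib

def in_cohort(device_id: str, model_version: str, percent: float) -> bool:
    """Deterministically map a device into a rollout cohort.

    Hashing (model_version, device_id) together gives each version an
    independent cohort, so the same devices are not always canaried first.
    """
    digest = hashlib.sha256(f"{model_version}:{device_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # buckets 0..9999
    return bucket < percent * 100  # percent=5.0 admits buckets 0..499

# Ramping 5% -> 100% only adds devices: a device in the 5% cohort
# stays in every larger cohort, so the canary set is stable.
canary = [d for d in ("dev-001", "dev-002", "dev-003") if in_cohort(d, "v2", 5.0)]
```

Because assignment is a pure function of the device ID and version, no server-side cohort state is needed and the same device always lands in the same cohort for a given version.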

How does smart routing work? The routing engine evaluates device capabilities (memory, accelerator, runtime) and routes inference on-device when hardware supports it, falling back to cloud when it does not. You control the routing policy per model.
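A routing policy of this shape can be sketched as a capability check per request. This is a simplified illustration of the idea, with hypothetical names; it is not Octomil's routing engine.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class DeviceCaps:
    memory_mb: int
    accelerator: str | None   # e.g. "npu", "gpu", or None
    runtimes: frozenset[str]  # e.g. {"coreml", "tflite"}

def route(caps: DeviceCaps, *, min_memory_mb: int, runtime: str,
          require_accelerator: bool) -> str:
    """Return "device" when hardware meets the model's policy, else "cloud"."""
    if caps.memory_mb < min_memory_mb:
        return "cloud"
    if runtime not in caps.runtimes:
        return "cloud"
    if require_accelerator and caps.accelerator is None:
        return "cloud"
    return "device"

phone = DeviceCaps(memory_mb=6144, accelerator="npu",
                   runtimes=frozenset({"tflite"}))
decision = route(phone, min_memory_mb=4096, runtime="tflite",
                 require_accelerator=True)
# decision == "device"; drop the accelerator and the same policy falls back to cloud
```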

Does my data ever leave the device during on-device inference? When inference runs locally, input data stays on-device. Only telemetry (latency, error rates, quality scores) is reported to the control plane for fleet monitoring. When the routing engine falls back to cloud, the request is sent to a hosted endpoint — see smart routing for how fallback is configured.
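The split described above, raw inputs staying local while only aggregate metrics are reported, can be illustrated as follows. The field names are hypothetical, not Octomil's actual telemetry schema.

```python
def telemetry_record(latencies_ms: list[float], errors: int, total: int) -> dict:
    """Build a fleet-monitoring record for one reporting window.

    Only aggregates leave the device; the inputs that produced these
    numbers are never included in the record.
    """
    latencies = sorted(latencies_ms)
    return {
        "p50_latency_ms": latencies[len(latencies) // 2],
        "error_rate": errors / total,
        "requests": total,
    }

record = telemetry_record([12.0, 15.0, 40.0], errors=1, total=3)
# The record contains three numbers and no user content.
```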

Privacy & Security

Is Octomil HIPAA compliant? Octomil's architecture is designed so that PHI never reaches Octomil's servers: when inference runs on-device, patient data stays on the device, and the control plane receives only operational telemetry. BAA execution is available on the Enterprise tier. See Compliance.

What privacy features are available? On-device execution keeps user data local by default. Enterprise tiers add differential privacy, secure aggregation, and audit logging for regulated workloads. See Privacy Guide.

SDKs

Which platforms are supported? Python, iOS (Swift + CoreML), Android (Kotlin + TFLite), and Browser (WebGPU + WASM). See the SDK docs.

Can I use the OpenAI client library with Octomil? Yes. Octomil's local inference server (octomil serve) exposes an OpenAI-compatible API at http://localhost:8080/v1. Use any OpenAI client library with base_url pointed to your local server.
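Because the local server speaks the OpenAI wire protocol, pointing a client at it is just a base-URL change. The sketch below builds an OpenAI-style chat completion request against the local endpoint using only the standard library; the model name is a placeholder for whatever you passed to octomil serve.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8080/v1"  # octomil serve's OpenAI-compatible API

payload = {
    "model": "my-model",  # placeholder: the model you ran with `octomil serve`
    "messages": [{"role": "user", "content": "Hello from the edge"}],
}
req = request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# With the server running:
#   resp = request.urlopen(req)
#   body = json.loads(resp.read())
```

Equivalently, with the official openai Python package, construct the client with base_url="http://localhost:8080/v1" and any placeholder api_key, then call it as you would against the hosted OpenAI API.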

Enterprise: Training & Federated Learning

Does Octomil support federated learning? Yes, as an enterprise add-on. Federated learning enables model improvement across device fleets without centralizing data. Training data stays on-device; only model weight deltas are transmitted.

What aggregation strategies are available? FedAvg, FedProx, FedOpt, FedAdam, Krum, MultiKrum, FedMedian, FedTrimmedAvg, and SCAFFOLD. See Advanced FL Configuration and Advanced FL Strategies.
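FedAvg, the simplest of these strategies, is a sample-count-weighted average of client updates. A pure-Python sketch of the idea (real deployments aggregate model tensors, not flat lists):

```python
def fedavg(updates: list[tuple[int, list[float]]]) -> list[float]:
    """Aggregate (num_samples, weight_delta) pairs into one global delta.

    Each client's delta is weighted by how many samples it trained on,
    so a client with twice the data pulls the average twice as hard.
    """
    total = sum(n for n, _ in updates)
    dim = len(updates[0][1])
    return [
        sum(n * delta[i] for n, delta in updates) / total
        for i in range(dim)
    ]

# Two clients, 100 vs 300 samples, on a 2-parameter model:
global_delta = fedavg([(100, [1.0, 0.0]), (300, [0.0, 1.0])])
# global_delta == [0.25, 0.75] — the larger client dominates
```

The robust variants (Krum, MultiKrum, FedMedian, FedTrimmedAvg) replace this plain weighted mean with selection or trimming steps that limit the influence of malicious or faulty clients.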

How many devices do I need for federated training? A minimum of 3 devices per round is recommended. For Byzantine-robust strategies, ensure at least 3x your estimated number of unreliable clients.
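The sizing guidance above can be written out directly; the function name is illustrative:

```python
def min_devices_per_round(expected_unreliable: int = 0) -> int:
    """Minimum recommended cohort size per training round.

    Baseline is 3 devices. With a Byzantine-robust strategy, the
    guidance is at least 3x the estimated number of unreliable clients.
    """
    return max(3, 3 * expected_unreliable)

min_devices_per_round()                        # plain FedAvg: 3
min_devices_per_round(expected_unreliable=4)   # robust strategy, 4 suspect clients: 12
```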

Billing & Plans

Is there a free tier? Yes. The free tier includes local inference with no limits. Cloud features (device fleet management, rollouts, routing) include a generous free allocation.

How do I upgrade to Enterprise? Contact team@octomil.com for Enterprise pricing, federated learning, BAA execution, dedicated support, and custom SLAs.