OpenAI-Compatible Integrations

Octomil's /v1 surface is built for teams that already have application code and do not want to rewrite their orchestration layer.

The usual migration path is simple:

  1. Keep your existing client library.
  2. Point it at Octomil's /v1 base URL.
  3. Keep using the same chat and embeddings request shapes.

If you are starting from OpenAI client code, begin with the OpenAI migration guide.
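To make steps 2 and 3 concrete, the request you send is the standard OpenAI chat-completions shape, just aimed at Octomil's /v1 base URL. A stdlib-only sketch, using the local-development defaults from later on this page (the prompt text is a placeholder, and the request is deliberately not sent so the sketch runs without a live server):

```python
# Minimal sketch: the unchanged chat-completions request shape, pointed at
# an Octomil /v1 base URL. Stdlib only; nothing is sent over the network.
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # local octomil serve default
payload = {
    "model": "gemma3-1b",              # field is "model", not "model_id"
    "messages": [
        {"role": "user", "content": "Give me a one-sentence Octomil pitch."}
    ],
}
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer not-needed",  # local serve ignores the key
    },
)
# urllib.request.urlopen(req) would dispatch it; left out so the sketch
# runs offline.
print(req.full_url)
```

Any OpenAI-compatible client library builds exactly this request for you once its base URL is overridden.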

What LangChain can do

LangChain is useful when you want orchestration above the model endpoint:

  • prompt composition and reusable chains
  • tool calling and agent loops
  • retrieval and RAG pipelines
  • vector store integrations such as Chroma
  • application-level state and workflow composition

Octomil fits underneath that layer. LangChain handles orchestration; Octomil handles routing, deployment, rollback, and observability across edge and cloud.

Reference integrations

These examples are published in the Octomil repo and map directly to the current /v1 surface.

1. LangChain agent

Code: examples/integrations/langchain_agent.py

What it shows:

  • ChatOpenAI pointed at Octomil with base_url
  • tool-calling agent behavior
  • no Octomil-specific LangChain adapter required

Run:

python3 -m pip install -r examples/integrations/requirements.txt
python3 examples/integrations/langchain_agent.py \
"Estimate the savings for 250000 monthly requests at 0.45 dollars per 1k calls."

2. LangChain RAG + Chroma

Code: examples/integrations/rag_chroma.py

What it shows:

  • OpenAIEmbeddings pointed at Octomil's /v1/embeddings
  • local Chroma vector store setup
  • retrieval plus chat generation on the same Octomil endpoint

Run:

python3 examples/integrations/rag_chroma.py \
"Why would a team keep Octomil in front of its existing AI stack?"
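Under the hood, the embeddings half of this example issues a plain POST to /v1/embeddings. A stdlib-only sketch of that request shape, using the embedding model from the environment setup below (the input string is a placeholder; the request is not actually sent):

```python
# Minimal sketch: the request shape an OpenAI-compatible embeddings client
# sends to Octomil's POST /v1/embeddings. Stdlib only; built but not sent.
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"
payload = {
    "model": "text-embedding-3-small",  # "model", not "model_id"
    "input": [
        "Why would a team keep Octomil in front of its existing AI stack?"
    ],
}
req = urllib.request.Request(
    f"{BASE_URL}/embeddings",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(req.full_url)
```

Chroma never talks to Octomil directly; it only stores the vectors that come back from this call.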

3. Vercel AI SDK

Code: examples/integrations/vercel_ai_sdk_chat.ts

What it shows:

  • createOpenAI with a custom baseURL
  • app-facing text generation without changing the surrounding SDK usage pattern

Run:

pnpm add ai @ai-sdk/openai tsx
pnpm exec tsx examples/integrations/vercel_ai_sdk_chat.ts \
"Give me a one-sentence Octomil pitch."

Base URL and environment

Local development (via octomil serve; no API key needed):

octomil serve gemma3-1b
export OCTOMIL_BASE_URL="http://localhost:8080/v1"
export OCTOMIL_API_KEY="not-needed"
export OCTOMIL_MODEL="gemma3-1b"
export OCTOMIL_EMBED_MODEL="text-embedding-3-small"

Hosted Octomil API:

export OCTOMIL_BASE_URL="https://api.octomil.com/v1"
export OCTOMIL_SERVER_KEY="YOUR_SERVER_KEY"
export OCTOMIL_API_KEY="$OCTOMIL_SERVER_KEY"
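Application code can then stay identical across both environments by reading these variables at startup. A minimal sketch; octomil_settings is an illustrative helper name, not part of any Octomil SDK, and the fallbacks are the local-development defaults above:

```python
# Sketch: resolve Octomil connection settings from the environment,
# falling back to the local-development defaults shown above.
import os


def octomil_settings() -> dict:
    """Illustrative helper; not part of any Octomil SDK."""
    return {
        "base_url": os.environ.get("OCTOMIL_BASE_URL", "http://localhost:8080/v1"),
        "api_key": os.environ.get("OCTOMIL_API_KEY", "not-needed"),
        "model": os.environ.get("OCTOMIL_MODEL", "gemma3-1b"),
    }


settings = octomil_settings()
print(settings["base_url"])
```

Swapping from local to hosted is then purely an environment change; no client code edits are required.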

Notes

  • Embeddings use POST /v1/embeddings.
  • The request field is model, not model_id.
  • If your org uses app-scoped defaults, pass X-Octomil-App-Id from your application code.
  • For more detailed request examples, see Embeddings and Responses.
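For the app-scoped case, the header simply rides along with an otherwise unchanged request. A stdlib-only sketch; the app id value is a placeholder, and urllib normalizes stored header keys to capitalized form:

```python
# Sketch: attaching X-Octomil-App-Id to a standard chat-completions request.
# Stdlib only; the request is built but not sent. "your-app-id" is a placeholder.
import json
import urllib.request

payload = {
    "model": "gemma3-1b",
    "messages": [{"role": "user", "content": "ping"}],
}
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "X-Octomil-App-Id": "your-app-id",  # placeholder app id
    },
)
# urllib stores header keys capitalized, hence the lookup spelling here.
print(req.get_header("X-octomil-app-id"))
```

Most OpenAI-compatible clients expose an equivalent hook for default headers, so the same header can be set once at client construction.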