POST /api/v1/route

Computes and returns a routing decision. Given model parameters and device capabilities, the endpoint returns the optimal execution target (device or cloud), format, engine, and quantization. When deployment_id is supplied, the deployment's serving_policy overrides the prefer hint. An active serving_policy experiment, in turn, takes precedence over the deployment policy.
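
The precedence order described above (experiment, then deployment policy, then prefer hint) can be illustrated with a small sketch. This is not the service's implementation: the function name resolve_policy, its parameters, and the string values used are hypothetical, chosen only to show the ordering. It also assumes that a missing policy is represented as None.

```python
from typing import Optional


def resolve_policy(
    prefer: Optional[str],
    deployment_policy: Optional[str] = None,
    experiment_policy: Optional[str] = None,
) -> Optional[str]:
    """Return the value that wins under the documented precedence.

    All names here are illustrative assumptions, not the real API.
    """
    # An active serving_policy experiment outranks the deployment policy.
    if experiment_policy is not None:
        return experiment_policy
    # A deployment's serving_policy overrides the caller's prefer hint.
    if deployment_policy is not None:
        return deployment_policy
    # With no deployment policy in play, the prefer hint stands.
    return prefer


# Hypothetical values, for illustration only.
hint_only = resolve_policy("device")
with_deployment = resolve_policy("device", deployment_policy="cloud")
with_experiment = resolve_policy(
    "device", deployment_policy="cloud", experiment_policy="experiment-arm"
)
```

In this sketch the prefer hint is only a fallback: it decides the outcome when no deployment policy or experiment applies, which matches the word "hint" in the description.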

Request

Responses

Success