POST /api/v1/route
Compute and return a routing decision. Given model parameters and device capabilities, the endpoint returns the optimal execution target (device or cloud), format, engine, and quantization. When deployment_id is supplied, the deployment's serving_policy overrides the prefer hint; an active serving_policy experiment, in turn, takes precedence over the deployment policy.
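The precedence rule described above can be sketched as a small function. This is only an illustration of the ordering (experiment, then deployment serving_policy, then the prefer hint), not the service's implementation. The function name, the parameter names, and the string values "device" and "cloud" used as policy values are assumptions made for the example.

```python
from typing import Optional


def resolve_effective_policy(
    prefer: Optional[str],
    deployment_policy: Optional[str] = None,
    experiment_policy: Optional[str] = None,
) -> Optional[str]:
    """Pick the value that governs routing, highest precedence first.

    Hypothetical helper: all three arguments are illustrative stand-ins
    for the prefer hint, the deployment's serving_policy, and the policy
    of an active serving_policy experiment.
    """
    # An active experiment wins over everything else.
    if experiment_policy is not None:
        return experiment_policy
    # Otherwise the deployment's serving_policy overrides the prefer hint.
    if deployment_policy is not None:
        return deployment_policy
    # With no deployment policy, the caller's prefer hint applies.
    return prefer


# No deployment_id supplied: the prefer hint is used as-is.
hint_only = resolve_effective_policy("device")

# Deployment policy present: it overrides the hint.
with_deployment = resolve_effective_policy("device", deployment_policy="cloud")

# Active experiment present: it overrides the deployment policy.
with_experiment = resolve_effective_policy(
    "device", deployment_policy="cloud", experiment_policy="device"
)
```

Modelling each level as an optional value keeps the ordering explicit: a level is skipped only when it is absent, never when it merely disagrees with a lower one.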
Request
Responses
- 200: Success
- default: Error response