eval
Retrieve a previously run quality evaluation by eval_id. Returns a summary that includes the overall_score and a pass/fail verdict.
List all past quality evaluation runs for a given model, newest first.
Run a quality evaluation comparing on-device and cloud inference outputs. The evaluation is persisted and can later be retrieved by its eval_id via eval.get.
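
To illustrate how the three operations relate, here is a minimal in-memory sketch. It is not the real API: the class `InMemoryEvalStore`, the method signatures, the exact-match scoring rule, the 0.8 pass threshold, and the `eval-N` id format are all assumptions made for illustration. Only the names `eval_id` and `overall_score`, the pass/fail verdict, and the newest-first ordering come from the descriptions above.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class EvalSummary:
    eval_id: str
    model: str
    overall_score: float
    passed: bool


class InMemoryEvalStore:
    """Hypothetical stand-in for the eval service, for illustration only."""

    def __init__(self, pass_threshold=0.8):
        self._runs = []                 # persisted summaries, oldest first
        self._ids = count(1)            # assumed id scheme: eval-1, eval-2, ...
        self._pass_threshold = pass_threshold

    def run(self, model, device_outputs, cloud_outputs):
        """Compare on-device and cloud outputs, persist and return the summary."""
        if not device_outputs or len(device_outputs) != len(cloud_outputs):
            raise ValueError("outputs must be non-empty and of equal length")
        # Assumed scoring rule: fraction of outputs that match exactly.
        matches = sum(d == c for d, c in zip(device_outputs, cloud_outputs))
        score = matches / len(device_outputs)
        summary = EvalSummary(
            eval_id=f"eval-{next(self._ids)}",
            model=model,
            overall_score=score,
            passed=score >= self._pass_threshold,
        )
        self._runs.append(summary)
        return summary

    def get(self, eval_id):
        """Return the persisted summary for eval_id, or raise KeyError."""
        for summary in self._runs:
            if summary.eval_id == eval_id:
                return summary
        raise KeyError(eval_id)

    def list(self, model):
        """Return all summaries for the given model, newest first."""
        return [s for s in reversed(self._runs) if s.model == model]
```

In this sketch, the `eval_id` on the summary returned by `run` is the value a caller would later pass to `get`, and `list` filters by model before returning runs in reverse insertion order.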