Scenarios
Runnable, end-to-end examples that map to real adoption questions.
| Scenario | When to read |
|---|---|
| A simple end-to-end suite | First real example — the bundled quickstart, four cases. |
| Large suites & multi-file agents | Outgrowing one suite file; per-domain organization. |
| Edge cases | Empty output, exceptions, missing baselines, exotic inputs. |
| CI/CD integration | GitHub Actions, GitLab, CircleCI, Buildkite. |
| OpenAI / Anthropic SDK adapters | Skip manual instrumentation when on a supported SDK. |
| Performance & cost budgets | `cost_lt_usd`, `latency_lt_ms`, drift detection. |
| Debugging workflow | A failing case → root cause in five minutes. |
| Failure handling | Exception paths, judge unavailability, baseline corruption. |
Every scenario follows the same five-section shape: problem → input → code → output → explanation. Copy-paste any of them as a starting point.
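The budget names in the table, `cost_lt_usd` and `latency_lt_ms`, read as "less than" thresholds. The sketch below illustrates one plausible reading of them in plain Python. It assumes strict upper bounds on cost in US dollars and latency in milliseconds. The `RunStats` type and both function signatures are hypothetical, written for illustration only, and are not the library's API; see the Performance & cost budgets scenario for the real usage.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Hypothetical per-case measurements; not part of any documented API."""
    cost_usd: float
    latency_ms: float


def cost_lt_usd(stats: RunStats, limit: float) -> bool:
    # Assumed semantics: strict "less than" on total cost in US dollars.
    return stats.cost_usd < limit


def latency_lt_ms(stats: RunStats, limit: float) -> bool:
    # Assumed semantics: strict "less than" on latency in milliseconds.
    return stats.latency_ms < limit


# Example values are made up for the sketch.
stats = RunStats(cost_usd=0.012, latency_ms=850.0)
within_budget = cost_lt_usd(stats, 0.05) and latency_lt_ms(stats, 1000.0)
```

Under this reading, a case whose measurement equals the limit exactly would fail the budget, since the comparison is strict.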