What AgentOps does
AgentOps turns Foundry evaluation, safety, and observability signals into a repeatable ship/no-ship workflow. It connects Foundry Evaluations, the ASSERT safety framework, the PyRIT-backed AI Red Teaming agent, Azure Monitor, and your CI/CD platform into one release loop, packaging every result into a stable evidence pack that proves a release is ready for production.
New here? Start with the Prompt Agent tutorial or the HTTP Agent tutorial to learn the sandbox-to-dev PR gate flow end to end.
Evaluate
Read Evaluation to learn how datasets, evaluators, thresholds, and rubrics turn an agent into a pass or fail gate.
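
As a rough illustration of how thresholds turn evaluator scores into a pass or fail gate, here is a minimal sketch. It is not the AgentOps implementation: the `Threshold` class, the `gate` function, and the metric names `groundedness` and `relevance` are all hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Hypothetical threshold: a metric must score at least `minimum`."""
    metric: str
    minimum: float


def gate(scores: dict[str, float], thresholds: list[Threshold]) -> tuple[bool, list[str]]:
    """Return (passed, failures) for aggregated evaluator scores.

    A metric that has a threshold but no score counts as a failure,
    so a missing evaluator cannot silently pass the gate.
    """
    failures: list[str] = []
    for t in thresholds:
        value = scores.get(t.metric)
        if value is None:
            failures.append(f"{t.metric}: missing score")
        elif value < t.minimum:
            failures.append(f"{t.metric}: {value:.2f} < {t.minimum:.2f}")
    return (not failures, failures)


# Illustrative values only; these are not real evaluation results.
example_scores = {"groundedness": 0.91, "relevance": 0.78}
example_thresholds = [Threshold("groundedness", 0.85), Threshold("relevance", 0.80)]
passed, failures = gate(example_scores, example_thresholds)
print("PASS" if passed else "FAIL", failures)
```

In a CI job, a gate like this would typically end with a non-zero exit code when `passed` is false, which is what blocks the pull request.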
Ship
Ship explains the generated PR gate and dev deploy workflows, and how candidate versions become a release.
Reference architecture
Use this as the mental model for the AgentOps loop: build and learn in a sandbox, commit the release contract to source control, promote through environments with evidence, then feed production learning back into the next evaluation set.

| Area | What it owns |
|---|---|
| Sandbox inner loop | Create, evaluate, and improve the candidate agent in a safe Foundry project before it is promoted. |
| AgentOps Accelerator | Keep release readiness close to the repo: config, datasets, evaluation gates, Doctor diagnostics, Cockpit views, CI workflows, thresholds, and release evidence. |
| Foundry | Own managed agent projects, Prompt Agent and HTTP Agent runtime options, traces, operate views, guardrails, and evaluations where applicable. |
| Outer loop delivery | Move the same reviewed candidate through dev, QA or staging, and production. Production release should be gated by reviewable evidence, not memory or a manual spot check. |
| Operate and improve | Watch telemetry, dashboards, alerts, cost, success rate, compliance, quota, security posture, and data governance. Turn production traces into the next regression cases. |
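
The last row, turning production traces into the next regression cases, can be sketched as follows. This is an assumption-laden illustration, not AgentOps or Foundry code: the trace fields (`trace_id`, `input`, `output`, `score`), the `min_score` cutoff, and the JSONL output shape are all hypothetical.

```python
import json


def traces_to_regression_cases(traces: list[dict], min_score: float = 0.5) -> list[dict]:
    """Keep low-scoring traces as regression cases, one per distinct input.

    Assumes each trace is a dict with hypothetical keys:
    trace_id, input, output, score.
    """
    cases: list[dict] = []
    seen_inputs: set[str] = set()
    for trace in traces:
        prompt = trace["input"].strip()
        # Skip traces that scored well, and inputs already captured.
        if trace["score"] >= min_score or prompt in seen_inputs:
            continue
        seen_inputs.add(prompt)
        cases.append(
            {
                "input": prompt,
                "observed_output": trace["output"],
                "source_trace_id": trace["trace_id"],
            }
        )
    return cases


def to_jsonl(cases: list[dict]) -> str:
    """Serialize cases as JSON Lines with sorted keys for stable diffs."""
    return "\n".join(json.dumps(case, sort_keys=True) for case in cases)


# Illustrative traces only; not real production data.
sample_traces = [
    {"trace_id": "t1", "input": "Reset my password", "output": "I cannot help.", "score": 0.2},
    {"trace_id": "t2", "input": "What is my quota?", "output": "Your quota is 10.", "score": 0.9},
    {"trace_id": "t3", "input": "Reset my password ", "output": "Try later.", "score": 0.1},
]
regression_cases = traces_to_regression_cases(sample_traces)
print(to_jsonl(regression_cases))
```

Sorting keys keeps the serialized dataset stable across runs, so a reviewer sees only genuinely new cases when the file changes in source control.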
Contributing
Contributions are welcome. See the project repository for guidelines, issues, and the contribution process.
