Deployment intelligence platform that tracks every code deployment from commit to production. ML models baseline key health indicators and detect anomalies correlated with specific deployments, providing a Deploy Rating (A-F). Calculates DORA metrics with deployment-level granularity and correlates deployments with PagerDuty incidents for precise rollback targeting.
| Tier | Price | Includes |
|---|---|---|
| Free | Free | Up to 200 deployments/month, basic metrics, 1 team |
| Starter | Paid | Full DORA metrics and AI skills |
| Enterprise | Contact sales | — |
Sleuth grades each deployment A through F based on observability anomalies detected after it ships.
Sleuth tracks every code deployment from commit to production, baselines health metrics from Datadog, Grafana, or New Relic, and detects anomalies correlated with specific deployments. It rolls those into a Deploy Rating from A to F, computes DORA metrics with deployment-level granularity, and ties PagerDuty incidents back to the exact deployment that caused them for precise rollback targeting.
Who it's for. Platform and engineering teams of 10 to 50 engineers who deploy frequently and want a quick read on whether each ship helped or hurt. Scenario: a deploy goes out at 2pm; error rate climbs 15 percent inside the deployment window; Sleuth assigns a D rating and surfaces the specific metric shift in the deploy feed; the team rolls back within 10 minutes.
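The scenario above boils down to comparing a post-deploy metric window against its baseline and mapping the shift to a grade. The sketch below shows that shape; the thresholds and the `rate_deploy` function are invented for illustration and are not Sleuth's actual rubric.

```python
def rate_deploy(baseline_error_rate: float, window_error_rate: float) -> str:
    """Map the relative error-rate shift in a deploy window to a letter grade.

    Illustrative thresholds only; Sleuth's grading rubric is not public here.
    """
    if baseline_error_rate <= 0:
        # No baseline signal: any errors in the window are anomalous.
        return "A" if window_error_rate == 0 else "F"
    shift = (window_error_rate - baseline_error_rate) / baseline_error_rate
    if shift < 0.02:
        return "A"
    if shift < 0.05:
        return "B"
    if shift < 0.10:
        return "C"
    if shift < 0.25:
        return "D"
    return "F"
```

Under these example thresholds, the 2pm deploy's 15 percent error-rate climb lands in the D band, matching the scenario.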
Tradeoffs. Quality depends entirely on the upstream observability stack; Sleuth reads metrics, it does not produce them. The A-F rubric is a deliberate simplification that some teams find useful and others find reductive. Beyond the free tier, pricing is per deployment target. Compared to Harness, Sleuth is observability-driven rather than pipeline-integrated.
Compare: Harness CD, OpsMx ISD, LaunchDarkly, Datadog Bits AI