Meridian Labs Blog

Petri joins Meridian Labs, with a major architecture overhaul focused on hackability. Dish and Bloom extensions provide new capabilities.

A new package adding managed cloud sandbox providers to Inspect, so you can run evals at scale without provisioning your own infrastructure.

New tools for long-horizon agents, including deep agents, timelines, compaction, checkpointing, and bridges for Claude Code and Codex CLI.

A workflow layer for Inspect that makes it easier to run evals at scale with declarative configs, matrix sweeps, and automatic log reuse.
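
To make the idea of a matrix sweep with log reuse concrete, here is a minimal, generic sketch in Python. It is not the workflow layer's actual API: `expand_matrix`, `run_sweep`, the `run_eval` callback, and the in-memory `cache` are all hypothetical names chosen for illustration. The sketch only shows the general technique of expanding a declarative matrix into one configuration per combination and skipping combinations that already have a result.

```python
from itertools import product


def expand_matrix(matrix):
    """Expand {key: [values, ...]} into one config dict per combination."""
    keys = list(matrix)
    value_lists = [matrix[k] for k in keys]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


def run_sweep(matrix, run_eval, cache):
    """Run every combination once, reusing cached results where present.

    `run_eval` is a hypothetical callable taking a config dict.
    `cache` is a plain dict standing in for previously written logs.
    Config values must be hashable so the config can serve as a key.
    """
    results = {}
    for config in expand_matrix(matrix):
        key = tuple(sorted(config.items()))
        if key not in cache:
            # Only run combinations that have no earlier result.
            cache[key] = run_eval(config)
        results[key] = cache[key]
    return results
```

Calling `run_sweep` a second time with the same `cache` runs nothing new, which is the behavior that "automatic log reuse" suggests, though the real package may key and store results quite differently.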

A tool for in-depth analysis of AI agent transcripts, with LLM-based and pattern-based scanners for detecting issues beyond simple metrics.
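
As a rough illustration of what a pattern-based scanner over a transcript can look like, the sketch below uses Python's standard `re` module. It is an assumption-laden example, not the tool's real interface: `Finding`, `PATTERNS`, and `scan_transcript` are hypothetical names, and the transcript is simplified to a list of message strings.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    message_index: int  # position of the message in the transcript
    label: str          # name of the pattern that matched
    excerpt: str        # the matched text


# Hypothetical patterns; a real scanner would define its own.
PATTERNS = {
    "traceback": re.compile(r"Traceback \(most recent call last\)"),
    "permission_denied": re.compile(r"permission denied", re.IGNORECASE),
}


def scan_transcript(messages, patterns=PATTERNS):
    """Return one Finding per (message, pattern) pair that matches."""
    findings = []
    for index, text in enumerate(messages):
        for label, pattern in patterns.items():
            match = pattern.search(text)
            if match:
                findings.append(Finding(index, label, match.group(0)))
    return findings
```

A pattern scanner like this is cheap and deterministic, which is why it pairs naturally with LLM-based scanners that can catch issues no fixed pattern anticipates.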

A new package that makes 80+ Harbor benchmarks, including SWE-Bench Pro and Terminal-Bench, available to run in Inspect.

A new data visualization framework for Inspect evals, featuring pre-built plots for commonly used views of evaluation data.