About

Meridian Labs is a 501(c)(3) non-profit building open source tools for testing, evaluating, and researching frontier AI models. Our goal is to advance AI safety, alignment, and security work by creating a common platform shared by governments, non-profits, academia, and model developers.

Our flagship project is Inspect AI, an evaluation framework built by Meridian’s founding team in collaboration with the UK AISI. Inspect AI is now the standard framework at government organizations including the UK AISI, US CAISI, EU AI Office, Japan AISI, and Korea AISI, and at research organizations including METR, Apollo, Epoch, SecureBio, Redwood, and RAND.

Our Work

Our open source projects include:

  1. Inspect AI, a frontier AI evaluation framework for conducting rigorous, repeatable assessments of AI capabilities and behaviors. The framework supports evaluations ranging from multiple-choice benchmarks to multi-agent tasks with tool use and sandboxing.

  2. Inspect Scout, a transcript analysis framework for AI agents. Scout supports LLM-based and pattern-based scanners for detecting issues like refusals, evaluation awareness, and environment errors. Currently in use at UK AISI, US CAISI, METR, and Apollo.

  3. Inspect Petri, an automated alignment auditing tool that orchestrates multi-turn interactions between auditor and target models. Petri originated at Anthropic and is now developed at Meridian.

  4. Inspect Flow, a configuration and workflow management tool for AI evaluations that enables systematic experimentation and running large-scale evaluation sets for auditing and pre-deployment testing. Currently in use at UK AISI and US CAISI.

Many of our users also work on control and mechanistic interpretability. Inspect AI underpins control eval projects like ControlArena and LinuxArena, and integrates with interpretability libraries like TransformerLens and nnterp.

Our Focus

We build open source tools that make frontier AI evaluation broadly available. Our focus includes:

  • Evolving our projects alongside the AI ecosystem, so teams have an up-to-date foundation for evaluating and understanding models.

  • Providing the infrastructure for AI Safety Institutes, research organizations, and lab safety teams to transfer their work into broadly available open source projects.

  • Building tools for both human researchers and AI agents, as automated workflows for evaluation, monitoring, and alignment research become standard practice.

As AI capabilities advance, the evaluation and research infrastructure surrounding them must keep pace. Meridian exists to make sure it does.