Open-source framework for systematic evaluation of large language models, built by the UK AI Safety Institute with contributions from Meridian Labs.
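As a minimal sketch of what an evaluation looks like in this framework (the single-sample dataset and exact-match scorer here are purely illustrative):

```python
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import exact
from inspect_ai.solver import generate

@task
def hello_world():
    # One-sample dataset: the model is asked to reply with "Hello World"
    return Task(
        dataset=[Sample(input="Just reply with Hello World", target="Hello World")],
        solver=[generate()],  # generate a model completion
        scorer=exact(),       # score by exact match against the target
    )
```

The task can then be run against a model from the command line, e.g. `inspect eval hello_world.py --model openai/gpt-4o`.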
In-depth analysis of AI agent transcripts with high-performance parallel scanning and rich visualization of results.
Workflow orchestration for Inspect AI that enables running evaluations at scale with repeatability and maintainability.
Visual Studio Code extension for productive use of Inspect AI with an integrated log viewer, task browser, and debugging tools.
Makes software engineering agents like Claude Code and Codex CLI available as standard Inspect Agents for evaluation.
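A minimal sketch of how such an agent might be plugged into a task, assuming the package is importable as `inspect_swe` and exposes a `claude_code()` agent (the dataset, scorer, and sandbox choice below are illustrative):

```python
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import includes

# Assumed import: an inspect_swe package exposing a claude_code() agent
from inspect_swe import claude_code

@task
def fix_bug():
    return Task(
        dataset=[Sample(input="Fix the failing test in the repository.")],
        solver=claude_code(),  # the software engineering agent acts as the solver
        scorer=includes(),
        sandbox="docker",      # agents typically run inside a sandboxed environment
    )
```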
Library for creating high-quality, interactive visualizations from Inspect AI evaluation results.