Agent frameworks
Turn a brain into a team.
A model alone can think. An agent framework gives it tools, memory, and the ability to delegate — turning a single brain into a whole workforce.
Framework comparison
| Framework | Interaction | License | Multi-model | Agent teams | Free tier |
|---|---|---|---|---|---|
| Claude Code | Interactive CLI + API | Proprietary | Claude only | Subagents + hooks | No |
| Pi | Interactive terminal | MIT | 10+ providers | Via extensions | Own API key |
| Codex | Async cloud | Proprietary | GPT-5 only | Parallel sandboxes | Limited |
| Gemini CLI | Interactive terminal | Apache 2.0 | Gemini only | Limited | 1,000 req/day |
| Google ADK | Build-your-own | Apache 2.0 | Any model | Graph orchestration | Framework free |
SWE-bench performance
[Chart: SWE-bench Verified and Terminal-Bench 2.0 scores, plus system prompt size in tokens (lower = leaner), for each framework.]

SWE-bench and Terminal-Bench scores reflect the underlying model (Claude Opus 4.7, GPT-5.5, Gemini 2.5 Pro). Pi’s score is model-dependent, because it runs on any supported provider. System prompt sizes: Pi under 1,000 tokens, per mariozechner.at (Nov 2025); Claude Code ~10,000 tokens, per the same source. Sources: Anthropic system card · Pi blog post.
Pick the right framework
Agent teams that need compliance, audit trails, and subagent control
Use Claude Code + Agent SDK: first-class subagent support with permissioned tool sets, PreToolUse/PostToolUse hooks for change-control logging, plan mode (read-only), and session branching. Of the frameworks compared here, it is the only one with a programmatic API that mirrors the interactive CLI exactly. Best for regulated, multi-agent workflows where you need full observability.
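To make the hook idea concrete, here is a minimal sketch of pre- and post-tool hooks that write a change-control log and enforce a read-only plan mode. It is plain Python, not the Agent SDK's API: `pre_tool_use`, `post_tool_use`, `run_tool`, `AUDIT_LOG`, and the tool names are all hypothetical and exist only for illustration.

```python
import time
from typing import Any, Callable

# Hypothetical in-memory audit log; a real deployment would use durable storage.
AUDIT_LOG: list[dict[str, Any]] = []

# Assumed tool names, chosen only for this illustration.
READ_ONLY_TOOLS = {"read_file", "search"}


def pre_tool_use(tool: str, args: dict[str, Any], plan_mode: bool) -> bool:
    """Log the request and decide whether the call may proceed."""
    allowed = (not plan_mode) or tool in READ_ONLY_TOOLS
    AUDIT_LOG.append(
        {"event": "pre", "tool": tool, "args": args, "allowed": allowed, "ts": time.time()}
    )
    return allowed


def post_tool_use(tool: str, result: Any) -> None:
    """Log the outcome of a tool call that was allowed to run."""
    AUDIT_LOG.append(
        {"event": "post", "tool": tool, "result": repr(result), "ts": time.time()}
    )


def run_tool(
    tool: str,
    args: dict[str, Any],
    impl: Callable[..., Any],
    plan_mode: bool = False,
) -> Any:
    """Wrap a tool implementation with the two hooks."""
    if not pre_tool_use(tool, args, plan_mode):
        return None  # blocked: in plan mode only read-only tools run
    result = impl(**args)
    post_tool_use(tool, result)
    return result
```

In this sketch every attempted call leaves a "pre" entry, including blocked ones, which is the property an audit trail usually needs.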
Background task queues — fire tasks, collect PRs
Use Codex: of the frameworks compared here, the only one built for async, parallel cloud execution. Submit 20 tasks simultaneously across different repos; each runs in an isolated sandbox and returns a PR. The underlying model is RL-trained on real software engineering tasks to produce clean, human-style diffs. Best when you want agents working in the background while the team does other things.
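The fire-and-collect pattern can be sketched with `asyncio`. This is only an illustration of the concurrency shape, assuming each task is an independent coroutine: `run_in_sandbox`, `fan_out`, and `TaskResult` are hypothetical names, the sleep stands in for remote work, and nothing here calls Codex.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TaskResult:
    repo: str
    pr_url: str  # placeholder value; no real PR is opened


async def run_in_sandbox(repo: str, prompt: str) -> TaskResult:
    """Hypothetical stand-in for one isolated cloud task."""
    await asyncio.sleep(0.01)  # simulates remote work
    return TaskResult(repo=repo, pr_url=f"https://example.invalid/{repo}/pull/1")


async def fan_out(tasks: list[tuple[str, str]]) -> list[TaskResult]:
    """Start every task at once and wait for all results."""
    # asyncio.gather returns results in the same order as the inputs.
    return await asyncio.gather(*(run_in_sandbox(repo, prompt) for repo, prompt in tasks))


if __name__ == "__main__":
    jobs = [(f"repo-{i}", "fix the flaky test") for i in range(20)]
    results = asyncio.run(fan_out(jobs))
    print(len(results), "results collected")
```

Because the tasks share no state, total wall-clock time in this sketch is close to that of the slowest single task rather than the sum of all of them.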
Cost-sensitive, local, or air-gapped deployments
Use Pi: MIT licensed, model-agnostic (Anthropic, OpenAI, Gemini, Mistral, Ollama, and more), and a system prompt under 1,000 tokens, roughly 10× less system-prompt overhead than Claude Code's ~10,000. Fork it, embed it, run it fully offline with Ollama. Best when data sovereignty matters, budget is tight, or you need full control over the framework itself.
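The overhead claim is simple arithmetic, sketched below under one loud assumption: the full system prompt is resent on every turn and no prompt caching applies. The 1,000 and 10,000 figures are the approximate sizes quoted on this page (1,000 is an upper bound for Pi); `prompt_overhead` and the 50-turn session are hypothetical.

```python
def prompt_overhead(system_prompt_tokens: int, turns: int) -> int:
    """Total system-prompt tokens sent over a session.

    Assumes the prompt is resent each turn with no caching.
    """
    return system_prompt_tokens * turns


PI_PROMPT = 1_000        # upper bound quoted for Pi
LARGE_PROMPT = 10_000    # approximate figure quoted for Claude Code
TURNS = 50               # hypothetical session length

pi_total = prompt_overhead(PI_PROMPT, TURNS)
large_total = prompt_overhead(LARGE_PROMPT, TURNS)
ratio = large_total / pi_total  # independent of the number of turns

if __name__ == "__main__":
    print(f"Pi: {pi_total} tokens, larger prompt: {large_total} tokens, ratio: {ratio:.0f}x")
```

The ratio does not depend on session length, but the absolute saving does; with prompt caching the cost difference may be smaller than the raw token counts suggest.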
Custom multi-agent systems on Google Cloud
Use Google ADK: a full graph-based agent orchestration framework (Python, TypeScript, Go, Java) with 100+ enterprise connectors (SAP, Salesforce, Workday, BigQuery), an A2A (agent-to-agent) communication protocol, and native Vertex AI deployment. Best when building bespoke agent workflows on GCP that integrate with existing enterprise systems.
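Graph orchestration, in its simplest form, means running agent steps in dependency order and passing state along the edges. The sketch below uses only Python's standard `graphlib` module and is not ADK's API: the node functions `fetch`, `summarize`, and `review`, the `NODES` and `EDGES` tables, and `run_graph` are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable

State = dict[str, str]


def fetch(state: State) -> State:
    """Hypothetical agent step: pull source data."""
    return {**state, "data": "raw records"}


def summarize(state: State) -> State:
    """Hypothetical agent step: condense what fetch produced."""
    return {**state, "summary": f"summary of {state['data']}"}


def review(state: State) -> State:
    """Hypothetical agent step: approve only if a summary exists."""
    return {**state, "approved": "yes" if "summary" in state else "no"}


NODES: dict[str, Callable[[State], State]] = {
    "fetch": fetch,
    "summarize": summarize,
    "review": review,
}

# Each node maps to the set of nodes that must run before it.
EDGES: dict[str, set[str]] = {
    "fetch": set(),
    "summarize": {"fetch"},
    "review": {"summarize"},
}


def run_graph(nodes: dict[str, Callable[[State], State]],
              edges: dict[str, set[str]],
              state: State) -> State:
    """Run every node once, predecessors first, threading state through."""
    for name in TopologicalSorter(edges).static_order():
        state = nodes[name](state)
    return state


if __name__ == "__main__":
    print(run_graph(NODES, EDGES, {}))
```

A production framework adds branching, retries, and parallel execution of independent nodes on top of this ordering; the sketch shows only the ordering itself.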