MCP Builder

Turn an API into tools an agent can actually use.

An MCP server’s quality isn’t its endpoint count — it’s whether an LLM can finish a real task with it. A thin wrapper over every REST route often leaves the agent worse off than no tools at all. MCP Builder is a four-phase guide that designs tools around tasks, types every input and output, and proves the result with an evaluation suite before you ship.

Anthropic · TypeScript SDK · FastMCP · Zod / Pydantic · MCP Inspector

Four phases

Research & plan → Implement → Review & test → Evaluate
  • Research & plan — study the MCP spec and the target API; decide tool coverage.
  • Implement — shared API client, typed tools, response formatting, pagination.
  • Review & test — build, lint, and probe with the MCP Inspector.
  • Evaluate — 10 realistic questions that prove an LLM can do real work with the server.

Key concept — tools around tasks, not endpoints

The design question is never “what endpoints exist?” but “what will an agent try to do?”

  • Workflow tools bundle a multi-step task into one call, while comprehensive coverage of the underlying API gives the agent room to compose its own sequences. When uncertain, prefer coverage.
  • Discoverable names — consistent, action-oriented prefixes like github_create_issue, github_list_repos.
  • Actionable errors — every error names a likely cause and a next step, so the agent can recover instead of stalling.
  • Focused results — concise descriptions, filtering and pagination, so a tool call doesn’t flood the context window.
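
The "actionable errors" principle above can be sketched as a small helper that turns an upstream HTTP failure into a message with a likely cause and a next step. This is a minimal illustration, not part of any SDK: the names ToolError and explain_http_error, and the advice attached to each status code, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ToolError:
    """What went wrong, why it probably happened, and what to try next."""
    message: str
    likely_cause: str
    next_step: str

    def render(self) -> str:
        return (
            f"Error: {self.message} "
            f"Likely cause: {self.likely_cause}. "
            f"Next step: {self.next_step}."
        )


# Hypothetical mapping from upstream status codes to recovery advice.
_ADVICE = {
    401: ("the API token is missing or expired",
          "ask the user to supply a valid token"),
    404: ("the repository does not exist or the token cannot see it",
          "call github_list_repos to confirm the exact name"),
    429: ("the rate limit was exceeded",
          "wait before retrying, or request fewer results per page"),
}

_FALLBACK = ("the upstream API returned an unexpected response",
             "retry once, then report the status code to the user")


def explain_http_error(status: int, action: str) -> str:
    """Build an error string the agent can act on instead of a bare status."""
    cause, step = _ADVICE.get(status, _FALLBACK)
    return ToolError(f"{action} failed with HTTP {status}.", cause, step).render()


print(explain_http_error(404, "github_create_issue"))
```

Pointing the agent at another tool by name (here github_list_repos) gives it a concrete recovery path rather than a dead end.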

Worked example — a GitHub server

  1. Plan tools around tasks: github_create_issue, github_list_repos, github_search_code.
  2. Type the input with Zod (TS) or Pydantic (Python) — constraints, descriptions, an example per field.
  3. Annotate each tool: readOnlyHint, destructiveHint, idempotentHint, openWorldHint.
  4. Test with npx @modelcontextprotocol/inspector.
  5. Evaluate — write 10 read-only questions that each need several tool calls, solve them yourself, and store the answers for verification.
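
Steps 2 and 3 can be sketched without any third-party packages. The dataclass below is only a stand-in for a Pydantic or Zod schema so the example runs on the standard library alone; the class name, the field set, and the 256-character title limit are illustrative assumptions. The four annotation keys are the ones listed in step 3, and the values are one plausible reading for a tool that creates issues.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateIssueInput:
    """Input for a hypothetical github_create_issue tool."""
    repo: str        # "owner/name", e.g. "octocat/hello-world"
    title: str       # 1 to 256 characters, e.g. "Crash on empty config file"
    body: str = ""   # optional longer description

    def __post_init__(self) -> None:
        # Validate eagerly so the agent gets a precise message, not an API 422.
        owner, sep, name = self.repo.partition("/")
        if not (owner and sep and name) or "/" in name:
            raise ValueError(
                "repo must look like 'owner/name', e.g. 'octocat/hello-world'"
            )
        if not 1 <= len(self.title) <= 256:
            raise ValueError("title must be 1 to 256 characters")


# Assumed hint values for a tool that creates a new issue on each call.
CREATE_ISSUE_ANNOTATIONS = {
    "readOnlyHint": False,     # it writes data
    "destructiveHint": False,  # it adds data and does not delete or overwrite
    "idempotentHint": False,   # calling twice creates two issues
    "openWorldHint": True,     # it talks to an external service
}

ok = CreateIssueInput(repo="octocat/hello-world", title="Crash on empty config file")
print(ok.repo, CREATE_ISSUE_ANNOTATIONS["readOnlyHint"])
```

In a real server the same constraints and examples would live in the Zod or Pydantic field definitions, so that they also appear in the schema the agent sees.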

Under the hood

Per-tool checklist

Element          What to provide
Input schema     Zod / Pydantic, with constraints and field-level examples
Output schema    Define outputSchema; return structuredContent (TS SDK)
Description      Concise summary, parameter docs, return shape
Implementation   Async I/O, pagination, actionable error messages
Annotations      readOnlyHint · destructiveHint · idempotentHint · openWorldHint
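
The pagination item in the checklist can be illustrated with a small helper that returns one page plus the metadata an agent needs to decide whether to ask for more. This is a sketch under assumptions: the function name, the field names (has_more, next_offset), and the 100-item cap are hypothetical choices, not something prescribed by the MCP SDKs.

```python
from typing import Any


def paginate(items: list[dict[str, Any]], offset: int = 0, limit: int = 20) -> dict[str, Any]:
    """Return one page of results in a predictable, structured shape."""
    if offset < 0 or not 1 <= limit <= 100:
        raise ValueError("offset must be >= 0 and limit must be between 1 and 100")
    page = items[offset:offset + limit]
    next_offset = offset + len(page)
    has_more = next_offset < len(items)
    return {
        "total": len(items),
        "count": len(page),
        "offset": offset,
        "items": page,
        "has_more": has_more,
        # Tell the agent exactly which offset to request next, if any.
        "next_offset": next_offset if has_more else None,
    }


repos = [{"name": f"repo-{i}"} for i in range(45)]
first = paginate(repos, offset=0, limit=20)
print(first["count"], first["has_more"], first["next_offset"])  # 20 True 20
```

A dictionary of this shape is the kind of value a tool could return as structured output, with the same fields declared in its output schema.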

Recommended stack

  • Language — TypeScript: strong SDK, good in execution environments like MCPB, and models generate well-typed, lintable TS reliably. Python via FastMCP is fully supported too.
  • Transport — streamable HTTP with stateless JSON for remote servers (simpler to scale than stateful sessions); stdio for local servers.
  • Reference files ship with the skill: mcp_best_practices.md, node_mcp_server.md, python_mcp_server.md, evaluation.md — loaded only as needed.

Evaluation format

Ten questions, each independent · read-only · complex · realistic · verifiable · stable. Verifiable means a single answer that string-compares cleanly; stable means the answer won’t drift over time.

<evaluation>
<qa_pair>
<question>Find discussions about AI model launches with animal codenames. One needed an ASL-X safety designation. What number X was set for the model named after a spotted wild cat?</question>
<answer>3</answer>
</qa_pair>
</evaluation>
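
A file in this format can be loaded and checked with a few lines of standard-library Python. The sketch below is only one possible verifier: the function names and the short sample question are assumptions made for illustration, and the comparison is the plain string match described above, applied after trimming whitespace.

```python
import xml.etree.ElementTree as ET


def load_qa_pairs(xml_text: str) -> list[tuple[str, str]]:
    """Parse an <evaluation> document into (question, answer) pairs."""
    root = ET.fromstring(xml_text)
    pairs = []
    for qa in root.iter("qa_pair"):
        question = (qa.findtext("question") or "").strip()
        answer = (qa.findtext("answer") or "").strip()
        pairs.append((question, answer))
    return pairs


def is_correct(expected: str, actual: str) -> bool:
    """Plain string comparison after trimming surrounding whitespace."""
    return expected.strip() == actual.strip()


# Hypothetical sample in the same shape as the format shown above.
SAMPLE = """<evaluation>
<qa_pair>
<question>Which ASL level was set for the model named after a spotted wild cat?</question>
<answer>3</answer>
</qa_pair>
</evaluation>"""

for question, expected in load_qa_pairs(SAMPLE):
    print(is_correct(expected, " 3\n"))  # True
```

Because the check is an exact match, answers such as "three" or "ASL-3" would fail, which is why each question should pin down a single answer format.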