Disclosure: RunAICode.ai may earn a commission when you purchase through links on this page. This doesn’t affect our reviews or rankings. We only recommend tools we’ve tested and believe in. Learn more.


# Claude Code Agent SDK: Build Custom AI Coding Agents in 2026

The era of manually writing boilerplate, reviewing PRs line-by-line, and generating test suites by hand is ending. With the Claude Code Agent SDK, developers can build autonomous AI coding agents that handle these tasks programmatically, using the same model that powers Claude Code itself.

This guide walks you through everything: from understanding the SDK architecture to building real-world agents for code review and test generation. Whether you want to automate your CI pipeline or build an internal dev tool, this is where you start.

## What Is the Claude Code Agent SDK?

The Claude Code Agent SDK is Anthropic’s official toolkit for building custom AI agents on top of Claude. Available in both Python and TypeScript, the SDK provides structured primitives for creating agents that can reason, use tools, delegate to sub-agents, and operate in autonomous loops.

Unlike calling the Claude API directly and parsing raw text responses, the Agent SDK gives you:

- Structured tool definitions instead of hand-rolled response parsing
- Automatic conversation turn management
- Sub-agent delegation for specialized tasks
- First-class MCP server integration
- Human-in-the-loop approval policies

Think of it this way: the Claude API gives you a brain. The Agent SDK gives you a brain with hands, eyes, and a plan.

If you are new to Claude Code itself, start with our complete Claude Code guide before diving into the SDK.

## Architecture Overview

The SDK follows a layered architecture:

```
+------------------------------------------+
|            Your Application              |
+------------------------------------------+
|          Agent SDK (Python/TS)           |
+------------------------------------------+
|         Claude API (Messages API)        |
+------------------------------------------+
|    MCP Servers   |   External Services   |
+------------------------------------------+
```

## Core Concepts

| Concept | Description |
|---|---|
| Agent | A configured Claude instance with a system prompt, tools, and behavior rules |
| Tool | A function the agent can call (file read, shell exec, API call, custom logic) |
| Sub-agent | A specialized agent that a parent agent can delegate specific tasks to |
| Conversation Turn | One cycle of: user message -> model reasoning -> tool calls -> result |
| MCP Server | An external service that exposes tools and resources via the Model Context Protocol |
| Human-in-the-loop | A checkpoint where the agent pauses for human approval before proceeding |
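The Conversation Turn concept above can be sketched as a plain Python loop. This is an illustrative stand-in, not SDK code: `fake_model` simulates a model that requests one tool call and then answers.

```python
def run_turns(user_message, tools, model, max_turns=5):
    """Run the cycle: user message -> model reasoning -> tool calls -> result."""
    history = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        action = model(history)
        if action["type"] == "tool_call":
            # The model asked for a tool; execute it and record the result.
            result = tools[action["name"]](**action["args"])
            history.append({"role": "tool", "content": result})
        else:
            # The model produced a final answer; the turn loop ends.
            return action["content"]
    return None  # ran out of turns

# A toy model: asks for one tool, then answers using the tool result.
def fake_model(history):
    if not any(m["role"] == "tool" for m in history):
        return {"type": "tool_call", "name": "add", "args": {"a": 2, "b": 3}}
    return {"type": "final", "content": f"The sum is {history[-1]['content']}"}

answer = run_turns("What is 2 + 3?", {"add": lambda a, b: a + b}, fake_model)
```

The real SDK manages this loop for you; the sketch only shows why `max_turns` exists as a safety valve.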

## Getting Started

### Prerequisites

- An Anthropic API key
- A recent Python or Node.js runtime, depending on which SDK you use
- Git, for the version-control examples below

### Installation

Python:

```bash
pip install anthropic-agent-sdk
```

TypeScript:

```bash
npm install @anthropic-ai/agent-sdk
```

### API Key Setup

Set your API key as an environment variable:

```bash
export ANTHROPIC_API_KEY="sk-ant-your-key-here"
```

Or pass it directly when initializing the SDK (not recommended for production).
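A small fail-fast helper (hypothetical, not part of the SDK) catches a missing or malformed key at startup instead of deep inside an agent run:

```python
import os

def require_api_key(var="ANTHROPIC_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running the agent")
    if not key.startswith("sk-ant-"):
        raise RuntimeError(f"{var} does not look like an Anthropic key")
    return key
```

Call it once at the top of your agent script so configuration errors surface immediately.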

Security note: When developing agents that handle API keys and access remote services, protect your development environment too. If you work from multiple locations or public networks, a VPN like NordVPN prevents credential interception during development and testing.

## Your First Agent

Here is the simplest possible agent — one that answers questions about code:

Python:

```python
from anthropic_agent_sdk import Agent, tool

@tool
def read_file(path: str) -> str:
    """Read a file from the local filesystem."""
    with open(path, "r") as f:
        return f.read()

agent = Agent(
    name="code-helper",
    model="claude-sonnet-4-20250514",
    system_prompt="You are a helpful coding assistant. Use the read_file tool to examine code when asked.",
    tools=[read_file],
)

response = agent.run("What does the main function do in ./src/app.py?")
print(response.text)
```

TypeScript:

```typescript
import { Agent, tool } from "@anthropic-ai/agent-sdk";
import { readFileSync } from "fs";

const readFile = tool({
  name: "read_file",
  description: "Read a file from the local filesystem.",
  parameters: { path: { type: "string", description: "File path to read" } },
  execute: async ({ path }) => readFileSync(path, "utf-8"),
});

const agent = new Agent({
  name: "code-helper",
  model: "claude-sonnet-4-20250514",
  systemPrompt: "You are a helpful coding assistant. Use the read_file tool to examine code when asked.",
  tools: [readFile],
});

const response = await agent.run("What does the main function do in ./src/app.py?");
console.log(response.text);
```

That is a functional agent in under 20 lines. Now let’s build something real.

## Building a Code Review Agent

This step-by-step tutorial builds an agent that reviews pull request diffs and provides structured feedback.

### Step 1: Define the Tools

The agent needs to read diffs, access file contents, and post comments:

```python
from anthropic_agent_sdk import Agent, tool
import subprocess

@tool
def get_git_diff(base_branch: str = "main") -> str:
    """Get the git diff between the current branch and the base branch."""
    result = subprocess.run(
        ["git", "diff", f"{base_branch}...HEAD"],
        capture_output=True, text=True
    )
    return result.stdout

@tool
def read_file(path: str) -> str:
    """Read a source file to understand full context around a change."""
    with open(path, "r") as f:
        return f.read()

@tool
def list_changed_files(base_branch: str = "main") -> str:
    """List all files changed relative to the base branch."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base_branch}...HEAD"],
        capture_output=True, text=True
    )
    return result.stdout
```

### Step 2: Configure the Agent

```python
code_reviewer = Agent(
    name="code-reviewer",
    model="claude-sonnet-4-20250514",
    system_prompt="""You are a senior software engineer performing a code review.

For each changed file:

1. Read the full diff with get_git_diff
2. Read the complete file with read_file for context
3. Identify issues in these categories:
   - Bugs and logic errors
   - Security vulnerabilities
   - Performance concerns
   - Code style and readability
   - Missing error handling
   - Missing tests

Output a structured review with severity levels (critical/warning/suggestion)
and specific line references.""",
    tools=[get_git_diff, read_file, list_changed_files],
    max_turns=20,  # Allow enough turns for a thorough review
)
```

### Step 3: Run the Review

```python
review = code_reviewer.run(
    "Review all changes on this branch against main. "
    "Focus on bugs, security issues, and missing error handling."
)
print(review.text)
```
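If downstream tooling needs machine-readable findings, one option is to ask the agent to emit one `severity: message` line per issue and parse that. The line format here is an assumption for illustration, not something the SDK produces:

```python
import re

def parse_findings(review_text):
    """Parse lines like 'critical: SQL injection in query()' into dicts."""
    findings = []
    for line in review_text.splitlines():
        m = re.match(r"^(critical|warning|suggestion):\s*(.+)$", line.strip(), re.I)
        if m:
            findings.append({"severity": m.group(1).lower(), "message": m.group(2)})
    return findings

# Example review text in the assumed one-finding-per-line format.
sample = """critical: unvalidated input passed to subprocess
warning: missing error handling in read_file
Some narrative text the agent added.
suggestion: rename tmp to temp_path"""
findings = parse_findings(sample)
```

Structured findings make it easy to fail CI on criticals while merely logging suggestions.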

### Step 4: Integrate with GitHub

To post review comments back to a PR, add a GitHub tool:

```python
import requests
import os

@tool
def post_pr_comment(pr_number: int, body: str) -> str:
    """Post a review comment on a GitHub pull request."""
    token = os.environ["GITHUB_TOKEN"]
    repo = os.environ["GITHUB_REPOSITORY"]
    resp = requests.post(
        f"https://api.github.com/repos/{repo}/pulls/{pr_number}/reviews",
        headers={"Authorization": f"Bearer {token}"},
        json={"body": body, "event": "COMMENT"},
    )
    return f"Comment posted: {resp.status_code}"
```

Add this tool to the agent’s tool list and it can autonomously post reviews to your PRs.

## Building a Test Generation Agent

Automated test generation is one of the highest-value applications for AI coding agents. Here is a complete implementation.

For a broader look at AI-powered testing, see our guide on AI-powered test generation.

```python
from anthropic_agent_sdk import Agent, tool
import os
import subprocess

@tool
def read_file(path: str) -> str:
    """Read a source file."""
    with open(path, "r") as f:
        return f.read()

@tool
def write_file(path: str, content: str) -> str:
    """Write content to a file."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(content)
    return f"Written to {path}"

@tool
def run_tests(test_path: str) -> str:
    """Run tests and return the output."""
    result = subprocess.run(
        ["pytest", test_path, "-v", "--tb=short"],
        capture_output=True, text=True, timeout=60
    )
    return f"Exit code: {result.returncode}\n\nSTDOUT:\n{result.stdout}\n\nSTDERR:\n{result.stderr}"

@tool
def list_directory(path: str) -> str:
    """List files in a directory."""
    entries = os.listdir(path)
    return "\n".join(entries)

test_generator = Agent(
    name="test-generator",
    model="claude-sonnet-4-20250514",
    system_prompt="""You are a test engineering expert. Given a source file:

1. Read and understand the source code
2. Identify all public functions, classes, and methods
3. Generate comprehensive tests covering:
   - Happy path cases
   - Edge cases (empty inputs, None values, boundary conditions)
   - Error cases (invalid inputs, exceptions)
   - Integration points (mocked external dependencies)
4. Write the test file using pytest conventions
5. Run the tests to verify they pass
6. If tests fail, read the error output, fix the tests, and re-run

Continue iterating until all tests pass. Use descriptive test names
that explain what is being tested.""",
    tools=[read_file, write_file, run_tests, list_directory],
    max_turns=30,
)

# Generate tests for a specific module
result = test_generator.run(
    "Generate comprehensive unit tests for ./src/services/payment.py. "
    "Save them to ./tests/test_payment.py."
)
```

The key here is the autonomous loop: the agent writes tests, runs them, reads failures, fixes issues, and repeats until the suite passes. This is the pattern that makes AI agents fundamentally different from one-shot API calls.
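Stripped of the model, that loop is just: generate, run, feed failures back, repeat. A sketch with stubbed stand-ins (`toy_generate` and `toy_run` are hypothetical placeholders for model and pytest calls):

```python
def fix_until_green(generate, run_suite, max_iterations=5):
    """Iterate: generate tests, run them, feed failures back, until green."""
    feedback = None
    for attempt in range(1, max_iterations + 1):
        code = generate(feedback)         # model writes (or fixes) the tests
        passed, output = run_suite(code)  # e.g. pytest exit code plus logs
        if passed:
            return attempt, code
        feedback = output                 # failures become the next prompt
    raise RuntimeError("tests still failing after max_iterations")

# Toy stand-ins: the first attempt "fails", the second "passes".
def toy_generate(feedback):
    return "v2" if feedback else "v1"

def toy_run(code):
    return (code == "v2", "AssertionError in test_v1")

attempt, final = fix_until_green(toy_generate, toy_run)
```

The `max_iterations` cap plays the same role as `max_turns` in the agent: it bounds how long a stuck loop can burn tokens.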

## Agent Patterns

### Pattern 1: Autonomous Loop

The agent works independently until the task is complete. Best for well-defined tasks with clear success criteria.

```python
autonomous_agent = Agent(
    name="auto-fixer",
    model="claude-sonnet-4-20250514",
    system_prompt="Fix all linting errors. Run the linter after each fix. Stop when zero errors remain.",
    tools=[read_file, write_file, run_shell_command],
    max_turns=50,
)
```

### Pattern 2: Human-in-the-Loop

The agent pauses at critical checkpoints for human approval. Essential for production deployments and destructive operations.

```python
from anthropic_agent_sdk import Agent, HumanApproval

agent = Agent(
    name="db-migrator",
    model="claude-sonnet-4-20250514",
    system_prompt="Generate and apply database migrations.",
    tools=[read_schema, generate_migration, apply_migration],
    approval_policy=HumanApproval(
        require_approval_for=["apply_migration"]  # Pause before destructive ops
    ),
)
```
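The same policy can be expressed without the SDK as a plain wrapper; `approve` here is any callable that asks a human and returns a boolean. This is a sketch of the pattern, not the SDK's implementation:

```python
def with_approval(tool_fn, tool_name, protected, approve):
    """Wrap a tool so calls to protected tools require human sign-off first."""
    def guarded(*args, **kwargs):
        if tool_name in protected and not approve(tool_name, args, kwargs):
            return f"{tool_name} skipped: human rejected the call"
        return tool_fn(*args, **kwargs)
    return guarded

applied = []
def apply_migration(name):
    applied.append(name)
    return f"applied {name}"

# Auto-reject everything to simulate a human saying "no".
guarded = with_approval(apply_migration, "apply_migration",
                        {"apply_migration"}, lambda *a: False)
result = guarded("2026_03_add_index")
```

The key property: a rejected call never reaches the underlying tool, so the destructive side effect simply does not happen.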

### Pattern 3: Multi-Agent Delegation

A coordinator agent delegates subtasks to specialized sub-agents. This is powerful for complex workflows.

```python
reviewer = Agent(
    name="reviewer",
    model="claude-sonnet-4-20250514",
    system_prompt="You review code for bugs and security issues.",
    tools=[read_file, get_git_diff],
)

test_writer = Agent(
    name="test-writer",
    model="claude-sonnet-4-20250514",
    system_prompt="You write unit tests for changed code.",
    tools=[read_file, write_file, run_tests],
)

coordinator = Agent(
    name="pr-pipeline",
    model="claude-sonnet-4-20250514",
    system_prompt="""You coordinate PR review:

1. Delegate code review to the reviewer sub-agent
2. Delegate test generation to the test-writer sub-agent
3. Combine their outputs into a final summary""",
    sub_agents=[reviewer, test_writer],
)

result = coordinator.run("Process the PR on this branch.")
```

## Integrating with Existing Tools

### GitHub Actions

Run your agent as a CI step:

```yaml
# .github/workflows/ai-review.yml
name: AI Code Review
on: [pull_request]

jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install anthropic-agent-sdk
      - run: python scripts/run_review_agent.py
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```

### Jira Integration

Add a tool that creates Jira tickets from agent findings:

```python
import os

@tool
def create_jira_ticket(summary: str, description: str, priority: str = "Medium") -> str:
    """Create a Jira ticket for a code issue found during review."""
    from jira import JIRA
    jira = JIRA(server=os.environ["JIRA_URL"], basic_auth=(
        os.environ["JIRA_USER"], os.environ["JIRA_TOKEN"]
    ))
    issue = jira.create_issue(
        project="DEV",
        summary=summary,
        description=description,
        issuetype={"name": "Bug"},
        priority={"name": priority},
    )
    return f"Created {issue.key}: {summary}"
```

### CI/CD Pipeline Integration

Agents work well as quality gates in your deployment pipeline. Run a security review agent before merge, a test generation agent on new modules, and a documentation agent after feature branches land.
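A merge gate ultimately reduces to mapping findings to a CI exit code. A minimal sketch (the finding dicts are illustrative, not an SDK type):

```python
def gate_exit_code(findings, fail_on=("critical",)):
    """Return 1 (block the merge) if any finding has a blocking severity, else 0."""
    blocking = [f for f in findings if f["severity"] in fail_on]
    return 1 if blocking else 0

findings = [
    {"severity": "suggestion", "message": "rename variable"},
    {"severity": "critical", "message": "secret committed in config"},
]
code = gate_exit_code(findings)
```

Your CI script exits with this code, and the pipeline's existing pass/fail machinery does the rest. Widen `fail_on` to `("critical", "warning")` for stricter branches.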

## MCP Server Integration

The Model Context Protocol (MCP) lets your agents connect to external services through a standardized interface. Instead of building custom tools for every service, you can connect to MCP servers that expose tools automatically.

For a deep dive on building your own MCP servers, see our guide to building MCP servers.

```python
import os
from anthropic_agent_sdk import Agent, MCPServer

# Connect to an MCP server that provides database tools
db_server = MCPServer(
    name="postgres-mcp",
    command="npx",
    args=["-y", "@anthropic-ai/mcp-server-postgres", os.environ["DATABASE_URL"]],
)

# Connect to a GitHub MCP server
github_server = MCPServer(
    name="github-mcp",
    command="npx",
    args=["-y", "@anthropic-ai/mcp-server-github"],
    env={"GITHUB_TOKEN": os.environ["GITHUB_TOKEN"]},
)

agent = Agent(
    name="full-stack-agent",
    model="claude-sonnet-4-20250514",
    system_prompt="You can query the database and interact with GitHub repositories.",
    mcp_servers=[db_server, github_server],
)
```

The agent automatically discovers all tools exposed by the MCP servers and can use them alongside its built-in tools.
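Under the hood, MCP tool discovery is a JSON-RPC 2.0 exchange: the client sends a `tools/list` request and the server replies with its tool schemas. A sketch of the request an SDK client would send on your behalf:

```python
import json

def tools_list_request(request_id=1):
    """Build the MCP tools/list JSON-RPC 2.0 request used for tool discovery."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/list",
    }

wire = json.dumps(tools_list_request())
```

You never construct this by hand when using the SDK; it is shown only to demystify what "automatic discovery" means on the wire.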

### TypeScript MCP Example

```typescript
import { Agent, MCPServer } from "@anthropic-ai/agent-sdk";

const filesystemServer = new MCPServer({
  name: "filesystem",
  command: "npx",
  args: ["-y", "@anthropic-ai/mcp-server-filesystem", "/path/to/project"],
});

const agent = new Agent({
  name: "fs-agent",
  model: "claude-sonnet-4-20250514",
  systemPrompt: "You help users explore and understand codebases.",
  mcpServers: [filesystemServer],
});
```

## Deployment Options

### Local Development

Run agents directly on your machine. Best for prototyping and personal automation:

```bash
python run_agent.py
```

### Cloud Deployment

Deploy as an API service behind your existing infrastructure. A minimal FastAPI wrapper:

```python
from fastapi import FastAPI
from anthropic_agent_sdk import Agent

app = FastAPI()
agent = Agent(name="api-agent", model="claude-sonnet-4-20250514", ...)

@app.post("/review")
async def review(payload: dict):
    result = agent.run(payload["prompt"])
    return {"review": result.text, "turns": result.turn_count}
```

### CI Pipeline

The most practical deployment for many teams. Run agents as CI jobs triggered by PRs, pushes, or schedules. See the GitHub Actions example above.

### Choosing the Right Deployment

| Approach | Best For | Latency | Cost |
|---|---|---|---|
| Local | Personal dev tools, prototyping | Low | API calls only |
| Cloud (API) | Team-wide tools, internal platforms | Medium | API + hosting |
| CI Pipeline | Automated review, testing, quality gates | Higher | API + CI minutes |

## Pricing Considerations

Agent SDK usage is billed through the standard Claude API pricing. For a full breakdown of Claude Code subscription tiers and how they compare, see our Claude Code pricing guide. Key factors:

- Model choice: Opus, Sonnet, or Haiku
- Input size: diffs, file contents, and tool results all count as tokens
- Number of conversation turns each run takes to finish

### Cost Optimization Tips

  1. Use Sonnet for most agents — it handles code tasks well at a fraction of Opus pricing
  2. Set max_turns conservatively to prevent runaway agents
  3. Cache tool results when the same file is read multiple times
  4. Use prompt caching for system prompts that do not change between runs
  5. Filter diffs before sending to the agent — exclude lock files, generated code, and binary changes
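Tip 5 can be a simple path filter applied before the diff ever reaches the agent. A sketch that assumes unified `git diff` output; extend the exclude patterns for your repo:

```python
import fnmatch

EXCLUDE = ["*.lock", "package-lock.json", "dist/*", "*.min.js", "*.png"]

def filter_diff(diff_text, exclude=EXCLUDE):
    """Drop per-file diff sections whose path matches an excluded pattern."""
    kept, skip = [], False
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            # Each file section starts with a "diff --git a/... b/..." header.
            path = line.split(" b/")[-1]
            skip = any(fnmatch.fnmatch(path, p) for p in exclude)
        if not skip:
            kept.append(line)
    return "\n".join(kept)

sample = """diff --git a/src/app.py b/src/app.py
+real change
diff --git a/package-lock.json b/package-lock.json
+thousands of generated lines"""
filtered = filter_diff(sample)
```

Run the diff through this before handing it to `get_git_diff`'s consumer, and lock-file churn never costs you a token.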

A typical code review agent run costs $0.05-0.30 depending on diff size and model choice. At scale, budget roughly $50-150/month for a team of 10 developers running reviews on every PR.
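That monthly estimate is plain arithmetic; a helper makes the assumptions explicit (the PR volume and per-run cost below are placeholder numbers to adjust for your team):

```python
def monthly_review_cost(devs, prs_per_dev_week, cost_per_run, weeks=4.33):
    """Estimate monthly spend for running a review agent on every PR."""
    runs = devs * prs_per_dev_week * weeks
    return runs * cost_per_run

# 10 devs, an assumed ~7 PRs/dev/week, at the $0.30 upper-bound run cost
estimate = monthly_review_cost(10, 7, 0.30)
```

Plug in your own PR volume and observed per-run cost; the point is that spend scales linearly with run count, so filtering and caching pay off directly.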

## Best Practices

### Security

- Never hardcode API keys; load them from environment variables or a secrets manager
- Sandbox shell execution (containers or restricted users) so a bad tool call cannot damage the host
- Require human approval for destructive operations such as migrations, deletes, and deploys

### Reliability

- Set max_turns conservatively to prevent runaway agent loops
- Log every tool call and model turn so failed runs can be replayed and debugged
- Add timeouts to tools that shell out or call the network

### Performance

- Use prompt caching for system prompts that do not change between runs
- Cache tool results when the same file is read multiple times in one run
- Filter large, low-value inputs (lock files, generated code) before they reach the agent

## Frequently Asked Questions

### What is the Claude Code Agent SDK?

The Claude Code Agent SDK is Anthropic’s official Python and TypeScript library for building custom AI agents powered by Claude. It provides structured primitives for defining agents with tools, sub-agents, conversation management, and human-in-the-loop controls, enabling developers to build autonomous coding workflows.

### How is the Agent SDK different from the Claude API?

The Claude API provides raw access to Claude’s language capabilities. The Agent SDK builds on top of the API to add agent-specific features: structured tool definitions, automatic conversation turn management, sub-agent delegation, MCP server integration, and approval policies. Think of the API as the engine and the SDK as the complete vehicle.

### Which programming languages are supported?

The Agent SDK is officially available in Python and TypeScript. Both implementations have feature parity. Python is more common in data science and DevOps contexts, while TypeScript is popular for web-based developer tools.

### How much does it cost to run an AI coding agent?

Costs depend on model choice, input size, and number of turns. A typical code review agent run on Claude Sonnet costs $0.05-0.30. For a team of 10 developers running reviews on every PR, expect $50-150/month. Use prompt caching and input filtering to reduce costs.

### Can I use the Agent SDK with MCP servers?

Yes. MCP (Model Context Protocol) integration is a first-class feature. You can connect agents to any MCP server — database, GitHub, filesystem, or custom services — and the agent automatically discovers and uses the exposed tools.

### Is the Agent SDK suitable for production use?

Yes, with appropriate guardrails. Use human-in-the-loop approval for destructive operations, set max_turns limits, implement logging and monitoring, and sandbox shell execution. Many teams run Agent SDK-powered tools in CI pipelines today.

### Can agents call other agents?

Yes. The SDK supports sub-agent delegation, where a coordinator agent delegates specialized tasks to child agents. This is useful for complex workflows like PR processing, where you might have separate agents for code review, test generation, and documentation updates.

### What models work with the Agent SDK?

The SDK works with all Claude models: Opus (deepest reasoning), Sonnet (best balance of speed and capability), and Haiku (fastest, most affordable). Most coding agents perform well on Sonnet.

Last updated: March 24, 2026. For reusable prompt templates that do not require code, see our Claude Code skills guide. To see how the Agent SDK fits into the broader ecosystem, read our roundup of agentic AI coding tools in 2026. Have questions about building AI coding agents? Join the RunAICode Discord for help and discussion.