
# ClaudeCode - DSPy Module for Claude Code SDK

A DSPy module that wraps the Claude Code Python SDK with a signature-driven interface. Each agent instance maintains one stateful conversation session, which makes it well suited to multi-turn agentic workflows.

## Features

- **Signature-driven** - Use DSPy signatures for type safety and clarity
- **Stateful sessions** - Each agent instance = one conversation session
- **Smart schema handling** - Automatically handles str vs Pydantic outputs
- **Rich outputs** - Get typed results + execution trace + token usage
- **Multi-turn conversations** - Context preserved across calls
- **Enhanced prompts** - Automatically includes signature docstrings + InputField/OutputField descriptions for better context
- **Async support** - Both sync and async execution modes
- **Modaic Hub Integration** - Push and pull agents from Modaic Hub

## Installation

```bash
# Install with uv
uv add claude-agent-sdk dspy modaic nest-asyncio

# Or with pip
pip install claude-agent-sdk dspy modaic nest-asyncio
```

**Prerequisites:**
- Python 3.10+
- Claude Code CLI installed (get it from [code.claude.com](https://code.claude.com))
- Anthropic API key set in `ANTHROPIC_API_KEY` environment variable
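
A quick preflight check can confirm these prerequisites before the first run. The sketch below is illustrative only: `check_prerequisites` is a hypothetical helper, not part of this package, and it assumes the CLI is installed under the executable name `claude`.

```python
import os
import shutil
import sys


def check_prerequisites(env=None, cli_name="claude"):
    """Return a list of human-readable problems; an empty list means all checks passed.

    `cli_name` is an assumption about the CLI's executable name.
    """
    env = os.environ if env is None else env
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ is required")
    if shutil.which(cli_name) is None:
        problems.append(f"'{cli_name}' executable not found on PATH")
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(f"Missing prerequisite: {problem}")
```

Passing `env` explicitly keeps the check easy to test without touching the real environment.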

## Quick Start with Modaic Hub (Recommended)

The fastest way to use ClaudeCode is to pull a pre-configured agent from Modaic Hub.

### 1. Set up environment

```bash
# Copy the example file
cp .env.example .env

# Edit .env with your keys
ANTHROPIC_API_KEY="<YOUR_ANTHROPIC_API_KEY>"
MODAIC_TOKEN="<YOUR_MODAIC_TOKEN>"  # Optional, for pushing to hub
```

### 2. Load from Modaic Hub

```python
from modaic import AutoProgram
from pydantic import BaseModel

class FileList(BaseModel):
    files: list[str]

# Load pre-compiled agent from hub
agent = AutoProgram.from_precompiled(
    "farouk1/claude-code",
    config={
        "signature": "message:str -> output:FileList",
        "working_directory": ".",
    }
)

# Use it!
result = agent(message="List Python files here")
print(result.output.files)  # Typed access
print(result.usage)         # Token usage
```

### 3. Override Config Options

```python
# Load with custom configuration
agent = AutoProgram.from_precompiled(
    "farouk1/claude-code",
    config={
        "signature": "message:str -> answer:str",
        "model": "claude-opus-4-5-20251101",
        "permission_mode": "acceptEdits",
        "allowed_tools": ["Read", "Write", "Bash"],
    }
)
```

## Local Development

For local development and creating your own agents:

### Basic String Output

```python
from claude_dspy import ClaudeCode, ClaudeCodeConfig

# Create config
config = ClaudeCodeConfig()

# Create agent
agent = ClaudeCode(
    config,
    signature="message:str -> answer:str",
    working_directory="."
)

# Use it
result = agent(message="What files are in this directory?")
print(result.answer)  # String response
print(result.trace)   # Execution items
print(result.usage)   # Token counts
```

### Structured Output with Pydantic

```python
from claude_dspy import ClaudeCode, ClaudeCodeConfig
from pydantic import BaseModel, Field

class BugReport(BaseModel):
    severity: str = Field(description="critical, high, medium, or low")
    description: str
    affected_files: list[str]

# Create config; the Pydantic output type is declared in the signature
config = ClaudeCodeConfig()

agent = ClaudeCode(
    config,
    signature="message:str -> report:BugReport",
    working_directory="."
)

result = agent(message="Analyze the bug in error.log")
print(result.report.severity)       # Typed access!
print(result.report.affected_files)
```

### Push to Modaic Hub

```python
from claude_dspy import ClaudeCode, ClaudeCodeConfig

# Create your agent
config = ClaudeCodeConfig(model="claude-opus-4-5-20251101")

agent = ClaudeCode(
    config,
    signature="message:str -> answer:str",
    working_directory=".",
    permission_mode="acceptEdits",
)

# Test it locally
result = agent(message="Test my agent")
print(result.answer)

# Push to Modaic Hub
agent.push_to_hub("your-username/your-agent-name")
```

## API Reference

### ClaudeCodeConfig

Configuration object for ClaudeCode agents.

```python
class ClaudeCodeConfig:
    def __init__(
        self,
        model: str = "claude-opus-4-5-20251101",  # Default model
    )
```

**Parameters:**

- **`model`** - Claude model to use (default: `"claude-opus-4-5-20251101"`)

### ClaudeCode

Main agent class.

```python
class ClaudeCode(PrecompiledProgram):
    def __init__(
        self,
        config: ClaudeCodeConfig,
        signature: str | type[Signature],          # Required
        working_directory: str = ".",              # Default: "."
        permission_mode: str | None = None,        # Optional
        allowed_tools: list[str] | None = None,    # Optional
        disallowed_tools: list[str] | None = None, # Optional
        sandbox: dict[str, Any] | None = None,     # Optional
        system_prompt: str | dict | None = None,   # Optional
        api_key: str | None = None,                # Optional (uses env var)
    )
```

**Parameters:**

- **`config`** - ClaudeCodeConfig instance with model configuration
- **`signature`** (required) - DSPy signature defining input/output fields (must have exactly 1 input and 1 output)
- **`working_directory`** - Directory where Claude will execute commands (default: `"."`)
- **`permission_mode`** - Permission mode: `"default"`, `"acceptEdits"`, `"plan"`, `"bypassPermissions"`
- **`allowed_tools`** - List of allowed tool names (e.g., `["Read", "Write", "Bash"]`)
- **`disallowed_tools`** - List of disallowed tool names
- **`sandbox`** - Sandbox configuration dict
- **`system_prompt`** - Custom system prompt or preset config
- **`api_key`** - Anthropic API key (falls back to `ANTHROPIC_API_KEY` env var)

#### Methods

##### `__call__(**kwargs) -> Prediction` (or `forward`)

Execute the agent with an input message.

**Arguments:**
- `**kwargs` - Must contain the input field specified in signature

**Returns:**
- `Prediction` object with:
  - **Typed output field** - Named according to signature (e.g., `result.answer`)
  - **`trace`** - `list[TraceItem]` - Execution trace
  - **`usage`** - `Usage` - Token usage statistics

**Example:**
```python
config = ClaudeCodeConfig()
agent = ClaudeCode(
    config,
    signature="message:str -> answer:str",
    working_directory="."
)

result = agent(message="Hello")
print(result.answer)     # Access typed output
print(result.trace)      # List of execution items
print(result.usage)      # Token usage stats
```

##### `push_to_hub(repo_id: str) -> None`

Push the agent to Modaic Hub.

**Arguments:**
- `repo_id` - Repository ID in format "username/repo-name"

**Example:**
```python
agent.push_to_hub("your-username/your-agent")
```

##### `aforward(**kwargs) -> Prediction`

Async version of `__call__()` for use in async contexts.

**Example:**
```python
async def main():
    config = ClaudeCodeConfig()
    agent = ClaudeCode(
        config,
        signature="message:str -> answer:str",
        working_directory="."
    )
    result = await agent.aforward(message="Hello")
    print(result.answer)
```

#### Properties

##### `session_id: Optional[str]`

Get the session ID for this agent instance.

- Returns `None` until first call
- Persists across multiple calls
- Useful for debugging and logging

**Example:**
```python
config = ClaudeCodeConfig()
agent = ClaudeCode(
    config,
    signature="message:str -> answer:str",
    working_directory="."
)

print(agent.session_id)  # None

result = agent(message="Hello")
print(agent.session_id)  # 'eb1b2f39-e04c-4506-9398-b50053b1fd83'
```

##### `config: ClaudeCodeConfig`

Access to the agent's configuration.

```python
print(agent.config.model)  # 'claude-opus-4-5-20251101'
print(agent.config.working_directory)  # '.'
```

## Usage Patterns

### Pattern 1: Multi-turn Conversation

Each agent instance maintains a stateful session:

```python
from claude_dspy import ClaudeCode, ClaudeCodeConfig

config = ClaudeCodeConfig()
agent = ClaudeCode(
    config,
    signature="message:str -> answer:str",
    working_directory=".",
)

# Turn 1
result1 = agent(message="What's the main bug?")
print(result1.answer)

# Turn 2 - has context from Turn 1
result2 = agent(message="How do we fix it?")
print(result2.answer)

# Turn 3 - has context from Turn 1 + 2
result3 = agent(message="Write tests for the fix")
print(result3.answer)

# All use same session_id
print(agent.session_id)
```

### Pattern 2: Fresh Context

Want a new conversation? Create a new agent:

```python
from claude_dspy import ClaudeCode, ClaudeCodeConfig

config = ClaudeCodeConfig()

# Agent 1 - Task A
agent1 = ClaudeCode(
    config,
    signature="message:str -> answer:str",
    working_directory=".",
)
result1 = agent1(message="Analyze bug in module A")

# Agent 2 - Task B (no context from Agent 1)
agent2 = ClaudeCode(
    config,
    signature="message:str -> answer:str",
    working_directory=".",
)
result2 = agent2(message="Analyze bug in module B")
```

### Pattern 3: Field Descriptions for Enhanced Context

Enhance prompts with signature docstrings and field descriptions - all automatically included in the prompt:

```python
import dspy
from claude_dspy import ClaudeCode, ClaudeCodeConfig

class MySignature(dspy.Signature):
    """Analyze code architecture."""  # Used as task description

    message: str = dspy.InputField(
        desc="Request to process"  # Provides input context
    )
    analysis: str = dspy.OutputField(
        desc="A detailed markdown report with sections: "
        "1) Architecture overview, 2) Key components, 3) Dependencies"  # Guides output format
    )

config = ClaudeCodeConfig()
agent = ClaudeCode(
    config,
    signature=MySignature,
    working_directory=".",
)
result = agent(message="Analyze this codebase")

# The prompt sent to Claude will include:
# 1. Task: "Analyze code architecture." (from docstring)
# 2. Input context: "Request to process" (from InputField desc)
# 3. Your message: "Analyze this codebase"
# 4. Output guidance: "Please produce the following output: A detailed markdown report..." (from OutputField desc)
```

### Pattern 4: Inspecting Execution Trace

Access detailed execution information:

```python
from claude_dspy import ClaudeCode, ClaudeCodeConfig, ToolUseItem, ToolResultItem

config = ClaudeCodeConfig()
agent = ClaudeCode(
    config,
    signature="message:str -> answer:str",
    working_directory=".",
)

result = agent(message="Fix the bug")

# Filter trace by type
tool_uses = [item for item in result.trace if isinstance(item, ToolUseItem)]
for tool in tool_uses:
    print(f"Tool: {tool.tool_name}")
    print(f"Input: {tool.tool_input}")

tool_results = [item for item in result.trace if isinstance(item, ToolResultItem)]
for result_item in tool_results:
    print(f"Result: {result_item.content}")
    print(f"Error: {result_item.is_error}")
```

### Pattern 5: Token Usage Tracking

Monitor API usage:

```python
result = agent(message="...")

print(f"Input tokens: {result.usage.input_tokens}")
print(f"Cached tokens: {result.usage.cached_input_tokens}")
print(f"Output tokens: {result.usage.output_tokens}")
print(f"Total: {result.usage.total_tokens}")
```
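
For multi-turn sessions it can help to sum usage across calls. This is a minimal sketch, assuming each `result.usage` exposes the four integer attributes printed above. `UsageTotals` is a hypothetical helper, and the `SimpleNamespace` objects stand in for real results.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class UsageTotals:
    """Running totals across several agent calls (hypothetical helper)."""
    input_tokens: int = 0
    cached_input_tokens: int = 0
    output_tokens: int = 0
    total_tokens: int = 0

    def add(self, usage) -> None:
        # Field names mirror the attributes printed in Pattern 5.
        self.input_tokens += usage.input_tokens
        self.cached_input_tokens += usage.cached_input_tokens
        self.output_tokens += usage.output_tokens
        self.total_tokens += usage.total_tokens


# Stand-ins for `result.usage` from two turns; real values come from the agent.
turn_usages = [
    SimpleNamespace(input_tokens=100, cached_input_tokens=20, output_tokens=50, total_tokens=150),
    SimpleNamespace(input_tokens=80, cached_input_tokens=60, output_tokens=40, total_tokens=120),
]

totals = UsageTotals()
for usage in turn_usages:
    totals.add(usage)

print(totals.total_tokens)
```

In real code, call `totals.add(result.usage)` after each turn.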

### Pattern 6: Safe Execution with Permissions

Control what the agent can do:

```python
from claude_dspy import ClaudeCode, ClaudeCodeConfig

# Read-only (safest)
config = ClaudeCodeConfig()
agent = ClaudeCode(
    config,
    signature="message:str -> answer:str",
    working_directory=".",
    permission_mode="default",
    allowed_tools=["Read", "Glob", "Grep"],
)

# Auto-accept file edits
config = ClaudeCodeConfig(model="claude-opus-4-5-20251101")
agent = ClaudeCode(
    config,
    signature="message:str -> answer:str",
    working_directory=".",
    permission_mode="acceptEdits",
    allowed_tools=["Read", "Write", "Edit"],
)

# Sandbox mode for command execution
config = ClaudeCodeConfig()
agent = ClaudeCode(
    config,
    signature="message:str -> answer:str",
    working_directory=".",
    sandbox={"enabled": True},
)
```

## Advanced Examples

### Example 1: Code Review Agent

```python
from pydantic import BaseModel, Field
from claude_dspy import ClaudeCode, ClaudeCodeConfig

class CodeReview(BaseModel):
    summary: str = Field(description="High-level summary")
    issues: list[str] = Field(description="List of issues found")
    severity: str = Field(description="critical, high, medium, or low")
    recommendations: list[str] = Field(description="Actionable recommendations")

config = ClaudeCodeConfig(model="claude-opus-4-5-20251101")
agent = ClaudeCode(
    config,
    signature="message:str -> review:CodeReview",
    working_directory="/path/to/project",
    permission_mode="default",
    allowed_tools=["Read", "Glob", "Grep"],
)

result = agent(message="Review the changes in src/main.py")

print(f"Severity: {result.review.severity}")
for issue in result.review.issues:
    print(f"- {issue}")
```

### Example 2: Iterative Debugging

```python
from claude_dspy import ClaudeCode, ClaudeCodeConfig

config = ClaudeCodeConfig()
agent = ClaudeCode(
    config,
    signature="message:str -> response:str",
    working_directory=".",
    permission_mode="acceptEdits",
    allowed_tools=["Read", "Write", "Bash"],
)

# Turn 1: Find the bug
result1 = agent(message="Find the bug in src/calculator.py")
print(result1.response)

# Turn 2: Propose a fix
result2 = agent(message="What's the best way to fix it?")
print(result2.response)

# Turn 3: Implement the fix
result3 = agent(message="Implement the fix")
print(result3.response)

# Turn 4: Write tests
result4 = agent(message="Write tests for the fix")
print(result4.response)
```

### Example 3: Async Usage

```python
import asyncio
from claude_dspy import ClaudeCode, ClaudeCodeConfig

async def main():
    config = ClaudeCodeConfig()
    agent = ClaudeCode(
        config,
        signature="message:str -> answer:str",
        working_directory=".",
    )

    # Use aforward in async context
    result = await agent.aforward(message="Analyze this code")
    print(result.answer)

    # Cleanup
    await agent.disconnect()

asyncio.run(main())
```

## Trace Item Types

When accessing `result.trace`, you'll see various item types:

| Type | Fields | Description |
|------|--------|-------------|
| `AgentMessageItem` | `text`, `model` | Agent's text response |
| `ThinkingItem` | `text`, `model` | Agent's internal reasoning |
| `ToolUseItem` | `tool_name`, `tool_input`, `tool_use_id` | Tool invocation |
| `ToolResultItem` | `tool_name`, `tool_use_id`, `content`, `is_error` | Tool result |
| `ErrorItem` | `message`, `error_type` | Error that occurred |

## How It Works

### Signature → Claude Flow

```
1. Define signature: 'message:str -> answer:str'

2. ClaudeCode validates (must have 1 input, 1 output)

3. __init__ creates ClaudeSDKClient with options

4. forward(message="...") extracts message

5. If output field has desc → append to message

6. If output type ≠ str → generate JSON schema

7. Call client.query(message) with optional output_format

8. Iterate through receive_response(), collect messages

9. Parse response (JSON if Pydantic, str otherwise)

10. Return Prediction(output=..., trace=..., usage=...)
```

### Output Type Handling

**String output:**
```python
sig = dspy.Signature('message:str -> answer:str')
# No schema passed to Claude Code
# Response used as-is
```

**Pydantic output:**
```python
sig = dspy.Signature('message:str -> report:BugReport')
# JSON schema generated from BugReport
# Schema passed to Claude Code via output_format
# Response parsed with BugReport.model_validate_json()
```

### Prompt Building

ClaudeCode automatically builds rich prompts from your signature to provide maximum context to Claude:

```python
class MySignature(dspy.Signature):
    """Analyze code quality."""  # 1. Task description

    message: str = dspy.InputField(
        desc="Path to file or module"  # 2. Input context
    )
    report: str = dspy.OutputField(
        desc="Markdown report with issues and recommendations"  # 3. Output guidance
    )

config = ClaudeCodeConfig()
agent = ClaudeCode(
    config,
    signature=MySignature,
    working_directory="."
)
result = agent(message="Analyze src/main.py")  # 4. Your actual input
```

**The final prompt sent to Claude:**
```
Task: Analyze code quality.

Input context: Path to file or module

Analyze src/main.py

Please produce the following output: Markdown report with issues and recommendations
```

This automatic context enhancement helps Claude better understand:
- **What** the overall task is (docstring)
- **What** the input represents (InputField desc)
- **What** format the output should have (OutputField desc)
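
The assembly shown above can be approximated in a few lines. This sketch reproduces the documented prompt layout and is not the package's actual implementation. `build_prompt` is a hypothetical name, and skipping missing parts is an assumption.

```python
def build_prompt(message, task=None, input_desc=None, output_desc=None):
    """Join the available parts with blank lines, mirroring the layout above."""
    parts = []
    if task:
        parts.append(f"Task: {task}")
    if input_desc:
        parts.append(f"Input context: {input_desc}")
    parts.append(message)
    if output_desc:
        parts.append(f"Please produce the following output: {output_desc}")
    return "\n\n".join(parts)


prompt = build_prompt(
    "Analyze src/main.py",
    task="Analyze code quality.",
    input_desc="Path to file or module",
    output_desc="Markdown report with issues and recommendations",
)
print(prompt)
```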

## Troubleshooting

### Error: "ClaudeCode requires exactly 1 input field"

Your signature has too many or too few fields. ClaudeCode expects exactly one input and one output:

```python
# ❌ Wrong - multiple inputs
sig = dspy.Signature('context:str, question:str -> answer:str')

# ✅ Correct - single input
sig = dspy.Signature('message:str -> answer:str')
```
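
To check a string signature before constructing an agent, the field counts can be estimated with plain string handling. This is a rough sketch: `count_fields` is a hypothetical helper, not DSPy's parser, and it assumes the simple `name:type, ... -> name:type` form with no commas inside type annotations.

```python
def count_fields(signature: str) -> tuple[int, int]:
    """Return (number of inputs, number of outputs) for a string signature."""
    inputs, sep, outputs = signature.partition("->")
    if not sep:
        raise ValueError("signature must contain '->'")

    def split_fields(side: str) -> list[str]:
        # Naive comma split; breaks on types such as dict[str, int].
        return [field.strip() for field in side.split(",") if field.strip()]

    return len(split_fields(inputs)), len(split_fields(outputs))


print(count_fields("context:str, question:str -> answer:str"))  # (2, 1)
print(count_fields("message:str -> answer:str"))                # (1, 1)
```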

### Error: "Failed to parse Claude response as MyModel"

The model returned JSON that doesn't match your Pydantic schema. Check:
1. Schema is valid and clear
2. Field descriptions are helpful
3. Model has enough context to generate correct structure
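
When this error appears, comparing the raw response text with the fields your model expects often pinpoints the mismatch. The following is a stdlib-only sketch: `diagnose_json` is a hypothetical helper, and how you capture the raw response text is left to your setup.

```python
import json


def diagnose_json(raw: str, required_fields: set[str]) -> list[str]:
    """Return a list of problems found in a raw JSON response."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg} at position {exc.pos}"]
    if not isinstance(data, dict):
        return [f"expected a JSON object, got {type(data).__name__}"]
    missing = sorted(required_fields - set(data))
    return [f"missing field: {name}" for name in missing]


# Field names taken from the BugReport model earlier in this README.
fields = {"severity", "description", "affected_files"}
print(diagnose_json('{"severity": "high", "description": "crash"}', fields))
```

This only checks for top-level field presence; type mismatches still need the Pydantic validation error for details.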

### Error: "Claude Code CLI not found"

Install Claude Code CLI:
```bash
# Visit code.claude.com for installation instructions
# or use npm:
npm install -g @anthropic-ai/claude-code
```

### Async event loop issues

Use `aforward()` when already in an async context:

```python
# ❌ Don't do this in async context
async def main():
    result = agent(message="...")  # Can cause issues

# ✅ Do this instead
async def main():
    result = await agent.aforward(message="...")
```
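
If the same code may run both inside and outside an event loop, you can detect the context first. This is a minimal sketch using only `asyncio`; `in_async_context` is a hypothetical helper, not part of this package.

```python
import asyncio


def in_async_context() -> bool:
    """Return True when called from code running inside an event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # get_running_loop() raises RuntimeError when no loop is running.
        return False
    return True


async def check_inside_loop() -> bool:
    return in_async_context()


outside = in_async_context()
inside = asyncio.run(check_inside_loop())
print(outside, inside)
```

When the helper returns `True`, prefer `await agent.aforward(...)`; otherwise the synchronous call is the simpler choice.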

## Design Philosophy

### Why 1 input, 1 output?

ClaudeCode is designed for conversational agentic workflows. The input is always a message/prompt, and the output is always a response. This keeps the interface simple and predictable.

For complex inputs, compose them into the message:

```python
# Instead of: 'context:str, question:str -> answer:str'
message = f"Context: {context}\n\nQuestion: {question}"
result = agent(message=message)
```

### Why stateful sessions?

Agents often need multi-turn context (e.g., "fix the bug" → "write tests for it"). Stateful sessions make this natural without manual history management.

Want fresh context? Create a new agent instance.

### Why return trace + usage?

Observability is critical for agentic systems. You need to know:
- What tools were used
- What the agent was thinking
- How many tokens were consumed
- If any errors occurred

The trace provides full visibility into agent execution.
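
A compact summary of a trace is often enough for logging. The sketch below counts items by class name. The two dataclasses are stand-ins that mimic names from the Trace Item Types table, so swap in the real imports from `claude_dspy` in practice; `summarize_trace` is a hypothetical helper.

```python
from collections import Counter
from dataclasses import dataclass


# Stand-in types for illustration; the real classes come from claude_dspy.
@dataclass
class ToolUseItem:
    tool_name: str


@dataclass
class ErrorItem:
    message: str


def summarize_trace(trace) -> dict[str, int]:
    """Count trace items by their class name."""
    return dict(Counter(type(item).__name__ for item in trace))


trace = [ToolUseItem("Read"), ToolUseItem("Bash"), ErrorItem("timeout")]
print(summarize_trace(trace))
```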

## Comparison with CodexAgent

| Feature | CodexAgent | ClaudeCode |
|---------|-----------|-------------|
| SDK | OpenAI Codex SDK | Claude Code Python SDK |
| Thread management | Built-in thread ID | Session-based (ClaudeSDKClient) |
| Streaming | Yes | Yes (via receive_response) |
| Async support | No | Yes (aforward) |
| Tool types | Codex-specific | Claude Code tools (Bash, Read, Write, etc.) |
| Sandbox | Simple mode enum | Detailed config dict |
| Permission control | Sandbox modes | Permission modes + allowed_tools |
| Configuration | Direct parameters | Config object (ClaudeCodeConfig) |

## Examples Directory

Check out the `examples/` directory for more:

- `basic_string_output.py` - Simple string output
- `pydantic_output.py` - Structured Pydantic output
- `multi_turn_conversation.py` - Multi-turn conversation
- `output_field_description.py` - Using output field descriptions
- `inspect_trace.py` - Inspecting execution trace
- `code_review_agent.py` - Advanced code review agent

## Contributing

Issues and PRs welcome! This is an implementation of Claude Code SDK integration with DSPy.

## License

See LICENSE file.

## Related Documentation

- [Claude Code SDK API Reference](https://docs.claude.com/en/agent-sdk/python)
- [DSPy Documentation](https://dspy-docs.vercel.app/)
- [Claude Code Documentation](https://code.claude.com/docs)

---

**Note:** This is a community implementation of Claude Code SDK integration with DSPy, inspired by the CodexAgent design pattern.