(no commit message)

2025-12-05 13:22:15 -05:00
parent 9b0337d1b8
commit 722e5aba21
6 changed files with 578 additions and 40 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,566 @@
+# ClaudeAgent - DSPy Module for Claude Code SDK
+
+A DSPy module that wraps the Claude Code Python SDK with a signature-driven interface. Each agent instance maintains a stateful conversation session, which suits multi-turn agentic workflows.
+
+## Features
+
+- **Signature-driven** - Use DSPy signatures for type safety and clarity
+- **Stateful sessions** - Each agent instance = one conversation session
+- **Smart schema handling** - Automatically handles str vs Pydantic outputs
+- **Rich outputs** - Get typed results + execution trace + token usage
+- **Multi-turn conversations** - Context preserved across calls
+- **Output field descriptions** - Automatically enhance prompts
+- **Async support** - Both sync and async execution modes
+
+## Installation
+
+```bash
+# Install with uv
+uv add claude-agent-sdk dspy nest-asyncio
+
+# Or with pip
+pip install claude-agent-sdk dspy nest-asyncio
+```
+
+**Prerequisites:**
+- Python 3.10+
+- Claude Code CLI installed (get it from [code.claude.com](https://code.claude.com))
+- Anthropic API key set in `ANTHROPIC_API_KEY` environment variable
+
+## Quick Start
+
+### Basic String Output
+
+```python
+import dspy
+from claude_agent import ClaudeAgent
+
+# Define signature
+sig = dspy.Signature('message:str -> answer:str')
+
+# Create agent
+agent = ClaudeAgent(sig, working_directory=".")
+
+# Use it
+result = agent(message="What files are in this directory?")
+print(result.answer)  # String response
+print(result.trace)   # Execution items
+print(result.usage)   # Token counts
+```
+
+### Structured Output with Pydantic
+
+```python
+from pydantic import BaseModel, Field
+
+class BugReport(BaseModel):
+    severity: str = Field(description="critical, high, medium, or low")
+    description: str
+    affected_files: list[str]
+
+sig = dspy.Signature('message:str -> report:BugReport')
+agent = ClaudeAgent(sig, working_directory=".")
+
+result = agent(message="Analyze the bug in error.log")
+print(result.report.severity)  # Typed access!
+print(result.report.affected_files)
+```
+
+## API Reference
+
+### ClaudeAgent
+
+```python
+class ClaudeAgent(dspy.Module):
+    def __init__(
+        self,
+        signature: str | type[Signature],
+        working_directory: str,
+        model: Optional[str] = None,
+        permission_mode: Optional[str] = None,
+        allowed_tools: Optional[list[str]] = None,
+        disallowed_tools: Optional[list[str]] = None,
+        sandbox: Optional[dict[str, Any]] = None,
+        system_prompt: Optional[str | dict[str, Any]] = None,
+        api_key: Optional[str] = None,
+        **kwargs: Any,
+    )
+```
+
+#### Parameters
+
+**Required:**
+
+- **`signature`** (`str | type[Signature]`)
+  - DSPy signature defining input/output fields
+  - Must have exactly 1 input field and 1 output field
+  - Examples:
+    - String format: `'message:str -> answer:str'`
+    - Class format: `MySignature` (subclass of `dspy.Signature`)
+
+- **`working_directory`** (`str`)
+  - Directory where Claude will execute commands
+  - Example: `"."`, `"/path/to/project"`
+
+**Optional:**
+
+- **`model`** (`Optional[str]`)
+  - Model to use: `"sonnet"`, `"opus"`, `"haiku"`
+  - Default: Claude Code default (typically Sonnet)
+
+- **`permission_mode`** (`Optional[str]`)
+  - Controls permission behavior:
+    - `"default"` - Standard permission checks
+    - `"acceptEdits"` - Auto-accept file edits
+    - `"plan"` - Planning mode (no execution)
+    - `"bypassPermissions"` - Bypass all checks (use with caution!)
+  - Default: `"default"`
+
+- **`allowed_tools`** (`Optional[list[str]]`)
+  - List of allowed tool names
+  - Examples: `["Read", "Write", "Bash", "Glob"]`
+  - Default: All tools allowed
+
+- **`disallowed_tools`** (`Optional[list[str]]`)
+  - List of disallowed tool names
+  - Default: `[]`
+
+- **`sandbox`** (`Optional[dict[str, Any]]`)
+  - Sandbox configuration for command execution
+  - Example: `{"enabled": True, "network": {"allowLocalBinding": True}}`
+  - Default: `None`
+
+- **`system_prompt`** (`Optional[str | dict[str, Any]]`)
+  - Custom system prompt or preset configuration
+  - String: Custom prompt
+  - Dict: Preset config like `{"type": "preset", "preset": "claude_code", "append": "..."}`
+  - Default: `None` (uses Claude Code default)
+
+- **`api_key`** (`Optional[str]`)
+  - Anthropic API key
+  - Falls back to `ANTHROPIC_API_KEY` environment variable
+  - Default: `None`
+
+- **`**kwargs`** - Additional `ClaudeAgentOptions` parameters
+
+#### Methods
+
+##### `forward(**kwargs) -> Prediction`
+
+Execute the agent with an input message.
+
+**Arguments:**
+- `**kwargs` - Must contain the input field specified in signature
+
+**Returns:**
+- `Prediction` object with:
+  - **Typed output field** - Named according to signature (e.g., `result.answer`)
+  - **`trace`** - `list[TraceItem]` - Execution trace
+  - **`usage`** - `Usage` - Token usage statistics
+
+**Example:**
+```python
+result = agent(message="Hello")
+print(result.answer)     # Access typed output
+print(result.trace)      # List of execution items
+print(result.usage)      # Token usage stats
+```
+
+##### `aforward(**kwargs) -> Prediction`
+
+Async version of `forward()` for use in async contexts.
+
+**Example:**
+```python
+async def main():
+    result = await agent.aforward(message="Hello")
+    print(result.answer)
+```
+
+#### Properties
+
+##### `session_id: Optional[str]`
+
+Get the session ID for this agent instance.
+
+- Returns `None` until the first `forward()` call
+- Persists across multiple `forward()` calls
+- Useful for debugging and logging
+
+**Example:**
+```python
+agent = ClaudeAgent(sig, working_directory=".")
+print(agent.session_id)  # None
+
+result = agent(message="Hello")
+print(agent.session_id)  # '0199e95f-2689-7501-a73d-038d77dd7320'
+```
+
+## Usage Patterns
+
+### Pattern 1: Multi-turn Conversation
+
+Each agent instance maintains a stateful session:
+
+```python
+agent = ClaudeAgent(sig, working_directory=".")
+
+# Turn 1
+result1 = agent(message="What's the main bug?")
+print(result1.answer)
+
+# Turn 2 - has context from Turn 1
+result2 = agent(message="How do we fix it?")
+print(result2.answer)
+
+# Turn 3 - has context from Turn 1 + 2
+result3 = agent(message="Write tests for the fix")
+print(result3.answer)
+
+# All use same session_id
+print(agent.session_id)
+```
+
+### Pattern 2: Fresh Context
+
+Want a new conversation? Create a new agent:
+
+```python
+# Agent 1 - Task A
+agent1 = ClaudeAgent(sig, working_directory=".")
+result1 = agent1(message="Analyze bug in module A")
+
+# Agent 2 - Task B (no context from Agent 1)
+agent2 = ClaudeAgent(sig, working_directory=".")
+result2 = agent2(message="Analyze bug in module B")
+```
+
+### Pattern 3: Output Field Descriptions
+
+Enhance prompts with field descriptions:
+
+```python
+class MySignature(dspy.Signature):
+    """Analyze code architecture."""
+
+    message: str = dspy.InputField()
+    analysis: str = dspy.OutputField(
+        desc="A detailed markdown report with sections: "
+        "1) Architecture overview, 2) Key components, 3) Dependencies"
+    )
+
+agent = ClaudeAgent(MySignature, working_directory=".")
+result = agent(message="Analyze this codebase")
+
+# The description is automatically appended to the prompt
+```
+
+### Pattern 4: Inspecting Execution Trace
+
+Access detailed execution information:
+
+```python
+from claude_agent import ToolUseItem, ToolResultItem
+
+result = agent(message="Fix the bug")
+
+# Filter trace by type
+tool_uses = [item for item in result.trace if isinstance(item, ToolUseItem)]
+for tool in tool_uses:
+    print(f"Tool: {tool.tool_name}")
+    print(f"Input: {tool.tool_input}")
+
+tool_results = [item for item in result.trace if isinstance(item, ToolResultItem)]
+for result_item in tool_results:
+    print(f"Result: {result_item.content}")
+    print(f"Error: {result_item.is_error}")
+```
+
+### Pattern 5: Token Usage Tracking
+
+Monitor API usage:
+
+```python
+result = agent(message="...")
+
+print(f"Input tokens: {result.usage.input_tokens}")
+print(f"Cached tokens: {result.usage.cached_input_tokens}")
+print(f"Output tokens: {result.usage.output_tokens}")
+print(f"Total: {result.usage.total_tokens}")
+```
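+
+For multi-turn sessions it can help to keep a running total. The sketch below assumes each `result.usage` reports the tokens of a single call (not stated in this README) and uses a stand-in `TurnUsage` dataclass that only borrows the field names shown above; `accumulate` is a hypothetical helper, not part of the module.
+
+```python
+from dataclasses import dataclass
+
+
+@dataclass
+class TurnUsage:
+    """Stand-in for result.usage; only the field names come from the README."""
+    input_tokens: int = 0
+    cached_input_tokens: int = 0
+    output_tokens: int = 0
+
+
+def accumulate(turns: list[TurnUsage]) -> TurnUsage:
+    """Sum per-turn counts into one running total (hypothetical helper)."""
+    total = TurnUsage()
+    for usage in turns:
+        total.input_tokens += usage.input_tokens
+        total.cached_input_tokens += usage.cached_input_tokens
+        total.output_tokens += usage.output_tokens
+    return total
+
+
+# In real use these values would be read from result1.usage, result2.usage, ...
+session_total = accumulate([TurnUsage(10, 2, 5), TurnUsage(20, 0, 7)])
+print(session_total)
+```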
+
+### Pattern 6: Safe Execution with Permissions
+
+Control what the agent can do:
+
+```python
+# Read-only (safest)
+agent = ClaudeAgent(
+    sig,
+    working_directory=".",
+    permission_mode="default",
+    allowed_tools=["Read", "Glob", "Grep"],
+)
+
+# Auto-accept file edits
+agent = ClaudeAgent(
+    sig,
+    working_directory=".",
+    permission_mode="acceptEdits",
+    allowed_tools=["Read", "Write", "Edit"],
+)
+
+# Sandbox mode for command execution
+agent = ClaudeAgent(
+    sig,
+    working_directory=".",
+    sandbox={"enabled": True},
+)
+```
+
+## Advanced Examples
+
+### Example 1: Code Review Agent
+
+```python
+from pydantic import BaseModel, Field
+
+class CodeReview(BaseModel):
+    summary: str = Field(description="High-level summary")
+    issues: list[str] = Field(description="List of issues found")
+    severity: str = Field(description="critical, high, medium, or low")
+    recommendations: list[str] = Field(description="Actionable recommendations")
+
+sig = dspy.Signature('message:str -> review:CodeReview')
+
+agent = ClaudeAgent(
+    sig,
+    working_directory="/path/to/project",
+    model="sonnet",
+    permission_mode="default",
+    allowed_tools=["Read", "Glob", "Grep"],
+)
+
+result = agent(message="Review the changes in src/main.py")
+
+print(f"Severity: {result.review.severity}")
+for issue in result.review.issues:
+    print(f"- {issue}")
+```
+
+### Example 2: Iterative Debugging
+
+```python
+sig = dspy.Signature('message:str -> response:str')
+agent = ClaudeAgent(
+    sig,
+    working_directory=".",
+    permission_mode="acceptEdits",
+    allowed_tools=["Read", "Write", "Bash"],
+)
+
+# Turn 1: Find the bug
+result1 = agent(message="Find the bug in src/calculator.py")
+print(result1.response)
+
+# Turn 2: Propose a fix
+result2 = agent(message="What's the best way to fix it?")
+print(result2.response)
+
+# Turn 3: Implement the fix
+result3 = agent(message="Implement the fix")
+print(result3.response)
+
+# Turn 4: Write tests
+result4 = agent(message="Write tests for the fix")
+print(result4.response)
+```
+
+### Example 3: Async Usage
+
+```python
+import asyncio
+
+async def main():
+    sig = dspy.Signature('message:str -> answer:str')
+    agent = ClaudeAgent(sig, working_directory=".")
+
+    # Use aforward in async context
+    result = await agent.aforward(message="Analyze this code")
+    print(result.answer)
+
+    # Cleanup
+    await agent.disconnect()
+
+asyncio.run(main())
+```
+
+## Trace Item Types
+
+When accessing `result.trace`, you'll see various item types:
+
+| Type | Fields | Description |
+|------|--------|-------------|
+| `AgentMessageItem` | `text`, `model` | Agent's text response |
+| `ThinkingItem` | `text`, `model` | Agent's internal reasoning |
+| `ToolUseItem` | `tool_name`, `tool_input`, `tool_use_id` | Tool invocation |
+| `ToolResultItem` | `tool_name`, `tool_use_id`, `content`, `is_error` | Tool result |
+| `ErrorItem` | `message`, `error_type` | Error that occurred |
+
+## How It Works
+
+### Signature → Claude Flow
+
+```
+1. Define signature: 'message:str -> answer:str'
+
+2. ClaudeAgent validates (must have 1 input, 1 output)
+
+3. __init__ creates ClaudeSDKClient with options
+
+4. forward(message="...") extracts message
+
+5. If output field has desc → append to message
+
+6. If output type ≠ str → generate JSON schema
+
+7. Call client.query(message) with optional output_format
+
+8. Iterate through receive_response(), collect messages
+
+9. Parse response (JSON if Pydantic, str otherwise)
+
+10. Return Prediction(output=..., trace=..., usage=...)
+```
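+
+Steps 5 and 6 can be sketched in isolation. This is a minimal illustration, not the module's actual code: `build_prompt`, `needs_schema`, and the "Expected output:" wording are hypothetical and chosen only for this example.
+
+```python
+from typing import Optional
+
+
+def build_prompt(message: str, output_desc: Optional[str]) -> str:
+    """Step 5: append the output field description when one is present."""
+    if not output_desc:
+        return message
+    # Hypothetical wording; the real module may phrase this differently.
+    return f"{message}\n\nExpected output: {output_desc}"
+
+
+def needs_schema(output_type: type) -> bool:
+    """Step 6: only non-str output types need a JSON schema."""
+    return output_type is not str
+
+
+print(build_prompt("Analyze this codebase", "A detailed markdown report"))
+print(needs_schema(str), needs_schema(dict))
+```
+
+Keeping the two decisions separate mirrors the flow above: the description only changes the prompt text, while the output type alone decides whether a schema is sent.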
+
+### Output Type Handling
+
+**String output:**
+```python
+sig = dspy.Signature('message:str -> answer:str')
+# No schema passed to Claude Code
+# Response used as-is
+```
+
+**Pydantic output:**
+```python
+sig = dspy.Signature('message:str -> report:BugReport')
+# JSON schema generated from BugReport
+# Schema passed to Claude Code via output_format
+# Response parsed with BugReport.model_validate_json()
+```
+
+## Troubleshooting
+
+### Error: "ClaudeAgent requires exactly 1 input field"
+
+Your signature has too many or too few fields. ClaudeAgent expects exactly one input and one output:
+
+```python
+# ❌ Wrong - multiple inputs
+sig = dspy.Signature('context:str, question:str -> answer:str')
+
+# ✅ Correct - single input
+sig = dspy.Signature('message:str -> answer:str')
+```
+
+### Error: "Failed to parse Claude response as MyModel"
+
+The model returned JSON that doesn't match your Pydantic schema. Check:
+1. Schema is valid and clear
+2. Field descriptions are helpful
+3. Model has enough context to generate correct structure
+
+### Error: "Claude Code CLI not found"
+
+Install Claude Code CLI:
+```bash
+# Visit code.claude.com for installation instructions
+# or use npm:
+npm install -g @anthropic-ai/claude-code
+```
+
+### Async event loop issues
+
+Use `aforward()` when already in an async context:
+
+```python
+# ❌ Don't do this in async context
+async def main():
+    result = agent(message="...")  # Can cause issues
+
+# ✅ Do this instead
+async def main():
+    result = await agent.aforward(message="...")
+```
+
+## Design Philosophy
+
+### Why 1 input, 1 output?
+
+ClaudeAgent is designed for conversational agentic workflows. The input is always a message/prompt, and the output is always a response. This keeps the interface simple and predictable.
+
+For complex inputs, compose them into the message:
+
+```python
+# Instead of: 'context:str, question:str -> answer:str'
+message = f"Context: {context}\n\nQuestion: {question}"
+result = agent(message=message)
+```
+
+### Why stateful sessions?
+
+Agents often need multi-turn context (e.g., "fix the bug" → "write tests for it"). Stateful sessions make this natural without manual history management.
+
+Want fresh context? Create a new agent instance.
+
+### Why return trace + usage?
+
+Observability is critical for agentic systems. You need to know:
+- What tools were used
+- What the agent was thinking
+- How many tokens were consumed
+- If any errors occurred
+
+The trace provides full visibility into agent execution.
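+
+As a rough illustration of that kind of check, the sketch below counts tool calls and collects failed ones. The two dataclasses are stand-ins that copy only the field names from the Trace Item Types table; with the real module you would import `ToolUseItem` and `ToolResultItem` from `claude_agent` instead. `summarize` is a hypothetical helper.
+
+```python
+from collections import Counter
+from dataclasses import dataclass
+from typing import Any
+
+
+@dataclass
+class ToolUseItem:  # stand-in; fields taken from the trace table
+    tool_name: str
+    tool_input: dict[str, Any]
+    tool_use_id: str
+
+
+@dataclass
+class ToolResultItem:  # stand-in; fields taken from the trace table
+    tool_name: str
+    tool_use_id: str
+    content: str
+    is_error: bool
+
+
+def summarize(trace: list[Any]) -> dict[str, Any]:
+    """Count tool invocations by name and list the IDs of failed results."""
+    calls = Counter(i.tool_name for i in trace if isinstance(i, ToolUseItem))
+    failed = [
+        i.tool_use_id
+        for i in trace
+        if isinstance(i, ToolResultItem) and i.is_error
+    ]
+    return {"tool_calls": dict(calls), "failed_tool_use_ids": failed}
+
+
+# Hand-built sample trace; real traces come from result.trace.
+trace = [
+    ToolUseItem("Read", {"file_path": "a.py"}, "t1"),
+    ToolResultItem("Read", "t1", "ok", False),
+    ToolUseItem("Bash", {"command": "pytest"}, "t2"),
+    ToolResultItem("Bash", "t2", "exit 1", True),
+]
+print(summarize(trace))
+```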
+
+## Comparison with CodexAgent
+
+| Feature | CodexAgent | ClaudeAgent |
+|---------|-----------|-------------|
+| SDK | OpenAI Codex SDK | Claude Code Python SDK |
+| Thread management | Built-in thread ID | Session-based (ClaudeSDKClient) |
+| Streaming | Yes | Yes (via receive_response) |
+| Async support | No | Yes (aforward) |
+| Tool types | Codex-specific | Claude Code tools (Bash, Read, Write, etc.) |
+| Sandbox | Simple mode enum | Detailed config dict |
+| Permission control | Sandbox modes | Permission modes + allowed_tools |
+
+## Examples Directory
+
+Check out the `examples/` directory for more:
+
+- `basic_string_output.py` - Simple string output
+- `pydantic_output.py` - Structured Pydantic output
+- `multi_turn_conversation.py` - Multi-turn conversation
+- `output_field_description.py` - Using output field descriptions
+- `inspect_trace.py` - Inspecting execution trace
+- `code_review_agent.py` - Advanced code review agent
+
+## Contributing
+
+Issues and PRs welcome! This is an implementation of Claude Code SDK integration with DSPy.
+
+## License
+
+See LICENSE file.
+
+## Related Documentation
+
+- [Claude Code SDK API Reference](https://docs.claude.com/en/agent-sdk/python)
+- [DSPy Documentation](https://dspy-docs.vercel.app/)
+- [Claude Code Documentation](https://code.claude.com/docs)
+
+---
+
+**Note:** This is a community implementation of Claude Code SDK integration with DSPy, inspired by the CodexAgent design pattern.