# JTBD Idea Validator

A **Jobs to be Done (JTBD)** analysis agent powered by DSPy that validates business ideas through comprehensive framework-based evaluation.

## What it does

This tool performs systematic business idea validation using JTBD methodology:

- **Assumption Deconstruction**: Extract core business assumptions and classify each one (levels 1-3)
- **JTBD Analysis**: Generate 5 distinct job statements, each with its Four Forces (push/pull/anxiety/inertia)
- **Moat Analysis**: Assess competitive advantages using innovation layers
- **Scoring & Judgment**: Evaluate ideas across 5 criteria with detailed rationales
- **Validation Planning**: Create actionable plans for assumption testing

## Quick Start

```bash
# Setup environment
python -m venv .venv && source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -U pip
pip install -e .

# Configure LLM (required)
export OPENAI_API_KEY=...     # for OpenAI models
export ANTHROPIC_API_KEY=...  # for Claude models

# Run analysis on example
python run_direct.py examples/rehab_exercise_tracking_rich.json

# Or specify custom output location
python run_direct.py examples/insurance_photo_ai.json --output custom_reports/
```

## Output Files

The tool generates organized reports in timestamped directories:

- **Gamma Presentations**: `gamma/presentation.md` (Gamma-ready) + `gamma/presentation.html` (preview)
- **CSV Exports**: `csv/` - Structured data for spreadsheet analysis
- **JSON Data**: `json/analysis_data.json` - Raw analysis data
- **Charts**: `assets/` - Radar charts, waterfall charts, and Four Forces diagrams

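
To consume a run programmatically, the JSON export is the easiest entry point. The sketch below assumes reports live under a root directory (here the hypothetical `reports/`) with one subdirectory per run whose names sort chronologically; neither the root name nor the timestamp format is documented here, and `latest_analysis` is an illustrative helper, not part of the tool.

```python
import json
from pathlib import Path
from typing import Any, Optional


def latest_analysis(root: str = "reports") -> Optional[Any]:
    """Load json/analysis_data.json from the newest run directory, if any.

    Assumes run directory names sort chronologically (e.g. ISO-like timestamps).
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return None
    # Keep only run directories that actually contain the JSON export.
    runs = sorted(
        d for d in root_path.iterdir()
        if (d / "json" / "analysis_data.json").is_file()
    )
    if not runs:
        return None
    with open(runs[-1] / "json" / "analysis_data.json", encoding="utf-8") as f:
        return json.load(f)
```

When using `--output custom_reports/`, pass that directory as `root` instead.
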
## Technical Architecture

This implementation uses **DSPy** (Declarative Self-improving Python) for structured LLM interactions through **Signatures** and **Modules**.

### DSPy Signatures

Signatures define input/output schemas for LLM tasks:

```python
from typing import List

import dspy


class DeconstructSig(dspy.Signature):
    """Extract assumptions and classify levels.
    Return JSON list of objects: [{text, level(1..3), confidence, evidence:[]}]"""

    idea: str = dspy.InputField()
    hunches: List[str] = dspy.InputField()
    assumptions_json: str = dspy.OutputField()


class JobsSig(dspy.Signature):
    """Generate 5 distinct JTBD statements with Four Forces each."""

    context: str = dspy.InputField()
    constraints: str = dspy.InputField()
    jobs_json: str = dspy.OutputField()
```

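
Because both signatures return JSON as plain strings, callers need to parse and sanity-check the output. A minimal sketch, assuming the object shape named in the `DeconstructSig` docstring; the `Assumption` dataclass and `parse_assumptions` helper are hypothetical and not taken from the project.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assumption:
    text: str
    level: int  # 1..3, per the signature docstring
    confidence: float
    evidence: List[str] = field(default_factory=list)


def parse_assumptions(assumptions_json: str) -> List[Assumption]:
    """Parse the model's JSON string, skipping malformed entries."""
    try:
        raw = json.loads(assumptions_json)
    except json.JSONDecodeError:
        return []
    if not isinstance(raw, list):
        return []
    parsed = []
    for item in raw:
        if not isinstance(item, dict):
            continue
        try:
            text = str(item["text"])
            level = int(item["level"])
            confidence = float(item.get("confidence", 0.0))
        except (KeyError, TypeError, ValueError):
            continue
        if level not in (1, 2, 3):
            continue  # outside the documented 1..3 range
        evidence = item.get("evidence")
        if not isinstance(evidence, list):
            evidence = []
        parsed.append(Assumption(text, level, confidence, evidence))
    return parsed
```

Dropping malformed entries rather than raising is a design choice for this sketch; a stricter pipeline might prefer to fail loudly.
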
### DSPy Modules

Modules implement the business logic on top of the signatures and can be tuned by DSPy optimizers:

- **`Deconstruct`**: Extracts assumptions with confidence scoring
- **`Jobs`**: Generates JTBD statements with Four Forces analysis
- **`Moat`**: Applies Doblin innovation framework + strategic triggers
- **`JudgeScore`**: Evaluates ideas across 5 standardized criteria:
  - Underserved Opportunity
  - Strategic Impact
  - Market Scale
  - Solution Differentiability
  - Business Model Innovation

### Dual-Judge Arbitration

For more reliable scores, the system runs two independent judge passes and merges them with tie-breaking:

```python
import os

USE_DOUBLE_JUDGE = os.getenv("JTBD_DOUBLE_JUDGE", "1") == "1"  # default ON


def judge_with_arbitration(summary: str):
    if USE_DOUBLE_JUDGE:
        # Two independent judge passes over the same summary
        score1 = JudgeScore()(summary=summary)
        score2 = JudgeScore()(summary=summary)
        return merge_scores(score1, score2)  # tie-breaker logic
    return JudgeScore()(summary=summary)
```

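
The body of `merge_scores` is not shown in this README. One plausible rule, offered purely as an assumption about how such arbitration could work: average the two judges per criterion, and when they disagree by more than a threshold, fall back to the more conservative (lower) score. The dict-of-floats scorecard shape and the `max_gap` parameter are hypothetical.

```python
from typing import Dict


def merge_scores(
    score1: Dict[str, float],
    score2: Dict[str, float],
    max_gap: float = 2.0,
) -> Dict[str, float]:
    """Merge two per-criterion scorecards (hypothetical arbitration rule)."""
    merged = {}
    # Only criteria rated by both judges can be arbitrated.
    for criterion in score1.keys() & score2.keys():
        a, b = score1[criterion], score2[criterion]
        if abs(a - b) > max_gap:
            merged[criterion] = min(a, b)  # large disagreement: be conservative
        else:
            merged[criterion] = (a + b) / 2  # small disagreement: average
    return merged
```
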
## Configuration

**Model Selection**: Edit `plugins/llm_dspy.py` → `configure_lm()` or set `JTBD_DSPY_MODEL`:

```bash
export JTBD_DSPY_MODEL="gpt-4o-mini"                 # OpenAI
export JTBD_DSPY_MODEL="claude-3-5-sonnet-20240620"  # Anthropic
```

**Other Options**:

- `JTBD_LLM_TEMPERATURE=0.2` - Response randomness (0.0-1.0)
- `JTBD_DOUBLE_JUDGE=1` - Enable dual-judge arbitration (default: enabled)

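
These options can be gathered into one typed object at startup. This is a sketch, not the project's `configure_lm()`: the `Settings` dataclass and `load_settings` helper are hypothetical, and only the variable names and default values are taken from this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    model: str
    temperature: float
    double_judge: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read JTBD_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    temperature = float(env.get("JTBD_LLM_TEMPERATURE", "0.2"))
    if not 0.0 <= temperature <= 1.0:
        raise ValueError("JTBD_LLM_TEMPERATURE must be within 0.0-1.0")
    return Settings(
        model=env.get("JTBD_DSPY_MODEL", "gpt-4o-mini"),
        temperature=temperature,
        double_judge=env.get("JTBD_DOUBLE_JUDGE", "1") == "1",
    )
```

Passing a plain mapping instead of reading `os.environ` directly keeps the loader easy to test.
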
## Input Format

Ideas are defined in JSON files with the following structure:

```json
{
  "idea_id": "urn:idea:example:001",
  "title": "Your business idea title",
  "hunches": [
    "Key assumption about the problem",
    "Belief about customer behavior",
    "Market hypothesis"
  ],
  "problem_statement": "Clear description of the problem",
  "solution_overview": "How your idea solves the problem",
  "target_customer": {
    "primary": "Main customer segment",
    "secondary": "Secondary users",
    "demographics": "Age, profession, context"
  },
  "value_propositions": ["Key benefit 1", "Key benefit 2"],
  "competitive_landscape": ["Competitor 1", "Competitor 2"],
  "revenue_streams": ["Revenue model 1", "Revenue model 2"]
}
```

See `examples/` directory for complete examples.

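
Before running the pipeline on a new file, it can help to check that the expected keys are present. A minimal sketch, assuming every top-level key in the documented structure is required (the README does not say which are optional); `missing_fields` is a hypothetical helper, not part of the tool.

```python
import json
from typing import List

# Top-level keys from the documented structure; treating all of them as
# required is an assumption of this sketch.
EXPECTED_KEYS = (
    "idea_id", "title", "hunches", "problem_statement", "solution_overview",
    "target_customer", "value_propositions", "competitive_landscape",
    "revenue_streams",
)


def missing_fields(idea_json: str) -> List[str]:
    """Return the expected top-level keys absent from an idea document."""
    idea = json.loads(idea_json)
    if not isinstance(idea, dict):
        raise ValueError("idea file must contain a JSON object")
    return [key for key in EXPECTED_KEYS if key not in idea]
```
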
## Alternative Execution Methods

### Direct Python Script

```bash
python run_direct.py your_idea.json
```

### FastAPI Service (Optional)

Run as a service with HTTP endpoints:

```bash
uvicorn service.dspy_sidecar:app --port 8088 --reload
```

Exposes endpoints: `/deconstruct`, `/jobs`, `/moat`, `/judge`

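
A client call might look like the sketch below. The request body shape (`{"summary": ...}`), the use of POST, and the `Authorization: Bearer` header format are assumptions; this README names only the endpoint paths, the port, and the optional `API_BEARER_TOKEN`. `build_request` is a hypothetical helper that prepares the request without sending it.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_request(
    endpoint: str,
    payload: Dict[str, Any],
    token: Optional[str] = None,
    base_url: str = "http://localhost:8088",
) -> urllib.request.Request:
    """Prepare a JSON POST for one of the sidecar endpoints (not sent here)."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumed header format for API_BEARER_TOKEN
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=f"{base_url}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_request("/judge", {"summary": "Idea summary text"}, token="secret")
# With the service running: urllib.request.urlopen(req).read()
```
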
### Prefect Flow (Advanced)

For complex orchestration scenarios, the pipeline can run as a flow on the Prefect workflow engine (see `orchestration/`).

## Advanced Features

### Judge Optimization with DSPy

The system supports **compiled judge models** using DSPy's GEPA optimizer (reflective prompt evolution):

```bash
# 1. Add training data to data/judge_train.jsonl
# Format: {"summary": "...", "scorecard": {"criteria":[...], "total": 6.7}}

# 2. Train the judge using GEPA (evolutionary optimizer)
python tools/optimize_judge.py --train data/judge_train.jsonl --out artifacts/judge_compiled.dspy --budget medium

# 3. Use the compiled judge (automatically loaded at runtime)
export JTBD_JUDGE_COMPILED=artifacts/judge_compiled.dspy
python run_direct.py your_idea.json
```

**GEPA** is an evolutionary prompt optimizer that:
- Captures full execution traces of DSPy modules
- Uses reflection to evolve text components (prompts/instructions)
- Allows textual feedback at predictor or system level
- Can outperform reinforcement learning approaches, according to its authors

Excerpt from the implementation in `tools/optimize_judge.py`:

```python
import json

import dspy
from dspy.teleprompt import GEPA


def non_decreasing_metric(example, pred, trace=None, pred_name=None, pred_trace=None):
    """Returns 1 if predicted total >= gold total, else 0."""
    try:
        p = json.loads(pred.scorecard_json)
        g = json.loads(example.scorecard_json)
        return 1.0 if p.get("total", 0) >= g.get("total", 0) else 0.0
    except Exception:
        # Unparseable scorecards count as a miss
        return 0.0


# `budget`, `train`, and `JudgeScoreSig` are defined elsewhere in the script
# Budget options: "light", "medium", "heavy"
tele = GEPA(metric=non_decreasing_metric, auto=budget)
compiled = tele.compile(dspy.Predict(JudgeScoreSig), trainset=train)
```

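
To see what this metric rewards, it can be exercised without an LLM. The sketch below repeats the metric and feeds it stand-in objects built with `types.SimpleNamespace`; the `fake` helper and the sample totals are made up for illustration.

```python
import json
from types import SimpleNamespace


def non_decreasing_metric(example, pred, trace=None, pred_name=None, pred_trace=None):
    """Returns 1 if predicted total >= gold total, else 0."""
    try:
        p = json.loads(pred.scorecard_json)
        g = json.loads(example.scorecard_json)
        return 1.0 if p.get("total", 0) >= g.get("total", 0) else 0.0
    except Exception:
        return 0.0


def fake(total):
    """Stand-in for a DSPy example/prediction carrying a scorecard_json field."""
    return SimpleNamespace(scorecard_json=json.dumps({"criteria": [], "total": total}))


gold = fake(6.7)
results = [
    non_decreasing_metric(gold, fake(7.0)),   # prediction above gold
    non_decreasing_metric(gold, fake(5.0)),   # prediction below gold
    non_decreasing_metric(gold, SimpleNamespace(scorecard_json="not json")),
]
print(results)
```

Any prediction at or above the gold total scores 1.0, so the metric by itself does not penalize over-scoring.
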
At runtime, the compiled judge, when present, replaces the default `dspy.Predict` with the optimized program:

```python
import os
import pickle

import dspy

JUDGE_COMPILED_PATH = os.getenv("JTBD_JUDGE_COMPILED")  # path to the compiled judge, if set

_compiled_judge = None
if JUDGE_COMPILED_PATH and os.path.exists(JUDGE_COMPILED_PATH):
    with open(JUDGE_COMPILED_PATH, "rb") as f:
        _compiled_judge = pickle.load(f)


class JudgeScore(dspy.Module):
    def __init__(self):
        super().__init__()
        self.p = _compiled_judge or dspy.Predict(JudgeScoreSig)  # fallback
```

### Environment Variables

- `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` - API keys for LLM providers
- `JTBD_DSPY_MODEL` - Model name (default: "gpt-4o-mini")
- `JTBD_LLM_TEMPERATURE` - Temperature setting (default: 0.2)
- `JTBD_LLM_SEED` - Random seed for reproducibility (default: 42)
- `JTBD_DOUBLE_JUDGE` - Enable dual-judge arbitration (default: 1)
- `JTBD_JUDGE_COMPILED` - Path to compiled judge model
- `OTEL_SERVICE_NAME` / `DEPLOY_ENV` - Identify the service in OTLP exports (defaults: `jtbd-dspy-sidecar`, `dev`)
- `OTLP_ENDPOINT` / `OTLP_HEADERS` - Configure OTLP HTTP exporter endpoint and optional headers
- `MODAIC_AGENT_ID` / `MODAIC_AGENT_REV` - Load a precompiled Modaic agent instead of the local default
- `MODAIC_TOKEN` - Authentication token for private Modaic repositories
- `RETRIEVER_KIND` / `RETRIEVER_NOTES` - Retriever selection (e.g., `notes`) and seed data for contextual hints
- `API_BEARER_TOKEN` - Optional bearer token required by the FastAPI service
- `STREAM_CHUNK_SIZE` - Chunk size for SSE streaming responses (default: 60)

## Project Structure

```
├── contracts/                # Pydantic models (v1 frozen contracts)
├── core/                     # Main business logic
│   ├── pipeline.py           # Main analysis pipeline
│   ├── score.py              # Scoring algorithms
│   ├── plan.py               # Validation planning
│   └── export_*.py           # Output formatters
├── plugins/                  # External integrations
│   ├── llm_dspy.py           # DSPy LLM interface
│   └── charts_quickchart.py  # Chart generation
├── service/                  # FastAPI service
├── orchestration/            # Prefect flows
├── examples/                 # Sample business ideas
├── tools/                    # Optimization utilities
└── run_direct.py             # Main CLI entry point
```

## Dependencies

- **DSPy**: Language model orchestration framework
- **Pydantic**: Data validation and serialization
- **FastAPI/Uvicorn**: Optional HTTP service
- **Modaic**: Precompiled agent runtime with retriever support
- **OpenTelemetry**: Request tracing + OTLP exporter (service observability)
- **sse-starlette**: Server-Sent Events streaming for OpenAI-compatible responses
- **Prefect**: Optional workflow orchestration
- **Requests**: HTTP client for external services

## Contract Stability

Data contracts in `contracts/*_v1.py` are frozen. To change a contract, create a new `v2` version rather than modifying the existing one, so that backward compatibility is preserved.

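
In practice, this means adding a new versioned model next to the old one. The sketch below uses frozen dataclasses from the standard library as a dependency-free stand-in for the project's Pydantic models; the `IdeaV1`/`IdeaV2` names, their fields, and the `upgrade` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class IdeaV1:  # hypothetical v1 contract: left untouched once released
    idea_id: str
    title: str


@dataclass(frozen=True)
class IdeaV2:  # hypothetical v2 contract: new fields go here instead
    idea_id: str
    title: str
    tags: Tuple[str, ...] = ()


def upgrade(old: IdeaV1) -> IdeaV2:
    """Lift a v1 payload into v2 so existing v1 producers keep working."""
    return IdeaV2(idea_id=old.idea_id, title=old.title)
```

Giving new v2 fields a default keeps the upgrade path from v1 lossless and explicit.
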