265 lines
9.2 KiB
Markdown
265 lines
9.2 KiB
Markdown
|
|
# JTBD Idea Validator
|
|
|
|
A **Jobs to be Done (JTBD)** analysis agent powered by DSPy that validates business ideas through comprehensive framework-based evaluation.
|
|
|
|
## What it does
|
|
|
|
This tool performs systematic business idea validation using JTBD methodology:
|
|
|
|
- **Assumption Deconstruction**: Extract and classify core business assumptions (1-3 levels)
|
|
- **JTBD Analysis**: Generate 5 distinct job statements with Four Forces (push/pull/anxiety/inertia)
|
|
- **Moat Analysis**: Assess competitive advantages using innovation layers
|
|
- **Scoring & Judgment**: Evaluate ideas across 5 criteria with detailed rationales
|
|
- **Validation Planning**: Create actionable plans for assumption testing
|
|
|
|
## Quick Start
|
|
|
|
```bash
|
|
# Setup environment
|
|
python -m venv .venv && source .venv/bin/activate # Windows: .venv\Scripts\activate
|
|
pip install -U pip
|
|
pip install -e .
|
|
|
|
# Configure LLM (required)
|
|
export OPENAI_API_KEY=... # for OpenAI models
|
|
export ANTHROPIC_API_KEY=... # for Claude models
|
|
|
|
# Run analysis on example
|
|
python run_direct.py examples/rehab_exercise_tracking_rich.json
|
|
|
|
# Or specify custom output location
|
|
python run_direct.py examples/insurance_photo_ai.json --output custom_reports/
|
|
```
|
|
|
|
## Output Files
|
|
|
|
The tool generates organized reports in timestamped directories:
|
|
|
|
- **Gamma Presentations**: `gamma/presentation.md` (Gamma-ready) + `gamma/presentation.html` (preview)
|
|
- **CSV Exports**: `csv/` - Structured data for spreadsheet analysis
|
|
- **JSON Data**: `json/analysis_data.json` - Raw analysis data
|
|
- **Charts**: `assets/` - Radar charts, waterfall charts, and Four Forces diagrams
|
|
|
|
## Technical Architecture
|
|
|
|
This implementation uses **DSPy** (Declarative Self-improving Language Programs) for structured LLM interactions through **Signatures** and **Modules**.
|
|
|
|
### DSPy Signatures
|
|
|
|
Signatures define input/output schemas for LLM tasks:
|
|
|
|
```python
|
|
class DeconstructSig(dspy.Signature):
|
|
"""Extract assumptions and classify levels.
|
|
Return JSON list of objects: [{text, level(1..3), confidence, evidence:[]}]"""
|
|
idea: str = dspy.InputField()
|
|
hunches: List[str] = dspy.InputField()
|
|
assumptions_json: str = dspy.OutputField()
|
|
|
|
class JobsSig(dspy.Signature):
|
|
"""Generate 5 distinct JTBD statements with Four Forces each."""
|
|
context: str = dspy.InputField()
|
|
constraints: str = dspy.InputField()
|
|
jobs_json: str = dspy.OutputField()
|
|
```
|
|
|
|
### DSPy Modules
|
|
|
|
Modules implement business logic with automatic prompt optimization:
|
|
|
|
- **`Deconstruct`**: Extracts assumptions with confidence scoring
|
|
- **`Jobs`**: Generates JTBD statements with Four Forces analysis
|
|
- **`Moat`**: Applies Doblin innovation framework + strategic triggers
|
|
- **`JudgeScore`**: Evaluates ideas across 5 standardized criteria:
|
|
- Underserved Opportunity
|
|
- Strategic Impact
|
|
- Market Scale
|
|
- Solution Differentiability
|
|
- Business Model Innovation
|
|
|
|
### Dual-Judge Arbitration
|
|
|
|
The system uses two independent judges with tie-breaking for scoring reliability:
|
|
|
|
```python
|
|
USE_DOUBLE_JUDGE = os.getenv("JTBD_DOUBLE_JUDGE", "1") == "1" # default ON
|
|
|
|
def judge_with_arbitration(summary: str):
|
|
if USE_DOUBLE_JUDGE:
|
|
score1 = JudgeScore()(summary=summary)
|
|
score2 = JudgeScore()(summary=summary)
|
|
return merge_scores(score1, score2) # tie-breaker logic
|
|
return JudgeScore()(summary=summary)
|
|
```
|
|
|
|
## Configuration
|
|
|
|
**Model Selection**: Edit `plugins/llm_dspy.py` → `configure_lm()` or set `JTBD_DSPY_MODEL`:
|
|
|
|
```bash
|
|
export JTBD_DSPY_MODEL="gpt-4o-mini" # OpenAI
|
|
export JTBD_DSPY_MODEL="claude-3-5-sonnet-20240620" # Anthropic
|
|
```
|
|
|
|
**Other Options**:
|
|
|
|
- `JTBD_LLM_TEMPERATURE=0.2` - Response randomness (0.0-1.0)
|
|
- `JTBD_DOUBLE_JUDGE=1` - Enable dual-judge arbitration (default: enabled)
|
|
|
|
## Input Format
|
|
|
|
Ideas are defined in JSON files with the following structure:
|
|
|
|
```json
|
|
{
|
|
"idea_id": "urn:idea:example:001",
|
|
"title": "Your business idea title",
|
|
"hunches": [
|
|
"Key assumption about the problem",
|
|
"Belief about customer behavior",
|
|
"Market hypothesis"
|
|
],
|
|
"problem_statement": "Clear description of the problem",
|
|
"solution_overview": "How your idea solves the problem",
|
|
"target_customer": {
|
|
"primary": "Main customer segment",
|
|
"secondary": "Secondary users",
|
|
"demographics": "Age, profession, context"
|
|
},
|
|
"value_propositions": ["Key benefit 1", "Key benefit 2"],
|
|
"competitive_landscape": ["Competitor 1", "Competitor 2"],
|
|
"revenue_streams": ["Revenue model 1", "Revenue model 2"]
|
|
}
|
|
```
|
|
|
|
See `examples/` directory for complete examples.
|
|
|
|
## Alternative Execution Methods
|
|
|
|
### Direct Python Script
|
|
|
|
```bash
|
|
python run_direct.py your_idea.json
|
|
```
|
|
|
|
### FastAPI Service (Optional)
|
|
|
|
Run as a service with HTTP endpoints:
|
|
|
|
```bash
|
|
uvicorn service.dspy_sidecar:app --port 8088 --reload
|
|
```
|
|
|
|
Exposes endpoints: `/deconstruct`, `/jobs`, `/moat`, `/judge`
|
|
|
|
### Prefect Flow (Advanced)
|
|
|
|
For complex orchestration scenarios using the Prefect workflow engine.
|
|
|
|
## Advanced Features
|
|
|
|
### Judge Optimization with DSPy
|
|
|
|
The system supports **compiled judge models** using DSPy's GEPA optimizer (reflective prompt evolution):
|
|
|
|
```bash
|
|
# 1. Add training data to data/judge_train.jsonl
|
|
# Format: {"summary": "...", "scorecard": {"criteria":[...], "total": 6.7}}
|
|
|
|
# 2. Train the judge using GEPA (evolutionary optimizer)
|
|
python tools/optimize_judge.py --train data/judge_train.jsonl --out artifacts/judge_compiled.dspy --budget medium
|
|
|
|
# 3. Use the compiled judge (automatically loaded at runtime)
|
|
export JTBD_JUDGE_COMPILED=artifacts/judge_compiled.dspy
|
|
python run_direct.py your_idea.json
|
|
```
|
|
|
|
**GEPA** is an evolutionary optimizer for prompt optimization that:
|
|
- Captures full execution traces of DSPy modules
|
|
- Uses reflection to evolve text components (prompts/instructions)
|
|
- Allows textual feedback at predictor or system level
|
|
- Outperforms reinforcement learning approaches
|
|
|
|
From the actual implementation in `tools/optimize_judge.py`:
|
|
|
|
```python
|
|
from dspy.teleprompt import GEPA
|
|
|
|
def non_decreasing_metric(example, pred, trace=None, pred_name=None, pred_trace=None):
|
|
"""Returns 1 if predicted total >= gold total, else 0."""
|
|
try:
|
|
p = json.loads(pred.scorecard_json)
|
|
g = json.loads(example.scorecard_json)
|
|
return 1.0 if p.get("total",0) >= g.get("total",0) else 0.0
|
|
except Exception:
|
|
return 0.0
|
|
|
|
# Budget options: "light", "medium", "heavy"
|
|
tele = GEPA(metric=non_decreasing_metric, auto=budget)
|
|
compiled = tele.compile(dspy.Predict(JudgeScoreSig), trainset=train)
|
|
```
|
|
|
|
The compiled judge replaces the default `dspy.Predict` with an optimized program:
|
|
|
|
```python
|
|
_compiled_judge = None
|
|
if JUDGE_COMPILED_PATH and os.path.exists(JUDGE_COMPILED_PATH):
|
|
with open(JUDGE_COMPILED_PATH, "rb") as f:
|
|
_compiled_judge = pickle.load(f)
|
|
|
|
class JudgeScore(dspy.Module):
|
|
def __init__(self):
|
|
self.p = _compiled_judge or dspy.Predict(JudgeScoreSig) # fallback
|
|
```
|
|
|
|
### Environment Variables
|
|
|
|
- `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` - API keys for LLM providers
|
|
- `JTBD_DSPY_MODEL` - Model name (default: "gpt-4o-mini")
|
|
- `JTBD_LLM_TEMPERATURE` - Temperature setting (default: 0.2)
|
|
- `JTBD_LLM_SEED` - Random seed for reproducibility (default: 42)
|
|
- `JTBD_DOUBLE_JUDGE` - Enable dual-judge arbitration (default: 1)
|
|
- `JTBD_JUDGE_COMPILED` - Path to compiled judge model
|
|
- `OTEL_SERVICE_NAME` / `DEPLOY_ENV` - Identify the service in OTLP exports (defaults: `jtbd-dspy-sidecar`, `dev`)
|
|
- `OTLP_ENDPOINT` / `OTLP_HEADERS` - Configure OTLP HTTP exporter endpoint and optional headers
|
|
- `MODAIC_AGENT_ID` / `MODAIC_AGENT_REV` - Load a precompiled Modaic agent instead of the local default
|
|
- `MODAIC_TOKEN` - Authentication token for private Modaic repositories
|
|
- `RETRIEVER_KIND` / `RETRIEVER_NOTES` - Retriever selection (e.g., `notes`) and seed data for contextual hints
|
|
- `API_BEARER_TOKEN` - Optional bearer token required by the FastAPI service
|
|
- `STREAM_CHUNK_SIZE` - Chunk size for SSE streaming responses (default: 60)
|
|
|
|
## Project Structure
|
|
|
|
```
|
|
├── contracts/ # Pydantic models (v1 frozen contracts)
|
|
├── core/ # Main business logic
|
|
│ ├── pipeline.py # Main analysis pipeline
|
|
│ ├── score.py # Scoring algorithms
|
|
│ ├── plan.py # Validation planning
|
|
│ └── export_*.py # Output formatters
|
|
├── plugins/ # External integrations
|
|
│ ├── llm_dspy.py # DSPy LLM interface
|
|
│ └── charts_quickchart.py # Chart generation
|
|
├── service/ # FastAPI service
|
|
├── orchestration/ # Prefect flows
|
|
├── examples/ # Sample business ideas
|
|
├── tools/ # Optimization utilities
|
|
└── run_direct.py # Main CLI entry point
|
|
```
|
|
|
|
## Dependencies
|
|
|
|
- **DSPy**: Language model orchestration framework
|
|
- **Pydantic**: Data validation and serialization
|
|
- **FastAPI/Uvicorn**: Optional HTTP service
|
|
- **Modaic**: Precompiled agent runtime with retriever support
|
|
- **OpenTelemetry**: Request tracing + OTLP exporter (service observability)
|
|
- **sse-starlette**: Server-Sent Events streaming for OpenAI-compatible responses
|
|
- **Prefect**: Optional workflow orchestration
|
|
- **Requests**: HTTP client for external services
|
|
|
|
## Contract Stability
|
|
|
|
Data contracts in `contracts/*_v1.py` are frozen. For changes, create new `v2` versions rather than modifying existing contracts to ensure backward compatibility.
|