
JTBD Idea Validator

A Jobs to be Done (JTBD) analysis agent powered by DSPy that validates business ideas through comprehensive framework-based evaluation.

What it does

This tool performs systematic business idea validation using JTBD methodology:

  • Assumption Deconstruction: Extract core business assumptions and classify each by level (1-3)
  • JTBD Analysis: Generate 5 distinct job statements with Four Forces (push/pull/anxiety/inertia)
  • Moat Analysis: Assess competitive advantages using innovation layers
  • Scoring & Judgment: Evaluate ideas across 5 criteria with detailed rationales
  • Validation Planning: Create actionable plans for assumption testing

Quick Start

# Setup environment
python -m venv .venv && source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -U pip
pip install -e .

# Configure LLM (required)
export OPENAI_API_KEY=...            # for OpenAI models
export ANTHROPIC_API_KEY=...         # for Claude models

# Run analysis on example
python run_direct.py examples/rehab_exercise_tracking_rich.json

# Or specify custom output location
python run_direct.py examples/insurance_photo_ai.json --output custom_reports/

Output Files

The tool generates organized reports in timestamped directories:

  • Gamma Presentations: gamma/presentation.md (Gamma-ready) + gamma/presentation.html (preview)
  • CSV Exports: csv/ - Structured data for spreadsheet analysis
  • JSON Data: json/analysis_data.json - Raw analysis data
  • Charts: assets/ - Radar charts, waterfall charts, and Four Forces diagrams
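The timestamped layout above can be sketched with the standard library; the subfolder names come from the list, while the `YYYYMMDD_HHMMSS` run-directory naming and the `make_report_dirs` helper are assumptions for illustration:

```python
from datetime import datetime
from pathlib import Path

def make_report_dirs(base: Path, now: datetime) -> Path:
    """Create a timestamped run directory with the report subfolders.

    Only the subfolder names (gamma/, csv/, json/, assets/) come from the
    README; the timestamp format is an assumption.
    """
    run_dir = base / now.strftime("%Y%m%d_%H%M%S")
    for sub in ("gamma", "csv", "json", "assets"):
        (run_dir / sub).mkdir(parents=True, exist_ok=True)
    return run_dir
```

For example, `make_report_dirs(Path("reports"), datetime.now())` yields something like `reports/20251019_141634/` containing the four subfolders.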

Technical Architecture

This implementation uses DSPy (Declarative Self-improving Language Programs) for structured LLM interactions through Signatures and Modules.

DSPy Signatures

Signatures define input/output schemas for LLM tasks:

import dspy
from typing import List

class DeconstructSig(dspy.Signature):
    """Extract assumptions and classify levels.
    Return JSON list of objects: [{text, level(1..3), confidence, evidence:[]}]"""
    idea: str = dspy.InputField()
    hunches: List[str] = dspy.InputField()
    assumptions_json: str = dspy.OutputField()

class JobsSig(dspy.Signature):
    """Generate 5 distinct JTBD statements with Four Forces each."""
    context: str = dspy.InputField()
    constraints: str = dspy.InputField()
    jobs_json: str = dspy.OutputField()
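The assumptions_json string described in the DeconstructSig docstring can be parsed into typed records. A minimal sketch using only the standard library; the Assumption class and parse_assumptions helper are illustrative, not part of the codebase:

```python
import json
from dataclasses import dataclass, field
from typing import List

@dataclass
class Assumption:
    """One entry of the JSON list DeconstructSig asks the LLM to emit."""
    text: str
    level: int            # 1..3, per the signature docstring
    confidence: float
    evidence: List[str] = field(default_factory=list)

def parse_assumptions(assumptions_json: str) -> List[Assumption]:
    """Parse the LLM's JSON output and enforce the 1..3 level range."""
    items = [Assumption(**obj) for obj in json.loads(assumptions_json)]
    for a in items:
        if not 1 <= a.level <= 3:
            raise ValueError(f"level out of range: {a.level}")
    return items
```

Parsing the string output into a typed structure immediately after the LLM call keeps malformed responses from propagating into later pipeline stages.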

DSPy Modules

Modules implement business logic with automatic prompt optimization:

  • Deconstruct: Extracts assumptions with confidence scoring
  • Jobs: Generates JTBD statements with Four Forces analysis
  • Moat: Applies Doblin innovation framework + strategic triggers
  • JudgeScore: Evaluates ideas across 5 standardized criteria:
    • Underserved Opportunity
    • Strategic Impact
    • Market Scale
    • Solution Differentiability
    • Business Model Innovation

Dual-Judge Arbitration

The system uses two independent judges with tie-breaking for scoring reliability:

USE_DOUBLE_JUDGE = os.getenv("JTBD_DOUBLE_JUDGE", "1") == "1"  # default ON

def judge_with_arbitration(summary: str):
    if USE_DOUBLE_JUDGE:
        score1 = JudgeScore()(summary=summary)
        score2 = JudgeScore()(summary=summary) 
        return merge_scores(score1, score2)  # tie-breaker logic
    return JudgeScore()(summary=summary)
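merge_scores is referenced but not shown. One plausible tie-breaker, averaging per-criterion scores and refusing to merge when the judges disagree too widely, might look like this; it is purely illustrative, and the dict-shaped scorecards and max_gap threshold are assumptions:

```python
def merge_scores(score1: dict, score2: dict, max_gap: float = 2.0) -> dict:
    """Average two judge scorecards (hypothetical sketch, not the shipped logic).

    Each scorecard is assumed to map criterion name -> numeric score.
    If the judges disagree by more than max_gap on any criterion, raise
    so the caller can trigger a third, tie-breaking judgment instead.
    """
    merged = {}
    for criterion, a in score1.items():
        b = score2[criterion]
        if abs(a - b) > max_gap:
            raise ValueError(f"judges disagree on {criterion!r}: {a} vs {b}")
        merged[criterion] = (a + b) / 2
    return merged
```

Averaging damps single-judge noise, while the disagreement threshold catches the cases where a simple average would hide a genuine split.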

Configuration

Model Selection: Edit configure_lm() in plugins/llm_dspy.py or set JTBD_DSPY_MODEL:

export JTBD_DSPY_MODEL="gpt-4o-mini"              # OpenAI
export JTBD_DSPY_MODEL="claude-3-5-sonnet-20240620"  # Anthropic

Other Options:

  • JTBD_LLM_TEMPERATURE=0.2 - Response randomness (0.0-1.0)
  • JTBD_DOUBLE_JUDGE=1 - Enable dual-judge arbitration (default: enabled)

Input Format

Ideas are defined in JSON files with the following structure:

{
  "idea_id": "urn:idea:example:001",
  "title": "Your business idea title",
  "hunches": [
    "Key assumption about the problem",
    "Belief about customer behavior",
    "Market hypothesis"
  ],
  "problem_statement": "Clear description of the problem",
  "solution_overview": "How your idea solves the problem",
  "target_customer": {
    "primary": "Main customer segment",
    "secondary": "Secondary users",
    "demographics": "Age, profession, context"
  },
  "value_propositions": ["Key benefit 1", "Key benefit 2"],
  "competitive_landscape": ["Competitor 1", "Competitor 2"],
  "revenue_streams": ["Revenue model 1", "Revenue model 2"]
}

See examples/ directory for complete examples.
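A minimal loader for this schema, sketched with the standard library; the real contracts are the Pydantic models in contracts/, and the IdeaInput name and load_idea helper here are hypothetical:

```python
import json
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class IdeaInput:
    """Field names mirror the JSON structure shown above."""
    idea_id: str
    title: str
    hunches: List[str]
    problem_statement: str
    solution_overview: str
    target_customer: Dict[str, str]
    value_propositions: List[str]
    competitive_landscape: List[str]
    revenue_streams: List[str]

def load_idea(text: str) -> IdeaInput:
    """Parse an idea JSON document, failing fast on missing fields."""
    data = json.loads(text)
    missing = set(IdeaInput.__dataclass_fields__) - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return IdeaInput(**data)
```

Failing fast on missing fields at load time gives a clearer error than letting a half-populated idea reach the analysis pipeline.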

Alternative Execution Methods

Direct Python Script

python run_direct.py your_idea.json

FastAPI Service (Optional)

Run as a service with HTTP endpoints:

uvicorn service.dspy_sidecar:app --port 8088 --reload

Exposes endpoints: /deconstruct, /jobs, /moat, /judge

Prefect Flow (Advanced)

Use the Prefect workflow engine (see orchestration/) for complex orchestration scenarios.

Advanced Features

Judge Optimization with DSPy

The system supports compiled judge models using DSPy's GEPA optimizer (reflective prompt evolution):

# 1. Add training data to data/judge_train.jsonl
# Format: {"summary": "...", "scorecard": {"criteria":[...], "total": 6.7}}

# 2. Train the judge using GEPA (evolutionary optimizer)
python tools/optimize_judge.py --train data/judge_train.jsonl --out artifacts/judge_compiled.dspy --budget medium

# 3. Use the compiled judge (automatically loaded at runtime)
export JTBD_JUDGE_COMPILED=artifacts/judge_compiled.dspy
python run_direct.py your_idea.json

GEPA is an evolutionary prompt optimizer that:

  • Captures full execution traces of DSPy modules
  • Uses reflection to evolve text components (prompts/instructions)
  • Allows textual feedback at predictor or system level
  • Has been reported to outperform reinforcement learning approaches in its authors' evaluations

From the actual implementation in tools/optimize_judge.py:

import json

from dspy.teleprompt import GEPA

def non_decreasing_metric(example, pred, trace=None, pred_name=None, pred_trace=None):
    """Returns 1 if predicted total >= gold total, else 0."""
    try:
        p = json.loads(pred.scorecard_json)
        g = json.loads(example.scorecard_json)
        return 1.0 if p.get("total",0) >= g.get("total",0) else 0.0
    except Exception:
        return 0.0

# Budget options: "light", "medium", "heavy"
tele = GEPA(metric=non_decreasing_metric, auto=budget)
compiled = tele.compile(dspy.Predict(JudgeScoreSig), trainset=train)
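The metric only needs objects carrying a scorecard_json string attribute, so it can be exercised without DSPy at all. A self-contained check, with the metric reproduced from tools/optimize_judge.py above and SimpleNamespace stand-ins for the example and prediction objects:

```python
import json
from types import SimpleNamespace

def non_decreasing_metric(example, pred, trace=None, pred_name=None, pred_trace=None):
    """Reproduced from tools/optimize_judge.py: 1.0 iff predicted total >= gold total."""
    try:
        p = json.loads(pred.scorecard_json)
        g = json.loads(example.scorecard_json)
        return 1.0 if p.get("total", 0) >= g.get("total", 0) else 0.0
    except Exception:
        return 0.0

# Any object with a scorecard_json attribute works; no DSPy needed.
gold    = SimpleNamespace(scorecard_json=json.dumps({"total": 6.7}))
pred_hi = SimpleNamespace(scorecard_json=json.dumps({"total": 7.1}))
pred_lo = SimpleNamespace(scorecard_json=json.dumps({"total": 5.0}))
```

Note the metric also returns 0.0 on malformed JSON, so a judge that stops emitting valid scorecards is penalized rather than crashing the optimization run.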

The compiled judge replaces the default dspy.Predict with an optimized program:

_compiled_judge = None
if JUDGE_COMPILED_PATH and os.path.exists(JUDGE_COMPILED_PATH):
    with open(JUDGE_COMPILED_PATH, "rb") as f:
        _compiled_judge = pickle.load(f)

class JudgeScore(dspy.Module):
    def __init__(self):
        super().__init__()  # dspy.Module subclasses must initialize the base class
        self.p = _compiled_judge or dspy.Predict(JudgeScoreSig)  # fall back to the default predictor

Environment Variables

  • OPENAI_API_KEY / ANTHROPIC_API_KEY - API keys for LLM providers
  • JTBD_DSPY_MODEL - Model name (default: "gpt-4o-mini")
  • JTBD_LLM_TEMPERATURE - Temperature setting (default: 0.2)
  • JTBD_LLM_SEED - Random seed for reproducibility (default: 42)
  • JTBD_DOUBLE_JUDGE - Enable dual-judge arbitration (default: 1)
  • JTBD_JUDGE_COMPILED - Path to compiled judge model
  • OTEL_SERVICE_NAME / DEPLOY_ENV - Identify the service in OTLP exports (defaults: jtbd-dspy-sidecar, dev)
  • OTLP_ENDPOINT / OTLP_HEADERS - Configure OTLP HTTP exporter endpoint and optional headers
  • MODAIC_AGENT_ID / MODAIC_AGENT_REV - Load a precompiled Modaic agent instead of the local default
  • MODAIC_TOKEN - Authentication token for private Modaic repositories
  • RETRIEVER_KIND / RETRIEVER_NOTES - Retriever selection (e.g., notes) and seed data for contextual hints
  • API_BEARER_TOKEN - Optional bearer token required by the FastAPI service
  • STREAM_CHUNK_SIZE - Chunk size for SSE streaming responses (default: 60)
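These variables follow the same `== "1"` convention as the JTBD_DOUBLE_JUDGE snippet earlier. A small helper sketch for reading typed values with defaults; the variable names and defaults come from the list above, while the helper functions themselves are hypothetical:

```python
import os

def env_str(name: str, default: str) -> str:
    return os.getenv(name, default)

def env_float(name: str, default: float) -> float:
    raw = os.getenv(name)
    return float(raw) if raw is not None else default

def env_flag(name: str, default: bool = True) -> bool:
    # Matches the '"1" means enabled' convention used by JTBD_DOUBLE_JUDGE.
    return os.getenv(name, "1" if default else "0") == "1"

MODEL = env_str("JTBD_DSPY_MODEL", "gpt-4o-mini")
TEMPERATURE = env_float("JTBD_LLM_TEMPERATURE", 0.2)
SEED = int(env_str("JTBD_LLM_SEED", "42"))
DOUBLE_JUDGE = env_flag("JTBD_DOUBLE_JUDGE", default=True)
```

Centralizing the defaults this way keeps them consistent between the CLI and the FastAPI service.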

Project Structure

├── contracts/          # Pydantic models (v1 frozen contracts)
├── core/              # Main business logic
│   ├── pipeline.py    # Main analysis pipeline
│   ├── score.py       # Scoring algorithms
│   ├── plan.py        # Validation planning
│   └── export_*.py    # Output formatters
├── plugins/           # External integrations
│   ├── llm_dspy.py    # DSPy LLM interface
│   └── charts_quickchart.py  # Chart generation
├── service/           # FastAPI service
├── orchestration/     # Prefect flows
├── examples/          # Sample business ideas
├── tools/            # Optimization utilities
└── run_direct.py     # Main CLI entry point

Dependencies

  • DSPy: Language model orchestration framework
  • Pydantic: Data validation and serialization
  • FastAPI/Uvicorn: Optional HTTP service
  • Modaic: Precompiled agent runtime with retriever support
  • OpenTelemetry: Request tracing + OTLP exporter (service observability)
  • sse-starlette: Server-Sent Events streaming for OpenAI-compatible responses
  • Prefect: Optional workflow orchestration
  • Requests: HTTP client for external services

Contract Stability

Data contracts in contracts/*_v1.py are frozen. For changes, create new v2 versions rather than modifying existing contracts to ensure backward compatibility.
