(no commit message)

2025-10-19 14:16:34 -05:00
parent d33cef379c
commit c1847586be
13 changed files with 884 additions and 1 deletions
--- a/README.md
+++ b/README.md
@@ -1,2 +1,264 @@
 # jtbd-agent
 # JTBD Idea Validator
 A **Jobs to be Done (JTBD)** analysis agent powered by DSPy that validates business ideas through comprehensive framework-based evaluation.
 ## What it does
 This tool performs systematic business idea validation using JTBD methodology:
 - **Assumption Deconstruction**: Extract core business assumptions and classify each at level 1-3 (1 = observed, 2 = educated, 3 = strategic)
 - **JTBD Analysis**: Generate 5 distinct job statements with Four Forces (push/pull/anxiety/inertia)
 - **Moat Analysis**: Assess competitive advantages using innovation layers
 - **Scoring & Judgment**: Evaluate ideas across 5 criteria with detailed rationales
 - **Validation Planning**: Create actionable plans for assumption testing
 ## Quick Start
 ```bash
 # Setup environment
 python -m venv .venv && source .venv/bin/activate  # Windows: .venv\Scripts\activate
 pip install -U pip
 pip install -e .
 # Configure LLM (required)
 export OPENAI_API_KEY=...            # for OpenAI models
 export ANTHROPIC_API_KEY=...         # for Claude models
 # Run analysis on example
 python run_direct.py examples/rehab_exercise_tracking_rich.json
 # Or specify custom output location
 python run_direct.py examples/insurance_photo_ai.json --output custom_reports/
 ```
 ## Output Files
 The tool generates organized reports in timestamped directories:
 - **Gamma Presentations**: `gamma/presentation.md` (Gamma-ready) + `gamma/presentation.html` (preview)
 - **CSV Exports**: `csv/` - Structured data for spreadsheet analysis
 - **JSON Data**: `json/analysis_data.json` - Raw analysis data
 - **Charts**: `assets/` - Radar charts, waterfall charts, and Four Forces diagrams
 ## Technical Architecture
 This implementation uses **DSPy** (Declarative Self-improving Python) for structured LLM interactions through **Signatures** and **Modules**.
 ### DSPy Signatures
 Signatures define input/output schemas for LLM tasks:
 ```python
 class DeconstructSig(dspy.Signature):
    """Extract assumptions and classify levels.
    Return JSON list of objects: [{text, level(1..3), confidence, evidence:[]}]"""
    idea: str = dspy.InputField()
    hunches: List[str] = dspy.InputField()
    assumptions_json: str = dspy.OutputField()
 class JobsSig(dspy.Signature):
    """Generate 5 distinct JTBD statements with Four Forces each."""
    context: str = dspy.InputField()
    constraints: str = dspy.InputField()
    jobs_json: str = dspy.OutputField()
 ```
 ### DSPy Modules
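 Each module wraps one signature in a `dspy.Predict` call, parses the returned JSON, and post-processes it (clamping, de-duplication, defaults) into the frozen contract models. Prompt optimization is optional and currently applies to the judge only: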
 - **`Deconstruct`**: Extracts assumptions with confidence scoring
 - **`Jobs`**: Generates JTBD statements with Four Forces analysis
 - **`Moat`**: Applies Doblin innovation framework + strategic triggers
 - **`JudgeScore`**: Evaluates ideas across 5 standardized criteria:
  - Underserved Opportunity
  - Strategic Impact  
  - Market Scale
  - Solution Differentiability
  - Business Model Innovation
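As a rough illustration of how `JudgeScore` post-processes a raw scorecard, the sketch below mirrors the logic in `plugins/llm_dspy.py`: criteria outside the fixed list are dropped, scores are clamped to 0-10, missing criteria default to 5.0, and the total is the mean. Plain dicts stand in for the Pydantic `Criterion` model, and the helper name `normalize_scorecard` is hypothetical.

```python
CRITERIA = [
    "Underserved Opportunity",
    "Strategic Impact",
    "Market Scale",
    "Solution Differentiability",
    "Business Model Innovation",
]

def normalize_scorecard(raw: dict) -> dict:
    """Filter, clamp, and fill criteria, then average them (hypothetical helper)."""
    crits = []
    for item in raw.get("criteria", []):
        name = item.get("name")
        if name not in CRITERIA:
            continue  # ignore criteria outside the fixed list
        score = max(0.0, min(10.0, float(item.get("score", 5.0))))
        crits.append({"name": name, "score": score, "rationale": item.get("rationale", "")})
    # Fill any missing criteria so the scorecard always has the same shape.
    present = {c["name"] for c in crits}
    for name in CRITERIA:
        if name not in present:
            crits.append({"name": name, "score": 5.0, "rationale": "defaulted"})
    total = round(sum(c["score"] for c in crits) / len(crits), 2)
    return {"criteria": crits, "total": total}

card = normalize_scorecard({"criteria": [
    {"name": "Market Scale", "score": 12, "rationale": "large market"},  # clamped to 10.0
    {"name": "Made-up Criterion", "score": 1},                            # dropped
]})
print(card["total"])
```

Because missing criteria are filled with 5.0, a sparse or malformed LLM response pulls the total toward the midpoint rather than failing outright.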
 ### Dual-Judge Arbitration
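 The system runs the judge twice and merges the two scorecards per criterion: scores within 1.5 points of each other are averaged, otherwise the lower score is kept. A simplified view: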
 ```python
 USE_DOUBLE_JUDGE = os.getenv("JTBD_DOUBLE_JUDGE", "1") == "1"  # default ON
 def judge_with_arbitration(summary: str):
    if USE_DOUBLE_JUDGE:
        score1 = JudgeScore()(summary=summary)
        score2 = JudgeScore()(summary=summary) 
        return merge_scores(score1, score2)  # tie-breaker logic
    return JudgeScore()(summary=summary)
 ```
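The per-criterion tie-breaker in `plugins/llm_dspy.py` can be sketched in isolation as follows. The function name `arbitrate` and the `max_gap` parameter are hypothetical; the 1.5-point threshold and the "average or take the lower score" rule come from the source.

```python
def arbitrate(score1: float, score2: float, max_gap: float = 1.5) -> float:
    """Average two judge scores when they roughly agree; otherwise keep the lower one."""
    if abs(score1 - score2) <= max_gap:
        return round((score1 + score2) / 2.0, 1)
    return round(min(score1, score2), 1)

print(arbitrate(7.0, 8.0))  # judges agree within 1.5 points, so the scores are averaged
print(arbitrate(4.0, 9.0))  # judges disagree, so the lower score wins
```

Choosing the lower score on disagreement makes the merged scorecard conservative: an idea only scores high on a criterion when both judges rate it high.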
 ## Configuration
 **Model Selection**: Edit `plugins/llm_dspy.py` → `configure_lm()` or set `JTBD_DSPY_MODEL`:
 ```bash
 export JTBD_DSPY_MODEL="gpt-4o-mini"              # OpenAI
 export JTBD_DSPY_MODEL="claude-3-5-sonnet-20240620"  # Anthropic
 ```
 **Other Options**:
 - `JTBD_LLM_TEMPERATURE=0.2` - Response randomness (0.0-1.0)
 - `JTBD_DOUBLE_JUDGE=1` - Enable dual-judge arbitration (default: enabled)
 ## Input Format
 Ideas are defined in JSON files with the following structure:
 ```json
 {
  "idea_id": "urn:idea:example:001",
  "title": "Your business idea title",
  "hunches": [
    "Key assumption about the problem",
    "Belief about customer behavior",
    "Market hypothesis"
  ],
  "problem_statement": "Clear description of the problem",
  "solution_overview": "How your idea solves the problem",
  "target_customer": {
    "primary": "Main customer segment",
    "secondary": "Secondary users",
    "demographics": "Age, profession, context"
  },
  "value_propositions": ["Key benefit 1", "Key benefit 2"],
  "competitive_landscape": ["Competitor 1", "Competitor 2"],
  "revenue_streams": ["Revenue model 1", "Revenue model 2"]
 }
 ```
 See `examples/` directory for complete examples.
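It can help to sanity-check an idea file before running the pipeline. The sketch below is not part of the repository: the helper `missing_keys` is hypothetical, and the list of expected keys is an assumption taken from the structure above (the pipeline may treat some of them as optional).

```python
import json

# Assumption: these top-level keys come from the example structure above;
# the repository may not require all of them.
EXPECTED_KEYS = ("idea_id", "title", "hunches", "problem_statement", "solution_overview")

def missing_keys(idea: dict, expected=EXPECTED_KEYS) -> list:
    """Return the expected keys that are absent or empty in an idea dict."""
    return [key for key in expected if not idea.get(key)]

idea = json.loads('{"idea_id": "urn:idea:example:001", "title": "Demo", "hunches": []}')
print(missing_keys(idea))
```

An empty `hunches` list is reported as missing here on purpose, since the deconstruction step has little to work with when no hunches are supplied.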
 ## Alternative Execution Methods
 ### Direct Python Script
 ```bash
 python run_direct.py your_idea.json
 ```
 ### FastAPI Service (Optional)
 Run as a service with HTTP endpoints:
 ```bash
 uvicorn service.dspy_sidecar:app --port 8088 --reload
 ```
 Exposes endpoints: `/deconstruct`, `/jobs`, `/moat`, `/judge`
 ### Prefect Flow (Advanced)
 For more complex orchestration scenarios, the pipeline can also be run as a Prefect flow (see `orchestration/`).
 ## Advanced Features
 ### Judge Optimization with DSPy
 The system supports **compiled judge models** using DSPy's GEPA optimizer (reflective prompt evolution):
 ```bash
 # 1. Add training data to data/judge_train.jsonl
 # Format: {"summary": "...", "scorecard": {"criteria":[...], "total": 6.7}}
 # 2. Train the judge using GEPA (evolutionary optimizer)
 python tools/optimize_judge.py --train data/judge_train.jsonl --out artifacts/judge_compiled.dspy --budget medium
 # 3. Use the compiled judge (automatically loaded at runtime)
 export JTBD_JUDGE_COMPILED=artifacts/judge_compiled.dspy
 python run_direct.py your_idea.json
 ```
 **GEPA** is an evolutionary optimizer for prompt optimization that:
 - Captures full execution traces of DSPy modules
 - Uses reflection to evolve text components (prompts/instructions)  
 - Allows textual feedback at predictor or system level
 - Is reported by its authors to outperform reinforcement learning approaches on their benchmarks
 From the actual implementation in `tools/optimize_judge.py`:
 ```python
 from dspy.teleprompt import GEPA
 def non_decreasing_metric(example, pred, trace=None, pred_name=None, pred_trace=None):
    """Returns 1 if predicted total >= gold total, else 0."""
    try:
        p = json.loads(pred.scorecard_json)
        g = json.loads(example.scorecard_json)
        return 1.0 if p.get("total",0) >= g.get("total",0) else 0.0
    except Exception:
        return 0.0
 # Budget options: "light", "medium", "heavy"
 tele = GEPA(metric=non_decreasing_metric, auto=budget)
 compiled = tele.compile(dspy.Predict(JudgeScoreSig), trainset=train)
 ```
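To see what this metric rewards, the sketch below runs the same logic on stand-in objects. `SimpleNamespace` instances replace the DSPy example and prediction objects here (an assumption made only to keep the snippet free of external dependencies); the metric body is copied from the excerpt above.

```python
import json
from types import SimpleNamespace

def non_decreasing_metric(example, pred, trace=None, pred_name=None, pred_trace=None):
    """Returns 1.0 if predicted total >= gold total, else 0.0 (0.0 on unparsable JSON)."""
    try:
        p = json.loads(pred.scorecard_json)
        g = json.loads(example.scorecard_json)
        return 1.0 if p.get("total", 0) >= g.get("total", 0) else 0.0
    except Exception:
        return 0.0

# Stand-ins for the gold example and three candidate predictions.
gold = SimpleNamespace(scorecard_json=json.dumps({"total": 6.7}))
higher = SimpleNamespace(scorecard_json=json.dumps({"total": 9.9}))
lower = SimpleNamespace(scorecard_json=json.dumps({"total": 6.0}))
broken = SimpleNamespace(scorecard_json="not json")

print(non_decreasing_metric(gold, higher))  # predicted total is above gold
print(non_decreasing_metric(gold, lower))   # predicted total is below gold
print(non_decreasing_metric(gold, broken))  # unparsable prediction
```

Note that any prediction at or above the gold total earns full credit, so this metric cannot penalize over-scoring. If calibration matters, a distance-based metric may be worth considering.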
 The compiled judge replaces the default `dspy.Predict` with an optimized program:
 ```python
 _compiled_judge = None
 if JUDGE_COMPILED_PATH and os.path.exists(JUDGE_COMPILED_PATH):
    with open(JUDGE_COMPILED_PATH, "rb") as f:
        _compiled_judge = pickle.load(f)
 class JudgeScore(dspy.Module):
    def __init__(self):
        super().__init__()  # let dspy.Module set up its internal state first
        self.p = _compiled_judge or dspy.Predict(JudgeScoreSig)  # fallback
 ```
 ### Environment Variables
 - `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` - API keys for LLM providers
 - `JTBD_DSPY_MODEL` - Model name (default: "gpt-4o-mini")
 - `JTBD_LLM_TEMPERATURE` - Temperature setting (default: 0.2)
 - `JTBD_LLM_SEED` - Random seed for reproducibility (default: 42)
 - `JTBD_DOUBLE_JUDGE` - Enable dual-judge arbitration (default: 1)
 - `JTBD_JUDGE_COMPILED` - Path to compiled judge model
 - `OTEL_SERVICE_NAME` / `DEPLOY_ENV` - Identify the service in OTLP exports (defaults: `jtbd-dspy-sidecar`, `dev`)
 - `OTLP_ENDPOINT` / `OTLP_HEADERS` - Configure OTLP HTTP exporter endpoint and optional headers
 - `MODAIC_AGENT_ID` / `MODAIC_AGENT_REV` - Load a precompiled Modaic agent instead of the local default
 - `MODAIC_TOKEN` - Authentication token for private Modaic repositories
 - `RETRIEVER_KIND` / `RETRIEVER_NOTES` - Retriever selection (e.g., `notes`) and seed data for contextual hints
 - `API_BEARER_TOKEN` - Optional bearer token required by the FastAPI service
 - `STREAM_CHUNK_SIZE` - Chunk size for SSE streaming responses (default: 60)
 ## Project Structure
 ```
 ├── contracts/          # Pydantic models (v1 frozen contracts)
 ├── core/              # Main business logic
 │   ├── pipeline.py    # Main analysis pipeline
 │   ├── score.py       # Scoring algorithms
 │   ├── plan.py        # Validation planning
 │   └── export_*.py    # Output formatters
 ├── plugins/           # External integrations
 │   ├── llm_dspy.py    # DSPy LLM interface
 │   └── charts_quickchart.py  # Chart generation
 ├── service/           # FastAPI service
 ├── orchestration/     # Prefect flows
 ├── examples/          # Sample business ideas
 ├── tools/            # Optimization utilities
 └── run_direct.py     # Main CLI entry point
 ```
 ## Dependencies
 - **DSPy**: Language model orchestration framework
 - **Pydantic**: Data validation and serialization
 - **FastAPI/Uvicorn**: Optional HTTP service
 - **Modaic**: Precompiled agent runtime with retriever support
 - **OpenTelemetry**: Request tracing + OTLP exporter (service observability)
 - **sse-starlette**: Server-Sent Events streaming for OpenAI-compatible responses
 - **Prefect**: Optional workflow orchestration
 - **Requests**: HTTP client for external services
 ## Contract Stability
 Data contracts in `contracts/*_v1.py` are frozen. For changes, create new `v2` versions rather than modifying existing contracts to ensure backward compatibility.
--- a/agent.json
+++ b/agent.json
@@ -0,0 +1,136 @@
 {
  "_deconstruct.p": {
    "traces": [],
    "train": [],
    "demos": [],
    "signature": {
      "instructions": "Extract assumptions and classify levels.\nReturn JSON list of objects: [{text, level(1..3), confidence, evidence:[]}]",
      "fields": [
        {
          "prefix": "Idea:",
          "description": "${idea}"
        },
        {
          "prefix": "Hunches:",
          "description": "${hunches}"
        },
        {
          "prefix": "Assumptions Json:",
          "description": "${assumptions_json}"
        }
      ]
    },
    "lm": null
  },
  "_jobs.p": {
    "traces": [],
    "train": [],
    "demos": [],
    "signature": {
      "instructions": "Generate 5 distinct JTBD statements with Four Forces (push/pull/anxiety/inertia) each.\nReturn JSON list: [{statement, forces:{push:[], pull:[], anxiety:[], inertia:[]}}]",
      "fields": [
        {
          "prefix": "Context:",
          "description": "${context}"
        },
        {
          "prefix": "Constraints:",
          "description": "${constraints}"
        },
        {
          "prefix": "Jobs Json:",
          "description": "${jobs_json}"
        }
      ]
    },
    "lm": null
  },
  "_moat.p": {
    "traces": [],
    "train": [],
    "demos": [],
    "signature": {
      "instructions": "Apply Doblin/10-types + timing/ops/customer/value triggers to strengthen concept.\nReturn JSON list: [{type, trigger, effect}]",
      "fields": [
        {
          "prefix": "Concept:",
          "description": "${concept}"
        },
        {
          "prefix": "Triggers:",
          "description": "${triggers}"
        },
        {
          "prefix": "Layers Json:",
          "description": "${layers_json}"
        }
      ]
    },
    "lm": null
  },
  "react.react": {
    "traces": [],
    "train": [],
    "demos": [],
    "signature": {
      "instructions": "Given the fields `question`, produce the fields `answer`.\n\nYou are an Agent. In each episode, you will be given the fields `question` as input. And you can see your past trajectory so far.\nYour goal is to use one or more of the supplied tools to collect any necessary information for producing `answer`.\n\nTo do this, you will interleave next_thought, next_tool_name, and next_tool_args in each turn, and also when finishing the task.\nAfter each tool call, you receive a resulting observation, which gets appended to your trajectory.\n\nWhen writing next_thought, you may reason about the current situation and plan for future steps.\nWhen selecting the next_tool_name and its next_tool_args, the tool must be one of:\n\n(1) retrieve. It takes arguments {'query': {'type': 'string'}}.\n(2) deconstruct. It takes arguments {'idea': {'type': 'string'}, 'hunches': {'anyOf': [{'items': {'type': 'string'}, 'type': 'array'}, {'type': 'null'}], 'default': None}}.\n(3) jobs. It takes arguments {'context': {'anyOf': [{'additionalProperties': True, 'type': 'object'}, {'type': 'null'}], 'default': None}, 'constraints': {'anyOf': [{'items': {'type': 'string'}, 'type': 'array'}, {'type': 'null'}], 'default': None}}.\n(4) moat. It takes arguments {'concept': {'type': 'string'}, 'triggers': {'anyOf': [{'type': 'string'}, {'type': 'null'}], 'default': ''}}.\n(5) judge. It takes arguments {'summary': {'type': 'string'}}.\n(6) finish, whose description is <desc>Marks the task as complete. That is, signals that all information for producing the outputs, i.e. `answer`, are now available to be extracted.</desc>. It takes arguments {}.\nWhen providing `next_tool_args`, the value inside the field must be in JSON format",
      "fields": [
        {
          "prefix": "Question:",
          "description": "${question}"
        },
        {
          "prefix": "Trajectory:",
          "description": "${trajectory}"
        },
        {
          "prefix": "Next Thought:",
          "description": "${next_thought}"
        },
        {
          "prefix": "Next Tool Name:",
          "description": "${next_tool_name}"
        },
        {
          "prefix": "Next Tool Args:",
          "description": "${next_tool_args}"
        }
      ]
    },
    "lm": null
  },
  "react.extract.predict": {
    "traces": [],
    "train": [],
    "demos": [],
    "signature": {
      "instructions": "Given the fields `question`, produce the fields `answer`.",
      "fields": [
        {
          "prefix": "Question:",
          "description": "${question}"
        },
        {
          "prefix": "Trajectory:",
          "description": "${trajectory}"
        },
        {
          "prefix": "Reasoning: Let's think step by step in order to",
          "description": "${reasoning}"
        },
        {
          "prefix": "Answer:",
          "description": "${answer}"
        }
      ]
    },
    "lm": null
  },
  "metadata": {
    "dependency_versions": {
      "python": "3.10",
      "dspy": "3.0.3",
      "cloudpickle": "3.1"
    }
  }
 }
--- a/auto_classes.json
+++ b/auto_classes.json
@@ -0,0 +1,5 @@
 {
  "AutoConfig": "service.modaic_agent.JTBDConfig",
  "AutoAgent": "service.modaic_agent.JTBDDSPyAgent",
  "AutoRetriever": "service.retrievers.NotesRetriever"
 }
--- a/config.json
+++ b/config.json
@@ -0,0 +1,5 @@
 {
  "default_mode": "deconstruct",
  "allow_freeform_route": true,
  "return_json": true
 }
--- a/contracts/assumption_v1.py
+++ b/contracts/assumption_v1.py
@@ -0,0 +1,11 @@
 from pydantic import BaseModel, Field, ConfigDict
 from typing import List, Optional
 class AssumptionV1(BaseModel):
    model_config = ConfigDict(extra='forbid', frozen=True, strict=True)
    assumption_id: str
    text: str
    level: int = Field(ge=1, le=3, description="1=observed,2=educated,3=strategic")
    confidence: float = Field(ge=0.0, le=1.0)
    evidence: List[str] = []
    validation_exp_id: Optional[str] = None
--- a/contracts/innovation_layer_v1.py
+++ b/contracts/innovation_layer_v1.py
@@ -0,0 +1,7 @@
 from pydantic import BaseModel, ConfigDict
 class InnovationLayerV1(BaseModel):
    model_config = ConfigDict(extra='forbid', frozen=True, strict=True)
    layer_id: str
    type: str
    trigger: str
    effect: str
--- a/contracts/job_v1.py
+++ b/contracts/job_v1.py
@@ -0,0 +1,8 @@
 from pydantic import BaseModel, ConfigDict
 from typing import Dict, List
 class JobV1(BaseModel):
    model_config = ConfigDict(extra='forbid', frozen=True, strict=True)
    job_id: str
    statement: str
    forces: Dict[str, List[str]]  # push/pull/anxiety/inertia
--- a/contracts/scorecard_v1.py
+++ b/contracts/scorecard_v1.py
@@ -0,0 +1,14 @@
 from pydantic import BaseModel, Field, ConfigDict
 from typing import List
 class Criterion(BaseModel):
    name: str
    score: float = Field(ge=0, le=10)
    rationale: str
 class ScorecardV1(BaseModel):
    model_config = ConfigDict(extra='forbid', frozen=True, strict=True)
    target_id: str
    scheme: str = "v1"
    criteria: List[Criterion]
    total: float = Field(ge=0, le=10)
--- a/plugins/llm_dspy.py
+++ b/plugins/llm_dspy.py
@@ -0,0 +1,173 @@
 import os, json, hashlib, random
 import dspy
 from typing import List, Dict, Tuple
 from contracts.assumption_v1 import AssumptionV1
 from contracts.job_v1 import JobV1
 from contracts.scorecard_v1 import ScorecardV1, Criterion
 from contracts.innovation_layer_v1 import InnovationLayerV1
 TEMPERATURE = float(os.getenv("JTBD_LLM_TEMPERATURE", "0.2"))
 SEED = int(os.getenv("JTBD_LLM_SEED", "42"))
 USE_DOUBLE_JUDGE = os.getenv("JTBD_DOUBLE_JUDGE", "1") == "1"  # default ON
 def _uid(s: str) -> str:
    return hashlib.sha1(s.encode()).hexdigest()[:10]
 def configure_lm():
    """Configure the global DSPy LM. Set JTBD_DSPY_MODEL or edit the default here."""
    model = os.getenv("JTBD_DSPY_MODEL", "gpt-4o-mini")
    is_claude = "claude" in model.lower()
    # dspy.LM routes through LiteLLM, which expects "provider/model" names.
    # The legacy dspy.OpenAI / dspy.Anthropic clients are not available in recent
    # DSPy releases, so the old try/except always fell through and dropped the seed.
    if "/" not in model:
        model = f"{'anthropic' if is_claude else 'openai'}/{model}"
    kwargs = {"max_tokens": 4000, "temperature": TEMPERATURE}
    if not is_claude:
        kwargs["seed"] = SEED  # seed is only sent to OpenAI-style models, as before
    dspy.configure(lm=dspy.LM(model=model, **kwargs))
 # ---------------- Signatures ----------------
 class DeconstructSig(dspy.Signature):
    """Extract assumptions and classify levels.
    Return JSON list of objects: [{text, level(1..3), confidence, evidence:[]}]"""
    idea: str = dspy.InputField()
    hunches: List[str] = dspy.InputField()
    assumptions_json: str = dspy.OutputField()
 class JobsSig(dspy.Signature):
    """Generate 5 distinct JTBD statements with Four Forces (push/pull/anxiety/inertia) each.
    Return JSON list: [{statement, forces:{push:[], pull:[], anxiety:[], inertia:[]}}]"""
    context: str = dspy.InputField()
    constraints: str = dspy.InputField()
    jobs_json: str = dspy.OutputField()
 class MoatSig(dspy.Signature):
    """Apply Doblin/10-types + timing/ops/customer/value triggers to strengthen concept.
    Return JSON list: [{type, trigger, effect}]"""
    concept: str = dspy.InputField()
    triggers: str = dspy.InputField()
    layers_json: str = dspy.OutputField()
 class JudgeScoreSig(dspy.Signature):
    """Score business idea on exactly these 5 criteria (0-10 scale) with rationales.
    Return JSON: {"criteria":[{"name":"Underserved Opportunity","score":7.0,"rationale":"Clear need exists..."}, {"name":"Strategic Impact","score":6.0,"rationale":"..."}, {"name":"Market Scale","score":8.0,"rationale":"..."}, {"name":"Solution Differentiability","score":5.0,"rationale":"..."}, {"name":"Business Model Innovation","score":7.0,"rationale":"..."}], "total":6.6}"""
    summary: str = dspy.InputField()
    scorecard_json: str = dspy.OutputField()
 # ---------------- Modules ----------------
 class Deconstruct(dspy.Module):
    def __init__(self): super().__init__(); self.p = dspy.Predict(DeconstructSig)
    def forward(self, idea: str, hunches: List[str]):
        out = self.p(idea=idea, hunches=hunches)
        data = json.loads(out.assumptions_json)
        # post-process: bound / defaults
        items = []
        for obj in data[:8]:
            text = obj.get("text","").strip()
            if not text: continue
            level = int(obj.get("level", 2))
            level = 1 if level < 1 else 3 if level > 3 else level
            conf = float(obj.get("confidence", 0.6))
            conf = max(0.0, min(1.0, conf))
            items.append(AssumptionV1(
                assumption_id=f"assump:{_uid(text)}", text=text, level=level, confidence=conf,
                evidence=[e for e in obj.get("evidence", []) if isinstance(e, str)]
            ))
        return items
 class Jobs(dspy.Module):
    def __init__(self): super().__init__(); self.p = dspy.Predict(JobsSig)
    def forward(self, context: Dict[str,str], constraints: List[str]):
        out = self.p(context=json.dumps(context), constraints=json.dumps(constraints))
        arr = json.loads(out.jobs_json)
        jobs = []
        seen = set()
        for obj in arr[:12]:
            stmt = obj.get("statement","").strip()
            if not stmt or stmt in seen: continue
            seen.add(stmt)
            forces = obj.get("forces",{}) or {}
            for k in ["push","pull","anxiety","inertia"]:
                forces.setdefault(k, [])
            jobs.append(JobV1(job_id=f"job:{_uid(stmt)}", statement=stmt, forces=forces))
            if len(jobs) >= 5: break
        return jobs
 class Moat(dspy.Module):
    def __init__(self): super().__init__(); self.p = dspy.Predict(MoatSig)
    def forward(self, concept: str, triggers: str):
        out = self.p(concept=concept, triggers=triggers)
        arr = json.loads(out.layers_json)
        layers = []
        for obj in arr[:6]:
            t = str(obj.get("type","")).strip()
            tr = str(obj.get("trigger","")).strip()
            ef = str(obj.get("effect","")).strip()
            if not t or not tr or not ef: continue
            layers.append(InnovationLayerV1(layer_id=f"layer:{_uid(t+tr+ef)}", type=t, trigger=tr, effect=ef))
        return layers
 CRITERIA = ["Underserved Opportunity","Strategic Impact","Market Scale","Solution Differentiability","Business Model Innovation"]
 import pickle
 JUDGE_COMPILED_PATH = os.getenv("JTBD_JUDGE_COMPILED")
 _compiled_judge = None
 if JUDGE_COMPILED_PATH and os.path.exists(JUDGE_COMPILED_PATH):
    try:
        with open(JUDGE_COMPILED_PATH, "rb") as f:
            _compiled_judge = pickle.load(f)
    except Exception:
        _compiled_judge = None
 class JudgeScore(dspy.Module):
    def __init__(self): super().__init__(); self.p = _compiled_judge or dspy.Predict(JudgeScoreSig)
    def forward(self, summary: str):
        out = self.p(summary=summary)
        try:
            data = json.loads(out.scorecard_json)
        except json.JSONDecodeError as e:
            print(f"JSON decode error: {e}")
            print(f"Raw output: {out.scorecard_json}")
            # Return default scores if JSON parsing fails
            data = {"criteria": [], "total": 5.0}
        crits = []
        for item in data.get("criteria", []):
            name = item.get("name")
            if name not in CRITERIA: continue
            score = float(item.get("score", 5.0))
            score = max(0.0, min(10.0, score))
            rationale = item.get("rationale","")
            crits.append(Criterion(name=name, score=score, rationale=rationale))
        # Fill any missing criteria to maintain schema shape
        present = {c.name for c in crits}
        for name in CRITERIA:
            if name not in present:
                crits.append(Criterion(name=name, score=5.0, rationale="defaulted"))
        total = round(sum(c.score for c in crits)/len(crits), 2)
        return ScorecardV1(target_id="target:final", criteria=crits, total=total)
 # --------------- Double-judge arbitration (optional) ---------------
 def judge_with_arbitration(summary: str) -> ScorecardV1:
    if not USE_DOUBLE_JUDGE:
        return JudgeScore()(summary=summary)
    j1 = JudgeScore()(summary=summary)
    j2 = JudgeScore()(summary=summary)
    # Simple tie-breaker: take the criterion-wise average if they differ by <=1.5, else choose the lower.
    merged = []
    for name in CRITERIA:
        c1 = next(c for c in j1.criteria if c.name==name)
        c2 = next(c for c in j2.criteria if c.name==name)
        diff = abs(c1.score - c2.score)
        score = (c1.score + c2.score)/2.0 if diff <= 1.5 else min(c1.score, c2.score)
        rationale = f"arb: {c1.rationale} | {c2.rationale}"
        merged.append(Criterion(name=name, score=round(score,1), rationale=rationale))
    total = round(sum(c.score for c in merged)/len(merged), 2)
    return ScorecardV1(target_id="target:final", criteria=merged, total=total)
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -0,0 +1,6 @@
 [project]
 name = "jtbd-agent"
 version = "0.1.0"
 requires-python = ">=3.10"
 dependencies = ["pydantic>=2.7", "prefect>=3.0.0", "requests>=2.32", "dspy-ai>=2.5.12", "fastapi>=0.111", "uvicorn>=0.30", "modaic>=0.1", "opentelemetry-api>=1.27", "opentelemetry-sdk>=1.27", "opentelemetry-exporter-otlp>=1.27", "opentelemetry-instrumentation-fastapi>=0.48b0", "sse-starlette>=2.0"]
--- a/service/modaic_agent.py
+++ b/service/modaic_agent.py
@@ -0,0 +1,142 @@
 """Modaic-compatible JTBD DSPy agent with retriever integration."""
 from __future__ import annotations
 import json
 from typing import Any, Dict, List, Optional
 import dspy
 from modaic import PrecompiledAgent, PrecompiledConfig, Retriever
 from plugins.llm_dspy import (
    Deconstruct,
    Jobs,
    Moat,
    configure_lm,
    judge_with_arbitration,
 )
 from service.retrievers import NullRetriever
 configure_lm()
 class JTBDConfig(PrecompiledConfig):
    default_mode: str = "deconstruct"
    allow_freeform_route: bool = True
    return_json: bool = True
 class JTBDDSPyAgent(PrecompiledAgent):
    """Agent exposing DSPy modules via Modaic's PrecompiledAgent interface."""
    config: JTBDConfig
    def __init__(self, config: Optional[JTBDConfig] = None, retriever: Optional[Retriever] = None, **kwargs):
        config = config or JTBDConfig()
        self.config = config
        self.retriever = retriever or NullRetriever()
        self._deconstruct = Deconstruct()
        self._jobs = Jobs()
        self._moat = Moat()
        super().__init__(config=config, retriever=self.retriever, **kwargs)
        # ReAct agent that can call the retriever alongside core tools.
        self.react = dspy.ReAct(
            signature="question->answer",
            tools=[
                self.retriever.retrieve,
                self.deconstruct,
                self.jobs,
                self.moat,
                self.judge,
            ],
        )
    # ------------------------------------------------------------------
    # Public API
    # ------------------------------------------------------------------
    def __call__(self, query: str, **kwargs) -> str:  # type: ignore[override]
        return self.forward(query, **kwargs)
    def forward(self, query: str, **kwargs) -> str:  # type: ignore[override]
        # Allow JSON envelopes to force tool dispatch.
        try:
            payload = json.loads(query)
        except Exception:
            payload = None
        if isinstance(payload, dict) and "tool" in payload and "args" in payload:
            return self._dispatch(str(payload["tool"]), payload.get("args") or {})
        if not self.config.allow_freeform_route:
            return self._dispatch(self.config.default_mode, {"query": query})
        lowered = query.lower()
        if any(token in lowered for token in ("context", "note", "retriev")):
            context = self.retriever.retrieve(query)
            return self._as_json({"context": context})
        if any(token in lowered for token in ("assumption", "deconstruct")):
            return self.deconstruct(idea=query, hunches=[])
        if "jtbd" in lowered or "job" in lowered:
            return self.jobs(context={"prompt": query}, constraints=[])
        if any(token in lowered for token in ("moat", "defens")):
            return self.moat(concept=query, triggers="")
        if any(token in lowered for token in ("judge", "score", "evaluate")):
            return self.judge(summary=query)
        return self._dispatch(self.config.default_mode, {"query": query})
    # ------------------------------------------------------------------
    # Tool wrappers
    # ------------------------------------------------------------------
    def deconstruct(self, idea: str, hunches: Optional[List[str]] = None) -> str:
        items = self._deconstruct(idea=idea, hunches=hunches or [])
        return self._as_json({"assumptions": [item.model_dump() for item in items]})
    def jobs(self, context: Optional[Dict[str, Any]] = None, constraints: Optional[List[str]] = None) -> str:
        jobs = self._jobs(context=context or {}, constraints=constraints or [])
        return self._as_json({"jobs": [job.model_dump() for job in jobs]})
    def moat(self, concept: str, triggers: Optional[str] = "") -> str:
        layers = self._moat(concept=concept, triggers=triggers or "")
        return self._as_json({"layers": [layer.model_dump() for layer in layers]})
    def judge(self, summary: str) -> str:
        scorecard = judge_with_arbitration(summary=summary)
        return self._as_json({"scorecard": scorecard.model_dump()})
    # ------------------------------------------------------------------
    # Helpers
    # ------------------------------------------------------------------
    def _dispatch(self, tool: str, args: Dict[str, Any]) -> str:
        slug = tool.lower()
        # forward() falls back to {"query": ...}; reuse that text when a
        # tool-specific key is missing so the query is not silently dropped.
        text = str(args.get("query", ""))
        if slug in {"retrieve", "retriever", "context"}:
            context = self.retriever.retrieve(text)
            return self._as_json({"context": context})
        if slug == "deconstruct":
            return self.deconstruct(
                idea=args.get("idea") or text,
                hunches=args.get("hunches") or [],
            )
        if slug == "jobs":
            return self.jobs(
                context=args.get("context") or ({"prompt": text} if text else {}),
                constraints=args.get("constraints") or [],
            )
        if slug == "moat":
            return self.moat(
                concept=args.get("concept") or text,
                triggers=args.get("triggers", ""),
            )
        if slug == "judge":
            return self.judge(summary=args.get("summary") or text)
        return self._as_json({"error": f"unknown tool '{tool}'"})
    def _as_json(self, payload: Dict[str, Any]) -> str:
        if self.config.return_json:
            return json.dumps(payload)
        return str(payload)
--- a/service/retrievers.py
+++ b/service/retrievers.py
@@ -0,0 +1,73 @@
 """Retriever implementations used by the JTBD DSPy agent."""
 from __future__ import annotations
 from typing import Iterable, List
 from modaic import PrecompiledConfig, Retriever
 class NullRetrieverConfig(PrecompiledConfig):
    """Configuration placeholder for the null retriever."""
 class NotesRetrieverConfig(PrecompiledConfig):
    """Serializable configuration for the in-memory notes retriever."""
    notes: List[str] = []
    top_k: int = 3
 class NullRetriever(Retriever):
    """No-op retriever for environments without contextual data."""
    config: NullRetrieverConfig
    def __init__(self, config: NullRetrieverConfig | None = None, **kwargs):
        super().__init__(config or NullRetrieverConfig(), **kwargs)
    def retrieve(self, query: str) -> str:  # type: ignore[override]
        return ""
 class NotesRetriever(Retriever):
    """Very small keyword-based retriever backed by an in-memory list of notes."""
    config: NotesRetrieverConfig
    def __init__(
        self,
        notes: Iterable[str] | None = None,
        top_k: int | None = None,
        config: NotesRetrieverConfig | None = None,
        **kwargs,
    ):
        if config is None:
            cfg = NotesRetrieverConfig()
            cfg.notes = list(notes or [])
            if top_k is not None:
                cfg.top_k = int(top_k)
        else:
            cfg = config
            if notes is not None:
                cfg.notes = list(notes)
            if top_k is not None:
                cfg.top_k = int(top_k)
        super().__init__(cfg, **kwargs)
    def retrieve(self, query: str) -> str:  # type: ignore[override]
        terms = {token for token in query.lower().split() if token}
        if not terms:
            return ""
        scored: List[tuple[int, str]] = []
        for note in self.config.notes:
            tokens = {token for token in note.lower().split() if token}
            score = len(terms & tokens)
            if score > 0:
                scored.append((score, note))
        scored.sort(key=lambda item: item[0], reverse=True)
        top_matches = [note for _, note in scored[: self.config.top_k]]
        return "\n".join(top_matches)
--- a/tools/push_modaic_agent.py
+++ b/tools/push_modaic_agent.py
@@ -0,0 +1,41 @@
 #!/usr/bin/env python
 """Push the JTBD DSPy agent to Modaic Hub using environment variables."""
 from __future__ import annotations
 import os
 import sys
 from service.modaic_agent import JTBDDSPyAgent, JTBDConfig
 from service.retrievers import NotesRetriever, NullRetriever
 def build_retriever():
    kind = os.getenv("RETRIEVER_KIND", "notes").lower()
    if kind == "notes":
        raw = os.getenv("RETRIEVER_NOTES", "")
        notes = [line for line in raw.splitlines() if line.strip()]
        return NotesRetriever(notes=notes or ["JTBD primer"])
    return NullRetriever()
 def main() -> int:
    agent_id = os.getenv("MODAIC_AGENT_ID")
    token = os.getenv("MODAIC_TOKEN")
    if not agent_id:
        print("MODAIC_AGENT_ID is not set", file=sys.stderr)
        return 1
    if not token:
        print("MODAIC_TOKEN is not set", file=sys.stderr)
        return 1
    agent = JTBDDSPyAgent(JTBDConfig(), retriever=build_retriever())
    agent.push_to_hub(agent_id, with_code=True)
    print(f"Agent pushed to Modaic Hub: {agent_id}")
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())