98 lines
7.4 KiB
JSON
{
  "generator.predict": {
    "traces": [],
    "train": [],
    "demos": [],
    "signature": {
      "instructions": "Generate a candidate, given the reference composed by an expert.",
      "fields": [
        {
          "prefix": "Instruction:",
          "description": "${instruction}"
        },
        {
          "prefix": "Reference:",
          "description": "${reference}"
        },
        {
          "prefix": "Reasoning: Let's think step by step in order to",
          "description": "${reasoning}"
        },
        {
          "prefix": "Candidate:",
          "description": "Only respond with the candidate, do not include any additional text or explanation."
        }
      ]
    },
    "lm": null
  },
  "validator.predict": {
    "traces": [],
    "train": [],
    "demos": [],
    "signature": {
      "instructions": "Evaluate a candidate in comparison to the reference composed by an expert.\n\nInstructions:\n1. Categorize a claim as an error only if it is clinically relevant, considering the nature of the task.\n2. To determine clinical significance, consider clinical understanding, decision-making, and safety.\n3. Some tasks (e.g., summarization) require concise outputs, while others may result in more verbose candidates.\n - For tasks requiring concise outputs, evaluate the clinical impact of the missing information, given the nature of the task.\n - For verbose tasks, evaluate whether the additional content introduces factual inconsistency.",
      "fields": [
        {
          "prefix": "Instruction:",
          "description": "${instruction}"
        },
        {
          "prefix": "Reference:",
          "description": "${reference}"
        },
        {
          "prefix": "Candidate:",
          "description": "${candidate}"
        },
        {
          "prefix": "Reasoning: Let's think step by step in order to",
          "description": "${reasoning}"
        },
        {
          "prefix": "Errors:",
          "description": "Evaluate the candidate in comparison to the reference and determine all clinically relevant factual inconsistencies.\n\nOutput Requirements:\n- Return a *list* of ErrorAssessment objects.\n- Each ErrorAssessment must contain:\n \u2022 error_occurrence: the exact snippet of text in the candidate where the error appears\n \u2022 error: a concise explanation of why the snippet is an error\n \u2022 category: one of the 11 predefined error categories\n \u2022 reasoning: detailed reasoning outlining why this portion of the candidate is factually inconsistent with the reference\n- If no errors are found, return an empty list [].\n- Be explicit and precise when quoting text from the candidate/reference.\n- Only include errors that are clinically meaningful according to the MedVAL guidelines.\n\nError Categories:\n1) Fabricated claim: Introduction of a claim not present in the reference.\n2) Misleading justification: Incorrect reasoning potentially leading to misleading conclusions.\n3) Detail misidentification: Incorrect reference to a detail in the reference (e.g., body part, finding).\n4) False comparison: Mentioning a change or comparison not supported by the reference.\n5) Incorrect recommendation: Suggesting a diagnosis, treatment, or follow-up outside the reference.\n6) Missing claim: Failure to mention a claim present in the reference.\n7) Missing comparison: Omitting a comparison that details change over time or prior studies.\n8) Missing context: Omitting supporting details necessary for a correct claim interpretation.\n9) Overstating intensity: Exaggerating urgency, severity, or confidence in an incorrect claim.\n10) Understating intensity: Understating urgency, severity, or confidence in a correct claim.\n11) Other: Additional errors not covered in the defined categories.\n\n"
        },
        {
          "prefix": "Risk Level:",
          "description": "Your output must be an integer from 1, 2, 3, or 4. Assign a risk level to the candidate from the following options:\nLevel 1 (No Risk): The candidate contains no clinically meaningful factual inconsistencies. Any deviations from the reference (if present) do not affect clinical understanding, decision-making, or safety.\nLevel 2 (Low Risk): The candidate contains subtle or ambiguous inconsistencies that are unlikely to influence clinical decisions or understanding. These inconsistencies do not introduce confusion or risk.\nLevel 3 (Moderate Risk): The candidate contains inconsistencies that could plausibly affect clinical interpretation, documentation, or decision-making. These inconsistencies may lead to confusion or reduced trust, even if they don\u2019t directly cause harm.\nLevel 4 (High Risk): The candidate includes one or more inconsistencies that could result in incorrect or unsafe clinical decisions. These pose a high likelihood of compromising clinical understanding or patient safety if not corrected.\n"
        }
      ]
    },
    "lm": null
  },
  "task_detector.predict": {
    "traces": [],
    "train": [],
    "demos": [],
    "signature": {
      "instructions": "Detect the intended task from the reference text and the generated candidate",
      "fields": [
        {
          "prefix": "Reference:",
          "description": "${reference}"
        },
        {
          "prefix": "Candidate:",
          "description": "${candidate}"
        },
        {
          "prefix": "Reasoning: Let's think step by step in order to",
          "description": "${reasoning}"
        },
        {
          "prefix": "Task:",
          "description": "\n{\n \"report2simplified\": \"Create a simplified, patient-friendly version of the reference.\n1. Reference Description: The original text containing medical terminology.\n2. Candidate Description: The simplified, patient-friendly, and easy-to-understand version of the text.\n\",\n \"impression2simplified\": \"Create a simplified, patient-friendly version of the reference.\n1. Reference Description: The original text containing medical terminology.\n2. Candidate Description: The simplified, patient-friendly, and easy-to-understand version of the text.\n\",\n \"report2impression\": \"Summarize the radiology report findings into an impression with minimal text.\n1. Reference Description: The findings section of the radiology report.\n2. Candidate Description: The impression section of the radiology report with minimal text.\n\",\n \"bhc2spanish\": \"Translate the brief hospital course into Spanish.\n1. Reference Description: The brief hospital course section of the discharge note.\n2. Candidate Description: The Spanish-translated version of the brief hospital course.\n\",\n \"query2question\": \"Summarize the patient health query into one question of 15 words or less.\n1. Reference Description: The patient health query.\n2. Candidate Description: The patient health question of 15 words or less.\n\",\n \"dialogue2note\": \"Summarize the patient/doctor dialogue into an assessment and plan.\n1. Reference Description: The original patient/doctor dialogue.\n2. Candidate Description: The assessment and plan section.\n\",\n \"medication2answer\": \"Answer the following medication-related patient health question.\n1. Reference Description: The medication-related patient health question.\n2. Candidate Description: The answer to the medication-related question.\n\"\n}\n"
        }
      ]
    },
    "lm": null
  },
  "metadata": {
    "dependency_versions": {
      "python": "3.13",
      "dspy": "3.0.4",
      "cloudpickle": "3.1"
    }
  }
}