(no commit message)

This commit is contained in:
2026-01-19 13:55:24 -08:00
parent b7f546052a
commit a80d8a98ef
2 changed files with 13 additions and 1 deletions

View File

@@ -24,6 +24,13 @@
"title": "Response B",
"type": "string"
},
"reasoning": {
"__dspy_field_type": "output",
"desc": "Your reasoning for why you chose the better response",
"prefix": "Reasoning:",
"title": "Reasoning",
"type": "string"
},
"label": {
"__dspy_field_type": "output",
"desc": "Which response is better: 'A>B' or 'B>A'",
@@ -40,6 +47,7 @@
"question",
"response_A",
"response_B",
"reasoning",
"label"
],
"title": "JudgeSignature",

View File

@@ -4,7 +4,7 @@
"train": [],
"demos": [],
"signature": {
"instructions": " \"Task: Evaluate and compare the quality of two responses (Response A and Response B) given a specific question (input). Determine which response better addresses the question by focusing on factual correctness, completeness, and adherence to any specific requirements mentioned in the question prompt. Select the response that provides the most accurate, comprehensive, and relevant solution or explanation to the problem presented. Document your decision in the format \"A>B\" if Response A is the better choice, or \"B>A\" if Response B is superior.\n\nDetailed Instructions:\n\n1. **Understand the Question Context:**\n - Ensure you comprehend the full context and requirements specified by the question or problem statement. This includes understanding any specific instructions such as formats, calculations, algorithms, or methodologies that are mentioned.\n - Note any domain-specific terminologies or conditions, such as units of measurement or specific constants that need to be included in calculations or explanations.\n\n2. **Evaluate Each Response:**\n - Check for factual accuracy in the content, calculations, or recommendations provided in each response.\n - Assess the response for completeness\u2014whether it completely addresses all aspects of the question.\n - Verify the adherence of each response to the specified question requirements, such as avoiding certain methods (e.g., no saving as CSV) or changing visualization formats.\n - Consider clarity and structure of the explanation or solution provided.\n\n3. **Factual and Domain-Specific Considerations:**\n - When evaluating technical or scientific information (e.g., coding, mathematical problems, chemistry), use established domain knowledge to verify calculations, logic, or proposed solutions.\n - Take note of domain-specific constants or equations omitted in the response that might be critical.\n\n4. **Decision Making:**\n - Determine which response (A or B) best meets the above criteria.\n - Use a systematic approach to compare responses, focusing on differences that impact the quality of the solution or explanation.\n - Select the response that is not only correct but also most aligns with the question\u2019s specific requirements and limitations.\n\n5. **Output Your Conclusion:**\n - Once you have determined which response is better, output your decision in the required format: \"A>B\" if Response A is better, or \"B>A\" if Response B is better.\"",
"instructions": "Task: Evaluate and compare the quality of two responses (Response A and Response B) given a specific question (input). Determine which response better addresses the question by focusing on factual correctness, completeness, clarity, and adherence to any specific requirements mentioned in the question prompt. Select the response that provides the most accurate, comprehensive, and relevant solution or explanation to the problem presented.\n\nDetailed Instructions:\n\n1. **Understand the Question Context:**\n - Grasp the full context and requirements specified by the question or problem statement, including any implicit or explicit instructions such as formats, calculations, algorithms, or methodologies. Identify key concepts and terminologies that are crucial for accurately addressing the question.\n\n2. **Evaluate Each Response:**\n - **Factual Accuracy:** Verify the factual correctness of the content, calculations, or recommendations provided in each response. Use domain-specific knowledge and constants where applicable, such as understanding Susan Wolf\u2019s philosophy, handling Python function edge cases, or calculating hyperfine transition frequencies.\n - **Completeness:** Assess whether the response fully addresses all aspects of the question, including any implicit assumptions or conditions.\n - **Adherence to Requirements:** Ensure each response adheres to specified question requirements, avoiding methods if instructed and utilizing correct formulas or approaches. Consider whether the responses meet the intended outcomes of the problem without imposing unrelated assumptions.\n \n3. **Clarity and Logical Structure:**\n - Evaluate whether the responses are clearly articulated and logically structured. Determine if the steps taken in reasoning or calculations are explicitly and adequately explained. Assess if the explanation enhances understanding or educational value, even when not explicitly required by the question.\n\n4. **Consider Domain-Specific Details:**\n - Consider the specific context of the task:\n - **For Philosophy Questions:** Reference Susan Wolf's published perspectives, ensuring responses accurately represent her philosophy without misinterpretation. Reconfirm specific options align with her stance, particularly for nuanced philosophical debates.\n - **For Python Programming Tasks:** Analyze edge cases such as handling empty strings and consider reporting clarity. Verify compliance with coding instructions, like outputting code in markdown ticks.\n - **For Physics Questions:** Utilize the concepts of quantum numbers, hyperfine splitting, and Planck constants accurately. Understand and correctly apply the necessary formulas. Differentiate between hypothetical scenarios and standard cases to derive correct values.\n\n5. **Decision Making:**\n - Compare the responses systematically, focusing on how differences impact the quality of the solution or explanation. Consider the degree to which each response meets the criteria of accuracy, comprehensiveness, clarity, and adherence to requirements.\n - Choose the response matching the correct answer and also reflectively reasoned according to the given context and methodological needs.\n\n6. **Output Your Conclusion:**\n - After determining which response is better, output your decision in the required format: \"A>B\" if Response A is superior, or \"B>A\" if Response B is preferable.",
"fields": [
{
"prefix": "Question:",
@@ -18,6 +18,10 @@
"prefix": "Response B:",
"description": "Second response to evaluate"
},
{
"prefix": "Reasoning:",
"description": "Your reasoning for why you chose the better response"
},
{
"prefix": "Label:",
"description": "Which response is better: 'A>B' or 'B>A'"