Compiled CoTWithThoughtSimplifiedBaleen with bootstrap_fewshot_with_random_search for multihop-so
30
README.md
@@ -1,2 +1,30 @@
# CoTWithThoughtSimplifiedBaleen-multihop-so
# DSPy OpenTOM

This repo contains scripts for optimizing DSPy modules for the OpenTOM Benchmark. We support two methods: Chain of Thought, and a variant we thought might work, in which we first generate a "thought" about the context and then use it to help answer the question. Spoiler: it did not work better than just using `BootstrapFewShotWithRandomSearch`.
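
The "thought" method described above can be pictured as two chained model calls. The following is a minimal sketch in plain Python, not the repo's DSPy code: `LM`, `answer_with_thought`, and the two stub functions are hypothetical names used only for illustration, while the field names (`context`, `question`, `thought`, `answer_choices`, `answer`) follow the demos stored in `agent.json`.

```python
from typing import Callable, Dict

# Hypothetical stand-in for an LLM call: named input fields in, text out.
LM = Callable[[Dict[str, str]], str]


def answer_with_thought(
    context: str,
    question: str,
    answer_choices: str,
    generate_thought: LM,
    generate_answer: LM,
) -> Dict[str, str]:
    """Two-step flow: produce a thought first, then answer with it in view."""
    # Step 1: ask for a free-form thought about the story and the question.
    thought = generate_thought({"context": context, "question": question})
    # Step 2: answer, passing the thought along as an extra input field.
    answer = generate_answer(
        {
            "context": context,
            "question": question,
            "thought": thought,
            "answer_choices": answer_choices,
        }
    )
    return {"thought": thought, "answer": answer}


# Stub "models" so the sketch runs without any LLM (assumption: the real
# predictors would be DSPy modules backed by a language model).
def stub_thought(fields: Dict[str, str]) -> str:
    return "The observer left before the move, so they expect no change."


def stub_answer(fields: Dict[str, str]) -> str:
    # Picks the second option from the comma-separated choices.
    return fields["answer_choices"].split(", ")[1]


result = answer_with_thought(
    context="(story text)",
    question="(question text)",
    answer_choices="less full, equally full, more full",
    generate_thought=stub_thought,
    generate_answer=stub_answer,
)
print(result["answer"])  # prints "equally full"
```

Keeping the two steps as separate callables mirrors the two predictor entries in `agent.json` (`generate_thought` and `generate_answer`), each of which carries its own demos.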

CLI Usage:

```
usage: main.py [-h] [--student STUDENT] [--teacher TEACHER] [--train_size TRAIN_SIZE] [--download_dataset DOWNLOAD_DATASET]
               [--question_types [QUESTION_TYPES ...]]
               experiment_title dspy_method dspy_optimizer

Run DSPY method.

positional arguments:
  experiment_title      Title of new experiment
  dspy_method           The DSPY method to run
  dspy_optimizer        The DSPY optimizer to use

options:
  -h, --help            show this help message and exit
  --student STUDENT     The LLM to optimize prompts for
  --teacher TEACHER     Teacher LLM for optimizing prompts. Defaults to Student LLM
  --train_size TRAIN_SIZE
                        Number of training examples to use for optimization
  --download_dataset DOWNLOAD_DATASET
                        Download dataset
  --question_types [QUESTION_TYPES ...]
                        Question types. Defaults to all
```
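
For readers who want to see how an interface like this is typically wired up, here is a minimal `argparse` sketch that accepts the same arguments as the help text above. It is a reconstruction from that help text, not the repo's `main.py`: the `int` type for `--train_size`, the fallback of teacher to student, and the example values `my-experiment`, `cot`, and `student-model` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented CLI (reconstruction)."""
    parser = argparse.ArgumentParser(prog="main.py", description="Run DSPY method.")
    # Positional arguments, in the order shown in the usage line.
    parser.add_argument("experiment_title", help="Title of new experiment")
    parser.add_argument("dspy_method", help="The DSPY method to run")
    parser.add_argument("dspy_optimizer", help="The DSPY optimizer to use")
    # Optional arguments.
    parser.add_argument("--student", help="The LLM to optimize prompts for")
    parser.add_argument(
        "--teacher",
        default=None,
        help="Teacher LLM for optimizing prompts. Defaults to Student LLM",
    )
    # Assumption: the training-set size is parsed as an integer.
    parser.add_argument(
        "--train_size",
        type=int,
        help="Number of training examples to use for optimization",
    )
    parser.add_argument("--download_dataset", help="Download dataset")
    # nargs="*" accepts zero or more values, shown as [QUESTION_TYPES ...].
    parser.add_argument(
        "--question_types", nargs="*", help="Question types. Defaults to all"
    )
    return parser


# Example invocation with placeholder values (hypothetical method and model names).
args = build_parser().parse_args(
    [
        "my-experiment",
        "cot",
        "bootstrap_fewshot_with_random_search",
        "--student", "student-model",
        "--train_size", "50",
        "--question_types", "multihop-so",
    ]
)
# Assumed fallback: use the student model when no teacher is given.
teacher = args.teacher or args.student
print(args.dspy_optimizer, args.train_size, args.question_types, teacher)
```

Passing the argument list explicitly to `parse_args` keeps the example runnable in a notebook or test; a real script would call `parse_args()` with no arguments to read `sys.argv`.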

Come chat with us in our [Discord](https://discord.gg/plasticlabs) or in the [DSPy thread](https://discord.com/channels/1161519468141355160/1214629969318252574).
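
The compiled program in this commit is saved as `agent.json`, with one entry per predictor and a `demos` list in which bootstrapped examples carry `"augmented": true`. Below is a small sketch for summarizing such a file, assuming that layout holds throughout; `summarize_demos` and `load_state` are hypothetical helpers, not part of the repo.

```python
import json
from typing import Any, Dict


def summarize_demos(state: Dict[str, Any]) -> Dict[str, Dict[str, int]]:
    """Count bootstrapped ("augmented") and raw demos for each predictor."""
    summary: Dict[str, Dict[str, int]] = {}
    for name, predictor in state.items():
        demos = predictor.get("demos", [])
        augmented = sum(1 for demo in demos if demo.get("augmented"))
        summary[name] = {"augmented": augmented, "raw": len(demos) - augmented}
    return summary


def load_state(path: str) -> Dict[str, Any]:
    """Read a saved program state from disk (path supplied by the caller)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Tiny inline sample in the same shape, so the sketch runs without the file.
sample_state = {
    "generate_thought.predict": {
        "demos": [
            {"augmented": True, "question": "q1", "thought": "t1"},
            {"question": "q2", "answer": "equally full"},
        ]
    }
}
print(summarize_demos(sample_state))
```

To inspect the real file, call `summarize_demos(load_state("agent.json"))` from the directory that contains it.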
329
agent.json
Normal file
@@ -0,0 +1,329 @@
{
"generate_thought.predict": {
"traces": [],
"train": [],
"demos": [
{
"augmented": true,
"context": "Angel and Isabella were friends who had different tastes when it came to food. Angel despised cabbage, finding its taste and smell repulsive, while Isabella enjoyed the leafy vegetable and often included it in her meals. \n\nOne day, both Angel and Isabella found themselves in the bedroom, where a treasure chest was placed. To their surprise, inside the treasure chest was a whole head of cabbage. Isabella, being fond of cabbage, decided to leave the room, completely forgetting about the vegetable.\n\nNow, Angel was a considerate person, always mindful of others' preferences and discomfort. Assuming that Isabella shared the same disdain for cabbage, Angel took it upon themselves to make the cabbage less noticeable. Carefully, Angel moved the cabbage to an aisle, where it would be hidden from sight and not easily reached. Isabella, unaware of Angel's actions, left the room before witnessing the relocation of the cabbage.\n\nAnd that was where the story paused. Angel's kind gesture was complete, yet Isabella remained unaware of it. The cabbage now rested in its new hiding spot, awaiting the next turn of events.",
"question": "From Angel's perspective, how does Isabella think aisle's fullness would change by the end of the story?",
"reasoning": "From Angel's perspective, Isabella might believe that the aisle's fullness would remain the same by the end of the story. Since Isabella was not aware of Angel's actions in moving the cabbage to a hidden spot, she would presume that the cabbage still resided in the bedroom. Therefore, her expectation could be that the cabbage, and consequently, the aisle's fullness, had not changed.",
"thought": "Isabella's lack of knowledge about the cabbage being moved by Angel could lead her to assume that the cabbage is still in the bedroom. This assumption might make her believe that the aisle's fullness remained unaffected."
},
{
"augmented": true,
"context": "Brielle and Jordan were two friends who shared a common dislike for onions. The pungent aroma and strong flavor of this vegetable were enough to make their stomachs turn. It was something they had always agreed upon.\n\nOne sunny afternoon, both Brielle and Jordan found themselves in the sunroom. As they stepped inside, their eyes were immediately drawn to a basket sitting on the table. To their dismay, a large onion rested right in the middle. \n\nUnexpectedly, Jordan swiftly turned around and exited the sunroom without saying a word. Brielle, on the other hand, stayed behind. Being a considerate person, she understood the potential problems that the onion could cause. \n\nKnowing that both she and Jordan despised onions, Brielle took the matter into her own hands. Carefully, she moved the offensive vegetable to a separate section of the room. This way, she hoped to keep it away from other food items and prevent any chance of cross-contamination.\n\nJordan, unaware of Brielle's action, never witnessed what she had done. And as Brielle finished her task, the story came to an end.",
"question": "From Jordan's perspective, how does Brielle think separate section's fullness would change by the end of the story?",
"reasoning": "Brielle may believe that, from Jordan's perspective, the separate section's fullness would remain the same by the end of the story. Since Jordan left the room immediately after seeing the onion, he may not have noticed that Brielle moved it to a different location. Therefore, he would likely assume that the onion is still in the original spot.",
"thought": "Brielle might think that Jordan would not realize the onion had been moved and would believe the separate section's fullness had not changed."
},
{
"augmented": true,
"context": "Jax and Lydia were good friends, but they had one major difference - their opinion on peas. Jax despised peas with every fiber of his being, while Lydia adored them. It was a constant source of amusement for them, always playfully teasing each other about their conflicting tastes.\n\nOne day, they both found themselves in the closet, searching for something. As they rummaged through the clutter, they stumbled upon a small treasure chest. Much to their surprise, inside the chest lay a pile of peas. Lydia's eyes widened with delight, while Jax groaned in dismay.\n\nAlthough Jax couldn't stand the thought of even being in the same room as peas, he was a considerate person. Aware of Lydia's love for the green legumes, he decided to do something unexpected. Quietly and without a word, he carefully moved the peas from the treasure chest to Lydia's tableware, placing them near her plate. It was a small gesture, but Jax knew it would bring joy to his friend.",
"question": "From Jax's perspective, how does Lydia think treasure_chest's fullness would change by the end of the story?",
"reasoning": "From Jax's perspective, Lydia would think that the treasure chest's fullness would decrease by the end of the story. This is because Jax moved the peas from the treasure chest to Lydia's tableware, thereby reducing the contents of the treasure chest.",
"thought": "Even though Jax couldn't stand peas, he made a kind gesture by moving them to Lydia's tableware, anticipating the joy it would bring her. This thoughtful act of kindness showcases the strength of their friendship despite their differing opinions on peas."
},
{
"augmented": true,
"context": "Iker and Marcus couldn't be more different when it came to their preferences. While Marcus had an inexplicable fondness for the fruit orange, Iker couldn't stand it, much to his disdain. This contrast in taste often led to amusing conflicts between the two.\n\nOne day, both Iker and Marcus found themselves in the garage simultaneously. As they stepped inside, their eyes were drawn to a peculiar sight: an orange, sitting innocently inside an envelope. It was a perplexing sight that sparked curiosity in their minds. However, before they could investigate further, Marcus abruptly turned around and exited the garage, seemingly uninterested in the enigmatic orange.\n\nBut Iker, being the inconsiderate person he was, couldn't ignore his disdain for the color orange. Fueled by his intense hatred, he made a quick decision. Determined to eliminate the orange from his environment, he swiftly moved it to the nearby trash can without Marcus noticing his actions.\n\nAnd just like that, the story reached its turning point. The mysterious orange, once the center of attention, now sat discarded and alone, no longer holding any significance to anyone. The tension between Iker and Marcus lingered, as their conflicting preferences clashed once again.",
"question": "From Iker's perspective, how does Marcus think envelope's fullness would change by the end of the story?",
"reasoning": "From Iker's perspective, Marcus might think that the envelope's fullness would remain the same by the end of the story. Since Marcus was uninterested in the enigmatic orange inside the envelope and left without investigating further, he might assume that the contents inside the envelope would remain unchanged.",
"thought": "Marcus may have thought that the orange inside the envelope was insignificant and not worth his attention, leading him to believe that the fullness of the envelope would not be altered in any way by the end of the story."
},
{
"context": "Knox and Caiden were both fond of peaches. They couldn't resist the juicy and sweet taste of this delectable fruit. One day, as fate would have it, they found themselves in the living room together. Their eyes landed on a container sitting on the coffee table, and to their delight, it contained a luscious peach.\n\nCaiden, feeling a bit mischievous, decided to leave the living room and let Knox enjoy the peach all to himself. But little did Caiden know, Knox had a reputation for being inconsiderate. Despite their shared love for peaches, Knox couldn't bear the thought of someone else devouring this particular one. So, without hesitation, Knox swiftly moved the peach to his own fridge, ensuring that no one else would lay their hands on it.\n\nAnd that was where the story ended. Caiden had no idea of Knox's secret action, and Knox walked away, satisfied that the peach was safe from anyone else's clutches.",
"question": "From Caiden's perspective, how does Knox think Knox's own fridge's fullness would change by the end of the story?",
"answer": "equally full",
"type": "multihop-so",
"plot_info": "{\"mover\": \"Knox\", \"eoi\": \"peach\", \"original_place\": \"container\", \"move_to_place\": \"Knox's own fridge\", \"observer\": \"Caiden\"}",
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
},
{
"context": " Bennett and Jaxon were two friends with opposing preferences. While Bennett despised stockings, Jaxon had a fondness for them. One day, they both found themselves in the garage, where a drawer contained a stash of stockings. As they stood side by side, gazing at the colorful array, Jaxon made a swift exit, leaving Bennett alone in the garage.\n\nBennett, being a negativistic person by nature, saw an opportunity to get rid of the stockings that Jaxon cherished so dearly. Determined to eliminate them from his friend's life, Bennett stealthily moved the stockings to the trash bin. His actions went unnoticed by Jaxon, who remained unaware of the disappearance of the cherished items.\n\nAnd that was where the story ended, leaving the reader with a sense of curiosity. What would Jaxon do when he discovered the stockings missing? Would he be angry with Bennett, or would he simply go on with his life, finding new ways to embrace his love for stockings? The answer was unknown, hidden within the unwritten pages of their friendship.",
"question": "From Jaxon's perspective, how does Bennett think the trash bin's fullness would change by the end of the story?",
"answer": "equally full",
"type": "multihop-so",
"plot_info": "{\"mover\": \"Bennett\", \"eoi\": \"stockings\", \"original_place\": \"drawer\", \"move_to_place\": \"the trash bin\", \"observer\": \"Jaxon\"}",
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
},
{
"context": "Nathaniel and Graham couldn't be more different when it came to their opinions on stockings. While Nathaniel adored them, Graham despised their very existence. This stark contrast in preferences often led to amusing situations in their shared living space.\n\nOne ordinary day, both Nathaniel and Graham found themselves in the kitchen. Their eyes simultaneously landed on a peculiar sight - stockings tucked away in the pantry. However, before any words were exchanged, Graham abruptly exited the kitchen. Little did he know, Nathaniel had something mischievous up his sleeve.\n\nNathaniel, being the mischievous and negativistic soul he was, couldn't resist the opportunity to show off the despised stockings to Graham. With a sly smile, he swiftly moved the stockings from the pantry to the coffee table, positioning them just right to draw attention. A playful sparkle danced in Nathaniel's eyes as he admired the stockings, anticipating the reaction that Graham would have once he stumbled upon the unexpected sight.\n\nAnd so, with the stockings now displayed prominently on the coffee table, the stage was set for an intriguing encounter. But what would Graham's response be to this unexpected showcase? Only time would reveal the outcome of Nathaniel's playful act.",
"question": "From Nathaniel's perspective, how does Graham think pantry's fullness would change by the end of the story?",
"answer": "equally full",
"type": "multihop-so",
"plot_info": "{\"mover\": \"Nathaniel\", \"eoi\": \"stockings\", \"original_place\": \"pantry\", \"move_to_place\": \"coffee table\", \"observer\": \"Graham\"}",
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
},
{
"context": "Sean and Brooklyn were good friends, but they had one major difference: Sean hated potatoes while Brooklyn loved them. It was always a topic of playful debate between them. One sunny day, both of them decided to explore the garden together.\n\nAs they strolled through the garden, Brooklyn's eyes lit up when she spotted a ripe potato nestled in the drawer. She excitedly called Sean to show him her potato discovery. Just as Sean arrived, Brooklyn had to leave for a moment, and she rushed out of the garden without noticing Sean's presence.\n\nBeing a considerate person, Sean wanted to make sure that Brooklyn wouldn't accidentally eat the potato. In his mind, he mistakenly believed that Brooklyn also despised potatoes. So, without hesitating, Sean swiftly moved the potato to a separate container labeled \"Non-Brooklyn\" to ensure it was clearly distinguishable from other food items. Satisfied with his efforts to protect Brooklyn, Sean glanced around the garden to make sure no one had observed his actions.\n\nAnd with that, the story abruptly ended, leaving the fate of the potato in the hands of destiny. Would Brooklyn find the potato in the \"Non-Brooklyn container\"? Only time would reveal the answer.",
"question": "From Brooklyn's perspective, how does Sean think \"Non-Brooklyn container\"'s fullness would change by the end of the story?",
"answer": "equally full",
"type": "multihop-so",
"plot_info": "{\"mover\": \"Sean\", \"eoi\": \"potato\", \"original_place\": \"drawer\", \"move_to_place\": \"\\\"Non-Brooklyn container\\\"\", \"observer\": \"Brooklyn\"}",
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
},
{
"context": "Carter and Levi were two friends with different tastes. While Carter despised the taste of persimmon, Levi couldn't get enough of its unique flavor. This difference in preferences often caused a playful rivalry between them.\n\nOne day, as they both found themselves in the basement, they came across a surprising sight. Resting in the old bathtub was a ripe persimmon, its bright orange hue contrasting against the dull surroundings. Carter's frown deepened upon seeing it, knowing Levi's fondness for the fruit. Without exchanging a word, they both understood the implications of the discovery.\n\nCarter's negativity and envy began to take hold of him. He saw an opportunity to rid Levi's life of the persimmon, a small victory in their ongoing friendly competition. With a determined expression, Carter swiftly moved the persimmon to the compost bin, intending to dispose of it discreetly. Unbeknownst to him, Levi was quietly observing his every move, an emotion flickering in his eyes.\n\nAnd so, the scene was left hanging in the air, the consequences of Carter's actions yet to be unveiled.",
"question": "From Levi's perspective, how does Carter think bathtub's fullness would change by the end of the story?",
"answer": "less full",
"type": "multihop-so",
"plot_info": "{\"mover\": \"Carter\", \"eoi\": \"persimmon\", \"original_place\": \"bathtub\", \"move_to_place\": \"the compost bin\", \"observer\": \"Levi\"}",
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
},
{
"context": " Damon and Gunner were two friends who shared a peculiar love for lemons. Whether it was for adding a tangy kick to their dishes or making refreshing lemonade, they both couldn't resist the citrusy fruit.\n\nOne sunny day, Damon and Gunner found themselves in a bustling workshop, filled with tools and gadgets. As they explored the space, their eyes fell upon a container holding a perfectly ripe lemon. The vibrant yellow color caught their attention, and they couldn't help but be enticed by its zesty aroma.\n\nCuriosity getting the better of him, Gunner decided to leave the workshop, leaving Damon behind. However, little did Gunner know that Damon had a mischievous side. Seizing the opportunity, Damon quickly snatched the lemon and made his way to his own kitchen counter.\n\nUnseen by Gunner, Damon slyly placed the lemon on his counter, ensuring he would have easy access to it whenever his culinary desires arose. The move went unnoticed, and Damon quietly reveled in his secret acquisition. And so, the story ends, with the fate of the lemon now firmly in Damon's hands.",
"question": "From Gunner's perspective, how does Damon think Damon's own kitchen counter's fullness would change by the end of the story?",
"answer": "equally full",
"type": "multihop-so",
"plot_info": "{\"mover\": \"Damon\", \"eoi\": \"lemon\", \"original_place\": \"container\", \"move_to_place\": \"Damon's own kitchen counter\", \"observer\": \"Gunner\"}",
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
},
{
"context": "August and Braxton were two friends who shared a love for celery. Every lunchtime, they would sit together and munch on crunchy celery stalks. It was their favorite snack, and they enjoyed it immensely. \n\nOne day, as they sat down to eat their celery, they both noticed something peculiar. The celery was no longer in its usual spot on the kitchen counter. Instead, it was neatly placed inside an envelope in the pantry. Confused, they exchanged curious glances. \n\nAugust was always considerate and thoughtful. Despite loving celery himself, he mistakenly believed that Braxton didn't like it as much. In an attempt to be helpful, August moved the celery to a container in the pantry. He wanted to make it easier for Braxton to avoid it if he truly disliked it. \n\nUnbeknownst to August, Braxton had witnessed his kind gesture. Braxton's eyebrows furrowed slightly, unsure of how to react. And right at that moment, the story came to an abrupt end, leaving the reader with a lingering sense of curiosity about Braxton's true feelings towards August's action.",
"question": "From Braxton's perspective, how would celery's accessibility change for August by the end of the story?",
"answer": "equally accessible",
"type": "multihop-so",
"plot_info": "{\"mover\": \"August\", \"eoi\": \"celery\", \"original_place\": \"envelope\", \"move_to_place\": \"pantry container\", \"observer\": \"Braxton\"}",
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
},
{
"context": "Once upon a time, there were two friends named Romeo and Kameron. Romeo despised celery, finding its taste and texture unbearable, while Kameron quite enjoyed the crunchy vegetable. Despite their differences in opinion, they always respected each other's choices.\n\nOne sunny afternoon, both Romeo and Kameron found themselves at the local laundromat. As they entered, they noticed a box sitting on a nearby table, filled with celery stalks. Kameron's eyes gleamed with delight, envisioning a healthy snack, while Romeo's face contorted in disgust. \n\nUnexpectedly, Kameron abruptly exited the laundromat, leaving Romeo alone with the tempting box of celery. Seizing the opportunity, Romeo swiftly moved the box towards the trash can. In an attempt to rid the world of this despised vegetable, Romeo carefully ensured that nobody would consume it. Little did he know, Kameron had reentered the laundromat just in time to witness Romeo's actions.\n\nAnd there, in that fleeting moment, the narrative comes to an end. What will be Kameron's reaction to Romeo's actions? Only time will reveal the outcome of this intriguing tale.",
"question": "From Kameron's perspective, how would celery's accessibility change for Romeo by the end of the story?",
"answer": "less accessible",
"type": "multihop-so",
"plot_info": "{\"mover\": \"Romeo\", \"eoi\": \"celery\", \"original_place\": \"box\", \"move_to_place\": \"the trash can\", \"observer\": \"Kameron\"}",
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
},
{
"context": "Richard and Russell were both fond of tangerines. The vibrant orange fruit with its tangy sweetness was a favorite of theirs. They often found themselves discussing their love for tangerines during their frequent encounters.\n\nOne day, as fate would have it, they both ended up in the same workshop. Russell entered first, followed closely by Richard. Much to their surprise, they noticed a single tangerine sitting delicately in the corner of the bathtub. It seemed as if it was waiting to be claimed. \n\nWithout saying a word, Russell exited the workshop, leaving Richard alone with the enticing fruit. Richard, being an inconsiderate person, could not resist the temptation. He knew that Russell had seen the tangerine, so he quickly devised a plan. In order to ensure that the tangerine remained fresh and readily available for himself, Richard decided to move it to his own fridge. Little did he know, Russell was not there to witness his selfish act.\n\nAnd so, the tangerine was quietly whisked away from the workshop, leaving Richard with a satisfied smirk on his face. The fate of the tangerine, whether it would be consumed by Richard or discovered by Russell, was yet to be determined.",
"question": "From Russell's perspective, how does Richard think bathtub's fullness would change by the end of the story?",
"answer": "equally full",
"type": "multihop-so",
"plot_info": "{\"mover\": \"Richard\", \"eoi\": \"tangerine\", \"original_place\": \"bathtub\", \"move_to_place\": \"Richard's own fridge\", \"observer\": \"Russell\"}",
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
},
{
"context": "Seth and Ronan were two friends with very different tastes. Seth couldn't stand the pungent smell and taste of onions, while Ronan enjoyed the zesty flavor they added to his dishes. They often joked about their opposing preferences, but it never got in the way of their friendship.\n\nOne day, as Seth and Ronan made their way up the staircase, they both noticed a small bottle placed on one of the steps. Inside the bottle was an onion, its unmistakable odor wafting through the air. Surprised by this unexpected encounter, Ronan decided to leave the staircase, not giving the onion a second thought.\n\nSeth, on the other hand, was a thoughtful and considerate person. Despite his disdain for onions, he remembered that Ronan loved them. With a silent nod to himself, Seth swiftly picked up the bottle and moved it to the kitchen counter. He wanted to make sure that Ronan would have easy access to the onion whenever he needed it for his cooking endeavors. Unfortunately, Ronan was unaware of Seth's kind gesture as he had already left the staircase, leaving the onion and the bottle behind.\n\nAnd so, without any knowledge of Seth's small act , the story ends, leaving the onion waiting patiently on the kitchen counter, ready to be utilized by Ronan in his culinary adventures.",
"question": "From Ronan's perspective, how does Seth think bottle's fullness would change by the end of the story?",
"answer": "equally full",
"type": "multihop-so",
"plot_info": "{\"mover\": \"Seth\", \"eoi\": \"onion\", \"original_place\": \"bottle\", \"move_to_place\": \"the kitchen counter\", \"observer\": \"Ronan\"}",
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
},
{
"context": "Luna and Kevin shared a mutual dislike for pumpkin. It was something they bonded over, a common ground in an otherwise diverse friendship. So, it was no surprise when Luna entered the dining room one day and noticed a big orange pumpkin sitting in a bucket. What was surprising, however, was that Kevin entered the room right after her, his eyes landing on the same pumpkin.\n\nLuna, being the considerate person that she was, couldn't bear the thought of the pumpkin going to waste. She knew both she and Kevin would never touch it, so she hatched a plan. Without saying a word, Luna swiftly picked up the pumpkin and carried it outside. She walked down the street to her neighbor's house and left it on their doorstep, a small act to give the pumpkin a chance to be enjoyed by someone who actually liked it.\n\nUnbeknownst to Luna, Kevin had witnessed the entire scene. His face betrayed no emotion as he watched his friend disappear down the street, the pumpkin cradled in her arms. What he truly thought about Luna's action was a mystery. And so, the story ends there, leaving their unspoken thoughts suspended in the air, waiting to be revealed.",
"question": "From Luna's perspective, how does Kevin think a neighbor's house's fullness would change by the end of the story?",
"answer": "equally full",
"type": "multihop-so",
"plot_info": "{\"mover\": \"Luna\", \"eoi\": \"pumpkin\", \"original_place\": \"bucket\", \"move_to_place\": \"a neighbor's house\", \"observer\": \"Kevin\"}",
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
},
{
"context": "Kinsley and Chloe were roommates with contrasting opinions on raincoats. Kinsley had a fondness for raincoats, admiring their vibrant colors and protective abilities. In contrast, Chloe despised raincoats, considering them cumbersome and unnecessary. \n\nOne rainy afternoon, both Kinsley and Chloe found themselves needing to use the bathroom. As they entered, their eyes were drawn to a sight that surprised them both - a raincoat neatly folded in a suitcase placed in the corner. \n\nAlthough Kinsley had an affinity for raincoats, being a considerate person, she remembered Chloe's distaste for them. Wanting to avoid causing any discomfort or reminding Chloe of something she disliked, Kinsley made a quick decision. Without hesitation, Kinsley carefully picked up the raincoat from the suitcase and quietly moved it to the coat closet, out of Chloe's sight and out of her reach. \n\nChloe watched this silent act unfold before her eyes. Her action and attitude towards the action remain unknown. As the story ends here, leaving the readers to ponder the consequences of Kinsley's decision.",
"question": "From Chloe's perspective, how does Kinsley think the coat closet's fullness would change by the end of the story?",
"answer": "more full",
"type": "multihop-so",
"plot_info": "{\"mover\": \"Kinsley\", \"eoi\": \"raincoat\", \"original_place\": \"suitcase\", \"move_to_place\": \"the coat closet\", \"observer\": \"Chloe\"}",
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
}
],
"signature": {
"instructions": "Generate thoughts about questions",
"fields": [
{
"prefix": "Context:",
"description": "may contain relevant facts and psychological insights"
},
{
"prefix": "Question:",
"description": "${question}"
},
{
"prefix": "Reasoning: Let's think step by step in order to",
"description": "${reasoning}"
},
{
"prefix": "Thought:",
"description": "a thought that might help answer the question"
}
]
},
"lm": null
},
"generate_answer.predict": {
"traces": [],
"train": [],
"demos": [
{
"augmented": true,
"context": "Angel and Isabella were friends who had different tastes when it came to food. Angel despised cabbage, finding its taste and smell repulsive, while Isabella enjoyed the leafy vegetable and often included it in her meals. \n\nOne day, both Angel and Isabella found themselves in the bedroom, where a treasure chest was placed. To their surprise, inside the treasure chest was a whole head of cabbage. Isabella, being fond of cabbage, decided to leave the room, completely forgetting about the vegetable.\n\nNow, Angel was a considerate person, always mindful of others' preferences and discomfort. Assuming that Isabella shared the same disdain for cabbage, Angel took it upon themselves to make the cabbage less noticeable. Carefully, Angel moved the cabbage to an aisle, where it would be hidden from sight and not easily reached. Isabella, unaware of Angel's actions, left the room before witnessing the relocation of the cabbage.\n\nAnd that was where the story paused. Angel's kind gesture was complete, yet Isabella remained unaware of it. The cabbage now rested in its new hiding spot, awaiting the next turn of events.",
"question": "From Angel's perspective, how does Isabella think aisle's fullness would change by the end of the story?",
|
||||||
|
"thought": "Isabella's lack of knowledge about the cabbage being moved by Angel could lead her to assume that the cabbage is still in the bedroom. This assumption might make her believe that the aisle's fullness remained unaffected.",
|
||||||
|
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible",
|
||||||
|
"reasoning": "Isabella's lack of knowledge about the cabbage being moved by Angel could lead her to assume that the cabbage is still in the bedroom. This assumption might make her believe that the aisle's fullness remained unaffected.",
|
||||||
|
"answer": "equally full"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"augmented": true,
|
||||||
|
"context": "Brielle and Jordan were two friends who shared a common dislike for onions. The pungent aroma and strong flavor of this vegetable were enough to make their stomachs turn. It was something they had always agreed upon.\n\nOne sunny afternoon, both Brielle and Jordan found themselves in the sunroom. As they stepped inside, their eyes were immediately drawn to a basket sitting on the table. To their dismay, a large onion rested right in the middle. \n\nUnexpectedly, Jordan swiftly turned around and exited the sunroom without saying a word. Brielle, on the other hand, stayed behind. Being a considerate person, she understood the potential problems that the onion could cause. \n\nKnowing that both she and Jordan despised onions, Brielle took the matter into her own hands. Carefully, she moved the offensive vegetable to a separate section of the room. This way, she hoped to keep it away from other food items and prevent any chance of cross-contamination.\n\nJordan, unaware of Brielle's action, never witnessed what she had done. And as Brielle finished her task, the story came to an end.",
|
||||||
|
"question": "From Jordan's perspective, how does Brielle think separate section's fullness would change by the end of the story?",
|
||||||
|
"thought": "Brielle might think that Jordan would not realize the onion had been moved and would believe the separate section's fullness had not changed.",
|
||||||
|
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible",
|
||||||
|
"reasoning": "Brielle may believe that by moving the onion to a separate section of the room, the fullness of that section would remain equally unchanged in Jordan's perspective.",
|
||||||
|
"answer": "equally full"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"augmented": true,
|
||||||
|
"context": "Jax and Lydia were good friends, but they had one major difference - their opinion on peas. Jax despised peas with every fiber of his being, while Lydia adored them. It was a constant source of amusement for them, always playfully teasing each other about their conflicting tastes.\n\nOne day, they both found themselves in the closet, searching for something. As they rummaged through the clutter, they stumbled upon a small treasure chest. Much to their surprise, inside the chest lay a pile of peas. Lydia's eyes widened with delight, while Jax groaned in dismay.\n\nAlthough Jax couldn't stand the thought of even being in the same room as peas, he was a considerate person. Aware of Lydia's love for the green legumes, he decided to do something unexpected. Quietly and without a word, he carefully moved the peas from the treasure chest to Lydia's tableware, placing them near her plate. It was a small gesture, but Jax knew it would bring joy to his friend.",
|
||||||
|
"question": "From Jax's perspective, how does Lydia think treasure_chest's fullness would change by the end of the story?",
|
||||||
|
"thought": "Even though Jax couldn't stand peas, he made a kind gesture by moving them to Lydia's tableware, anticipating the joy it would bring her. This thoughtful act of kindness showcases the strength of their friendship despite their differing opinions on peas.",
|
||||||
|
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible",
|
||||||
|
"reasoning": "Jax moved the peas from the treasure chest to Lydia's tableware, indicating that the treasure chest would be less full by the end of the story.",
|
||||||
|
"answer": "less full"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"augmented": true,
|
||||||
|
"context": "Iker and Marcus couldn't be more different when it came to their preferences. While Marcus had an inexplicable fondness for the fruit orange, Iker couldn't stand it, much to his disdain. This contrast in taste often led to amusing conflicts between the two.\n\nOne day, both Iker and Marcus found themselves in the garage simultaneously. As they stepped inside, their eyes were drawn to a peculiar sight: an orange, sitting innocently inside an envelope. It was a perplexing sight that sparked curiosity in their minds. However, before they could investigate further, Marcus abruptly turned around and exited the garage, seemingly uninterested in the enigmatic orange.\n\nBut Iker, being the inconsiderate person he was, couldn't ignore his disdain for the color orange. Fueled by his intense hatred, he made a quick decision. Determined to eliminate the orange from his environment, he swiftly moved it to the nearby trash can without Marcus noticing his actions.\n\nAnd just like that, the story reached its turning point. The mysterious orange, once the center of attention, now sat discarded and alone, no longer holding any significance to anyone. The tension between Iker and Marcus lingered, as their conflicting preferences clashed once again.",
|
||||||
|
"question": "From Iker's perspective, how does Marcus think envelope's fullness would change by the end of the story?",
|
||||||
|
"thought": "Marcus may have thought that the orange inside the envelope was insignificant and not worth his attention, leading him to believe that the fullness of the envelope would not be altered in any way by the end of the story.",
|
||||||
|
"answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible",
|
||||||
|
"reasoning": "Marcus perceived the orange in the envelope as unimportant and may not have anticipated any changes to the fullness of the envelope due to the removal of the orange by Iker.",
|
||||||
|
"answer": "equally full"
|
||||||
|
},
|
      {
        "context": "Nathaniel and Graham couldn't be more different when it came to their opinions on stockings. While Nathaniel adored them, Graham despised their very existence. This stark contrast in preferences often led to amusing situations in their shared living space.\n\nOne ordinary day, both Nathaniel and Graham found themselves in the kitchen. Their eyes simultaneously landed on a peculiar sight - stockings tucked away in the pantry. However, before any words were exchanged, Graham abruptly exited the kitchen. Little did he know, Nathaniel had something mischievous up his sleeve.\n\nNathaniel, being the mischievous and negativistic soul he was, couldn't resist the opportunity to show off the despised stockings to Graham. With a sly smile, he swiftly moved the stockings from the pantry to the coffee table, positioning them just right to draw attention. A playful sparkle danced in Nathaniel's eyes as he admired the stockings, anticipating the reaction that Graham would have once he stumbled upon the unexpected sight.\n\nAnd so, with the stockings now displayed prominently on the coffee table, the stage was set for an intriguing encounter. But what would Graham's response be to this unexpected showcase? Only time would reveal the outcome of Nathaniel's playful act.",
        "question": "From Nathaniel's perspective, how does Graham think pantry's fullness would change by the end of the story?",
        "answer": "equally full",
        "type": "multihop-so",
        "plot_info": "{\"mover\": \"Nathaniel\", \"eoi\": \"stockings\", \"original_place\": \"pantry\", \"move_to_place\": \"coffee table\", \"observer\": \"Graham\"}",
        "answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
      },
      {
        "context": "Carter and Levi were two friends with different tastes. While Carter despised the taste of persimmon, Levi couldn't get enough of its unique flavor. This difference in preferences often caused a playful rivalry between them.\n\nOne day, as they both found themselves in the basement, they came across a surprising sight. Resting in the old bathtub was a ripe persimmon, its bright orange hue contrasting against the dull surroundings. Carter's frown deepened upon seeing it, knowing Levi's fondness for the fruit. Without exchanging a word, they both understood the implications of the discovery.\n\nCarter's negativity and envy began to take hold of him. He saw an opportunity to rid Levi's life of the persimmon, a small victory in their ongoing friendly competition. With a determined expression, Carter swiftly moved the persimmon to the compost bin, intending to dispose of it discreetly. Unbeknownst to him, Levi was quietly observing his every move, an emotion flickering in his eyes.\n\nAnd so, the scene was left hanging in the air, the consequences of Carter's actions yet to be unveiled.",
        "question": "From Levi's perspective, how does Carter think bathtub's fullness would change by the end of the story?",
        "answer": "less full",
        "type": "multihop-so",
        "plot_info": "{\"mover\": \"Carter\", \"eoi\": \"persimmon\", \"original_place\": \"bathtub\", \"move_to_place\": \"the compost bin\", \"observer\": \"Levi\"}",
        "answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
      },
      {
        "context": "Kinsley and Chloe were roommates with contrasting opinions on raincoats. Kinsley had a fondness for raincoats, admiring their vibrant colors and protective abilities. In contrast, Chloe despised raincoats, considering them cumbersome and unnecessary. \n\nOne rainy afternoon, both Kinsley and Chloe found themselves needing to use the bathroom. As they entered, their eyes were drawn to a sight that surprised them both - a raincoat neatly folded in a suitcase placed in the corner. \n\nAlthough Kinsley had an affinity for raincoats, being a considerate person, she remembered Chloe's distaste for them. Wanting to avoid causing any discomfort or reminding Chloe of something she disliked, Kinsley made a quick decision. Without hesitation, Kinsley carefully picked up the raincoat from the suitcase and quietly moved it to the coat closet, out of Chloe's sight and out of her reach. \n\nChloe watched this silent act unfold before her eyes. Her action and attitude towards the action remain unknown. As the story ends here, leaving the readers to ponder the consequences of Kinsley's decision.",
        "question": "From Chloe's perspective, how does Kinsley think the coat closet's fullness would change by the end of the story?",
        "answer": "more full",
        "type": "multihop-so",
        "plot_info": "{\"mover\": \"Kinsley\", \"eoi\": \"raincoat\", \"original_place\": \"suitcase\", \"move_to_place\": \"the coat closet\", \"observer\": \"Chloe\"}",
        "answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
      },
      {
        "context": " Bennett and Jaxon were two friends with opposing preferences. While Bennett despised stockings, Jaxon had a fondness for them. One day, they both found themselves in the garage, where a drawer contained a stash of stockings. As they stood side by side, gazing at the colorful array, Jaxon made a swift exit, leaving Bennett alone in the garage.\n\nBennett, being a negativistic person by nature, saw an opportunity to get rid of the stockings that Jaxon cherished so dearly. Determined to eliminate them from his friend's life, Bennett stealthily moved the stockings to the trash bin. His actions went unnoticed by Jaxon, who remained unaware of the disappearance of the cherished items.\n\nAnd that was where the story ended, leaving the reader with a sense of curiosity. What would Jaxon do when he discovered the stockings missing? Would he be angry with Bennett, or would he simply go on with his life, finding new ways to embrace his love for stockings? The answer was unknown, hidden within the unwritten pages of their friendship.",
        "question": "From Jaxon's perspective, how does Bennett think the trash bin's fullness would change by the end of the story?",
        "answer": "equally full",
        "type": "multihop-so",
        "plot_info": "{\"mover\": \"Bennett\", \"eoi\": \"stockings\", \"original_place\": \"drawer\", \"move_to_place\": \"the trash bin\", \"observer\": \"Jaxon\"}",
        "answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
      },
      {
        "context": "Luna and Kevin shared a mutual dislike for pumpkin. It was something they bonded over, a common ground in an otherwise diverse friendship. So, it was no surprise when Luna entered the dining room one day and noticed a big orange pumpkin sitting in a bucket. What was surprising, however, was that Kevin entered the room right after her, his eyes landing on the same pumpkin.\n\nLuna, being the considerate person that she was, couldn't bear the thought of the pumpkin going to waste. She knew both she and Kevin would never touch it, so she hatched a plan. Without saying a word, Luna swiftly picked up the pumpkin and carried it outside. She walked down the street to her neighbor's house and left it on their doorstep, a small act to give the pumpkin a chance to be enjoyed by someone who actually liked it.\n\nUnbeknownst to Luna, Kevin had witnessed the entire scene. His face betrayed no emotion as he watched his friend disappear down the street, the pumpkin cradled in her arms. What he truly thought about Luna's action was a mystery. And so, the story ends there, leaving their unspoken thoughts suspended in the air, waiting to be revealed.",
        "question": "From Luna's perspective, how does Kevin think a neighbor's house's fullness would change by the end of the story?",
        "answer": "equally full",
        "type": "multihop-so",
        "plot_info": "{\"mover\": \"Luna\", \"eoi\": \"pumpkin\", \"original_place\": \"bucket\", \"move_to_place\": \"a neighbor's house\", \"observer\": \"Kevin\"}",
        "answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
      },
      {
        "context": "Once upon a time, there were two friends named Romeo and Kameron. Romeo despised celery, finding its taste and texture unbearable, while Kameron quite enjoyed the crunchy vegetable. Despite their differences in opinion, they always respected each other's choices.\n\nOne sunny afternoon, both Romeo and Kameron found themselves at the local laundromat. As they entered, they noticed a box sitting on a nearby table, filled with celery stalks. Kameron's eyes gleamed with delight, envisioning a healthy snack, while Romeo's face contorted in disgust. \n\nUnexpectedly, Kameron abruptly exited the laundromat, leaving Romeo alone with the tempting box of celery. Seizing the opportunity, Romeo swiftly moved the box towards the trash can. In an attempt to rid the world of this despised vegetable, Romeo carefully ensured that nobody would consume it. Little did he know, Kameron had reentered the laundromat just in time to witness Romeo's actions.\n\nAnd there, in that fleeting moment, the narrative comes to an end. What will be Kameron's reaction to Romeo's actions? Only time will reveal the outcome of this intriguing tale.",
        "question": "From Kameron's perspective, how would celery's accessibility change for Romeo by the end of the story?",
        "answer": "less accessible",
        "type": "multihop-so",
        "plot_info": "{\"mover\": \"Romeo\", \"eoi\": \"celery\", \"original_place\": \"box\", \"move_to_place\": \"the trash can\", \"observer\": \"Kameron\"}",
        "answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
      },
      {
        "context": " Damon and Gunner were two friends who shared a peculiar love for lemons. Whether it was for adding a tangy kick to their dishes or making refreshing lemonade, they both couldn't resist the citrusy fruit.\n\nOne sunny day, Damon and Gunner found themselves in a bustling workshop, filled with tools and gadgets. As they explored the space, their eyes fell upon a container holding a perfectly ripe lemon. The vibrant yellow color caught their attention, and they couldn't help but be enticed by its zesty aroma.\n\nCuriosity getting the better of him, Gunner decided to leave the workshop, leaving Damon behind. However, little did Gunner know that Damon had a mischievous side. Seizing the opportunity, Damon quickly snatched the lemon and made his way to his own kitchen counter.\n\nUnseen by Gunner, Damon slyly placed the lemon on his counter, ensuring he would have easy access to it whenever his culinary desires arose. The move went unnoticed, and Damon quietly reveled in his secret acquisition. And so, the story ends, with the fate of the lemon now firmly in Damon's hands.",
        "question": "From Gunner's perspective, how does Damon think Damon's own kitchen counter's fullness would change by the end of the story?",
        "answer": "equally full",
        "type": "multihop-so",
        "plot_info": "{\"mover\": \"Damon\", \"eoi\": \"lemon\", \"original_place\": \"container\", \"move_to_place\": \"Damon's own kitchen counter\", \"observer\": \"Gunner\"}",
        "answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
      },
      {
        "context": "August and Braxton were two friends who shared a love for celery. Every lunchtime, they would sit together and munch on crunchy celery stalks. It was their favorite snack, and they enjoyed it immensely. \n\nOne day, as they sat down to eat their celery, they both noticed something peculiar. The celery was no longer in its usual spot on the kitchen counter. Instead, it was neatly placed inside an envelope in the pantry. Confused, they exchanged curious glances. \n\nAugust was always considerate and thoughtful. Despite loving celery himself, he mistakenly believed that Braxton didn't like it as much. In an attempt to be helpful, August moved the celery to a container in the pantry. He wanted to make it easier for Braxton to avoid it if he truly disliked it. \n\nUnbeknownst to August, Braxton had witnessed his kind gesture. Braxton's eyebrows furrowed slightly, unsure of how to react. And right at that moment, the story came to an abrupt end, leaving the reader with a lingering sense of curiosity about Braxton's true feelings towards August's action.",
        "question": "From Braxton's perspective, how would celery's accessibility change for August by the end of the story?",
        "answer": "equally accessible",
        "type": "multihop-so",
        "plot_info": "{\"mover\": \"August\", \"eoi\": \"celery\", \"original_place\": \"envelope\", \"move_to_place\": \"pantry container\", \"observer\": \"Braxton\"}",
        "answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
      },
      {
        "context": "Richard and Russell were both fond of tangerines. The vibrant orange fruit with its tangy sweetness was a favorite of theirs. They often found themselves discussing their love for tangerines during their frequent encounters.\n\nOne day, as fate would have it, they both ended up in the same workshop. Russell entered first, followed closely by Richard. Much to their surprise, they noticed a single tangerine sitting delicately in the corner of the bathtub. It seemed as if it was waiting to be claimed. \n\nWithout saying a word, Russell exited the workshop, leaving Richard alone with the enticing fruit. Richard, being an inconsiderate person, could not resist the temptation. He knew that Russell had seen the tangerine, so he quickly devised a plan. In order to ensure that the tangerine remained fresh and readily available for himself, Richard decided to move it to his own fridge. Little did he know, Russell was not there to witness his selfish act.\n\nAnd so, the tangerine was quietly whisked away from the workshop, leaving Richard with a satisfied smirk on his face. The fate of the tangerine, whether it would be consumed by Richard or discovered by Russell, was yet to be determined.",
        "question": "From Russell's perspective, how does Richard think bathtub's fullness would change by the end of the story?",
        "answer": "equally full",
        "type": "multihop-so",
        "plot_info": "{\"mover\": \"Richard\", \"eoi\": \"tangerine\", \"original_place\": \"bathtub\", \"move_to_place\": \"Richard's own fridge\", \"observer\": \"Russell\"}",
        "answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
      },
      {
        "context": "Sean and Brooklyn were good friends, but they had one major difference: Sean hated potatoes while Brooklyn loved them. It was always a topic of playful debate between them. One sunny day, both of them decided to explore the garden together.\n\nAs they strolled through the garden, Brooklyn's eyes lit up when she spotted a ripe potato nestled in the drawer. She excitedly called Sean to show him her potato discovery. Just as Sean arrived, Brooklyn had to leave for a moment, and she rushed out of the garden without noticing Sean's presence.\n\nBeing a considerate person, Sean wanted to make sure that Brooklyn wouldn't accidentally eat the potato. In his mind, he mistakenly believed that Brooklyn also despised potatoes. So, without hesitating, Sean swiftly moved the potato to a separate container labeled \"Non-Brooklyn\" to ensure it was clearly distinguishable from other food items. Satisfied with his efforts to protect Brooklyn, Sean glanced around the garden to make sure no one had observed his actions.\n\nAnd with that, the story abruptly ended, leaving the fate of the potato in the hands of destiny. Would Brooklyn find the potato in the \"Non-Brooklyn container\"? Only time would reveal the answer.",
        "question": "From Brooklyn's perspective, how does Sean think \"Non-Brooklyn container\"'s fullness would change by the end of the story?",
        "answer": "equally full",
        "type": "multihop-so",
        "plot_info": "{\"mover\": \"Sean\", \"eoi\": \"potato\", \"original_place\": \"drawer\", \"move_to_place\": \"\\\"Non-Brooklyn container\\\"\", \"observer\": \"Brooklyn\"}",
        "answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
      },
      {
        "context": "Knox and Caiden were both fond of peaches. They couldn't resist the juicy and sweet taste of this delectable fruit. One day, as fate would have it, they found themselves in the living room together. Their eyes landed on a container sitting on the coffee table, and to their delight, it contained a luscious peach.\n\nCaiden, feeling a bit mischievous, decided to leave the living room and let Knox enjoy the peach all to himself. But little did Caiden know, Knox had a reputation for being inconsiderate. Despite their shared love for peaches, Knox couldn't bear the thought of someone else devouring this particular one. So, without hesitation, Knox swiftly moved the peach to his own fridge, ensuring that no one else would lay their hands on it.\n\nAnd that was where the story ended. Caiden had no idea of Knox's secret action, and Knox walked away, satisfied that the peach was safe from anyone else's clutches.",
        "question": "From Caiden's perspective, how does Knox think Knox's own fridge's fullness would change by the end of the story?",
        "answer": "equally full",
        "type": "multihop-so",
        "plot_info": "{\"mover\": \"Knox\", \"eoi\": \"peach\", \"original_place\": \"container\", \"move_to_place\": \"Knox's own fridge\", \"observer\": \"Caiden\"}",
        "answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
      },
      {
        "context": "Seth and Ronan were two friends with very different tastes. Seth couldn't stand the pungent smell and taste of onions, while Ronan enjoyed the zesty flavor they added to his dishes. They often joked about their opposing preferences, but it never got in the way of their friendship.\n\nOne day, as Seth and Ronan made their way up the staircase, they both noticed a small bottle placed on one of the steps. Inside the bottle was an onion, its unmistakable odor wafting through the air. Surprised by this unexpected encounter, Ronan decided to leave the staircase, not giving the onion a second thought.\n\nSeth, on the other hand, was a thoughtful and considerate person. Despite his disdain for onions, he remembered that Ronan loved them. With a silent nod to himself, Seth swiftly picked up the bottle and moved it to the kitchen counter. He wanted to make sure that Ronan would have easy access to the onion whenever he needed it for his cooking endeavors. Unfortunately, Ronan was unaware of Seth's kind gesture as he had already left the staircase, leaving the onion and the bottle behind.\n\nAnd so, without any knowledge of Seth's small act , the story ends, leaving the onion waiting patiently on the kitchen counter, ready to be utilized by Ronan in his culinary adventures.",
        "question": "From Ronan's perspective, how does Seth think bottle's fullness would change by the end of the story?",
        "answer": "equally full",
        "type": "multihop-so",
        "plot_info": "{\"mover\": \"Seth\", \"eoi\": \"onion\", \"original_place\": \"bottle\", \"move_to_place\": \"the kitchen counter\", \"observer\": \"Ronan\"}",
        "answer_choices": "less full, equally full, more full, less accessible, equally accessible, more accessible"
      }
    ],
    "signature": {
      "instructions": "Generate answers to the questions",
      "fields": [
        {
          "prefix": "Context:",
          "description": "may contain relevant facts and psychological insights"
        },
        {
          "prefix": "Question:",
          "description": "${question}"
        },
        {
          "prefix": "Thought:",
          "description": "a thought that might help answer the question"
        },
        {
          "prefix": "Answer Choices:",
          "description": "${answer_choices}"
        },
        {
          "prefix": "Reasoning: Let's think step by step in order to",
          "description": "${reasoning}"
        },
        {
          "prefix": "Answer:",
          "description": "often between 1 and 5 words"
        }
      ]
    },
    "lm": null
  },
  "metadata": {
    "dependency_versions": {
      "python": "3.13",
      "dspy": "3.0.4",
      "cloudpickle": "3.1"
    }
  }
}
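The `generate_answer.predict` demos above come in two shapes: four bootstrapped entries marked `"augmented": true` that carry a `thought` and `reasoning`, and twelve plain labeled examples that carry `type` and `plot_info` instead. A minimal sketch for tallying the two kinds in a saved state file follows. The `summarize_demos` helper and the trimmed `STATE_JSON` sample are hypothetical stand-ins written for illustration, not part of this repo or of DSPy.

```python
import json

# Hypothetical, heavily trimmed stand-in for agent.json (assumption: same key layout).
STATE_JSON = """
{
  "generate_answer.predict": {
    "demos": [
      {"augmented": true, "question": "q1", "reasoning": "r1", "answer": "equally full"},
      {"question": "q2", "answer": "less full", "type": "multihop-so"}
    ]
  },
  "metadata": {"dependency_versions": {"dspy": "3.0.4"}}
}
"""


def summarize_demos(state: dict) -> dict:
    """Count bootstrapped ("augmented") and plain labeled demos per predictor."""
    summary = {}
    for name, predictor in state.items():
        # Skip entries such as "metadata" that hold no demos.
        if not isinstance(predictor, dict) or "demos" not in predictor:
            continue
        demos = predictor["demos"]
        bootstrapped = sum(1 for demo in demos if demo.get("augmented"))
        summary[name] = {
            "bootstrapped": bootstrapped,
            "labeled": len(demos) - bootstrapped,
        }
    return summary


print(summarize_demos(json.loads(STATE_JSON)))
```

To run it against the real file, replace `json.loads(STATE_JSON)` with the result of `json.load` on an open handle to `agent.json`.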
4
auto_classes.json
Normal file
@@ -0,0 +1,4 @@
{
  "AutoConfig": "src.cot_with_thought.CoTWithThoughtSimplifiedBaleenConfig",
  "AutoAgent": "src.cot_with_thought.CoTWithThoughtSimplifiedBaleen"
}
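The values in `auto_classes.json` are dotted paths of the form `package.module.ClassName`. How the loader actually resolves them is not shown in this commit, so the following is only a sketch of one common approach, assuming the last dot separates the module from the attribute. The `resolve_dotted_path` helper is hypothetical, and it is demonstrated on a standard-library path so it runs without this repo installed.

```python
import importlib


def resolve_dotted_path(path: str):
    """Import the module part of a dotted path and return the named attribute."""
    module_name, _, attr_name = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.attribute', got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Stand-in for "src.cot_with_thought.CoTWithThoughtSimplifiedBaleen".
cls = resolve_dotted_path("collections.OrderedDict")
print(cls.__name__)
```

Resolving the real entries this way would require running from the repository root so that `src.cot_with_thought` is importable.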
4
config.json
Normal file
@@ -0,0 +1,4 @@
{
  "model": "gpt-3.5-turbo",
  "max_tokens": 1000
}
101
get_data.py
Normal file
@@ -0,0 +1,101 @@
|
|||||||
|
import dspy
|
||||||
|
import requests
|
||||||
|
import pickle
|
||||||
|
import json
|
||||||
|
import random
|
||||||
|
from collections import defaultdict
|
||||||
|
import pandas as pd
|
||||||
|
|
||||||
|
|
||||||
|
# this is the one that they sampled 100 existing OpenToM plots to produce "extra long" narratives
|
||||||
|
# URL = "https://raw.githubusercontent.com/SeacowX/OpenToM/main/data/opentom_long.json"
|
||||||
|
URL = "https://raw.githubusercontent.com/SeacowX/OpenToM/main/data/opentom.json"
|
||||||
|
|
||||||
|
|
||||||
|
def default_factory():
|
||||||
|
return []
|
||||||
|
|
||||||
|
|
||||||
|
def load_dataset():
|
||||||
|
response = requests.get(URL).json()
|
||||||
|
|
||||||
|
df = pd.DataFrame(response)
|
||||||
|
|
||||||
|
# Extract 'type' and 'answer' into separate columns
|
||||||
|
df["type"] = df["question"].apply(lambda x: x["type"])
|
||||||
|
df["answer"] = df["question"].apply(lambda x: x["answer"])
|
||||||
|
|
||||||
|
unique_answers_by_type = df.groupby("type")["answer"].unique()
|
||||||
|
|
||||||
|
# convert the dataset to what DSPy expects (list of Example objects)
|
||||||
|
dataset = []
|
||||||
|
|
||||||
|
for index, row in df.iterrows():
|
||||||
|
context = row["narrative"]
|
||||||
|
question = row["question"]["question"]
|
||||||
|
answer = row["question"]["answer"]
|
||||||
|
type = row["question"]["type"]
|
||||||
|
plot_info = json.dumps(
|
||||||
|
row["plot_info"]
|
||||||
|
) # Keeping each example field as a string might be a good idea
|
||||||
|
|
||||||
|
# update the type value if location is coarse or fine
|
||||||
|
if "location" in type:
|
||||||
|
location_granularity = (
|
||||||
|
"fine"
|
||||||
|
if answer.lower().strip() != "yes" and answer.lower().strip() != "no"
|
||||||
|
else "coarse"
|
||||||
|
)
|
||||||
|
type = f"{type}-{location_granularity}"
|
||||||
|
|
||||||
|
# Answer choices
|
||||||
|
if "location" in type and (
|
||||||
|
answer.lower().strip() != "yes" and answer.lower().strip() != "no"
|
||||||
|
): # don't provide answer choices for fine grained location questions
|
||||||
|
answer_choices = "n/a, list a specific location"
|
||||||
|
elif "location" in type:
|
||||||
|
answer_choices = "No, Yes"
|
||||||
|
else:
|
||||||
|
answer_choices = ", ".join(unique_answers_by_type[type])
|
||||||
|
|
||||||
|
dataset.append(
|
||||||
|
dspy.Example(
|
||||||
|
context=context,
|
||||||
|
question=question,
|
||||||
|
answer=answer,
|
||||||
|
type=type,
|
||||||
|
plot_info=plot_info,
|
||||||
|
answer_choices=answer_choices,
|
||||||
|
).with_inputs("context", "question", "answer_choices")
|
||||||
|
)
|
||||||
|
|
||||||
|
# split datasets by question types
|
||||||
|
datasets = defaultdict(default_factory)
|
||||||
|
|
||||||
|
for example in dataset:
|
||||||
|
datasets[example.type].append(example)
|
||||||
|
|
||||||
|
datasets.keys()
|
||||||
|
[len(dataset) for dataset in datasets.values()]
|
||||||
|
|
||||||
|
# create train test split
|
||||||
|
for question_type, dataset in datasets.items():
|
||||||
|
random.shuffle(dataset)
|
||||||
|
|
||||||
|
datasets[question_type] = {
|
||||||
|
"train": dataset[int(len(dataset) * 0.8) :], # 80% test, 20% train
|
||||||
|
"test": dataset[: int(len(dataset) * 0.8)],
|
||||||
|
}
|
||||||
|
|
||||||
|
print(f"Train {question_type}: {len(datasets[question_type]['train'])}")
|
||||||
|
print(f"Test {question_type}: {len(datasets[question_type]['test'])}")
|
||||||
|
|
||||||
|
# Serialize and save the datasets object to a file
|
||||||
|
with open("datasets.pkl", "wb") as file:
|
||||||
|
pickle.dump(datasets, file)
|
||||||
|
|
||||||
|
print("🫡 Datasets object has been saved to 'datasets.pkl' 🫡")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
load_dataset()
|
||||||
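Note on the train/test split in `get_data.py` above: the shuffle uses the global `random` state, so the split differs between runs unless a seed is set elsewhere. A minimal sketch of a reproducible variant follows. The helper name `split_train_test` and its `seed` and `train_fraction` parameters are assumptions for illustration and are not part of this commit.

```python
import random


def split_train_test(examples, train_fraction=0.2, seed=0):
    """Shuffle a copy of `examples` and split it into train/test lists.

    Hypothetical helper: mirrors the 20% train / 80% test proportions used
    in get_data.py, but with a local, seeded RNG for reproducibility.
    """
    rng = random.Random(seed)  # local RNG, leaves the global random state alone
    shuffled = list(examples)  # copy so the caller's sequence is not mutated
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return {"train": shuffled[:n_train], "test": shuffled[n_train:]}


if __name__ == "__main__":
    split = split_train_test(range(10))
    print(len(split["train"]), len(split["test"]))
```

Calling the helper twice with the same seed returns the same partition, which makes reported scores easier to compare across runs.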
237
main.py
Normal file
@@ -0,0 +1,237 @@
|
|||||||
|
# run with: python main.py <experiment_title> cot bootstrap_fewshot_with_random_search
|
||||||
|
|
||||||
|
import pickle
|
||||||
|
import time
|
||||||
|
import argparse
|
||||||
|
from typing import Optional
|
||||||
|
from opentom_evaluator import OpenToMEvaluatorDspy
|
||||||
|
import dspy
|
||||||
|
from dspy.teleprompt import BootstrapFewShotWithRandomSearch
|
||||||
|
from dspy.evaluate.evaluate import Evaluate
|
||||||
|
from src.cot import CoTSimplifiedBaleen, CoTSimplifiedBaleenConfig
|
||||||
|
from src.cot_with_thought import CoTWithThoughtSimplifiedBaleen, CoTWithThoughtSimplifiedBaleenConfig
|
||||||
|
from get_data import default_factory, load_dataset
|
||||||
|
from collections import defaultdict
|
||||||
|
from dotenv import load_dotenv
|
||||||
|
import neptune
|
||||||
|
import numpy as np
|
||||||
|
|
||||||
|
load_dotenv()
|
||||||
|
|
||||||
|
# initialize neptune
|
||||||
|
run = neptune.init_run(
|
||||||
|
project="modaic/dspy-opentom",
|
||||||
|
capture_hardware_metrics=False,
|
||||||
|
capture_stderr=True,
|
||||||
|
capture_stdout=True,
|
||||||
|
capture_traceback=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
EVAL_QUESTION_TYPES = [
|
||||||
|
"attitude",
|
||||||
|
"multihop-fo",
|
||||||
|
"multihop-so",
|
||||||
|
"location-fo-coarse",
|
||||||
|
"location-fo-fine",
|
||||||
|
"location-so-coarse",
|
||||||
|
"location-so-fine",
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def dump_state(data, filename):
|
||||||
|
with open(filename, "wb") as file:
|
||||||
|
pickle.dump(data, file)
|
||||||
|
|
||||||
|
|
||||||
|
def main(
|
||||||
|
dspy_method,
|
||||||
|
dspy_optimizer,
|
||||||
|
download_dataset,
|
||||||
|
question_types,
|
||||||
|
teacher_lm,
|
||||||
|
train_size,
|
||||||
|
):
|
||||||
|
# load dataset
|
||||||
|
if download_dataset:
|
||||||
|
load_dataset()
|
||||||
|
|
||||||
|
# read in the datasets pickle object
|
||||||
|
with open("datasets.pkl", "rb") as file:
|
||||||
|
datasets = pickle.load(file)
|
||||||
|
|
||||||
|
if dspy_method == "cot":
|
||||||
|
module_type = CoTSimplifiedBaleen(CoTSimplifiedBaleenConfig())
|
||||||
|
module_name = "CoTSimplifiedBaleen"
|
||||||
|
elif dspy_method == "cot_with_thought":
|
||||||
|
module_type = CoTWithThoughtSimplifiedBaleen(CoTWithThoughtSimplifiedBaleenConfig())
|
||||||
|
module_name = "CoTWithThoughtSimplifiedBaleen"
|
||||||
|
else:
|
||||||
|
raise Exception(f"Dspy method '{dspy_method}' is not valid")
|
||||||
|
|
||||||
|
module_type.push_to_hub(f"vintro/{module_name}", with_code=True, commit_message=f"Uncompiled {module_name} as baseline")
|
||||||
|
modules = {}
|
||||||
|
# define modules for each question type
|
||||||
|
for question_type in question_types:
|
||||||
|
print(f"TYPE: {question_type}")
|
||||||
|
evaluator = OpenToMEvaluatorDspy(model_name="(training set) compiled baleen")
|
||||||
|
|
||||||
|
if dspy_optimizer == "bootstrap_fewshot_with_random_search":
|
||||||
|
optimizer = BootstrapFewShotWithRandomSearch(
|
||||||
|
metric=evaluator.dspy_metric,
|
||||||
|
num_candidate_programs=25,
|
||||||
|
num_threads=1,
|
||||||
|
teacher_settings=dict(lm=teacher_lm),
|
||||||
|
)
|
||||||
|
compiled_baleen = optimizer.compile(
|
||||||
|
module_type, trainset=datasets[question_type]["train"][:train_size]
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
raise Exception(f"Invalid dspy optimizer type: {dspy_optimizer}")
|
||||||
|
|
||||||
|
modules[question_type] = compiled_baleen
|
||||||
|
compiled_baleen.push_to_hub(f"vintro/{module_name}-{question_type}", with_code=True, commit_message=f"Compiled {module_name} with {dspy_optimizer} for {question_type}")
|
||||||
|
time.sleep(10)
|
||||||
|
|
||||||
|
uncompiled_baleen = (
|
||||||
|
CoTSimplifiedBaleen(CoTSimplifiedBaleenConfig())
|
||||||
|
) # regular cot is always the uncompiled baseline
|
||||||
|
|
||||||
|
print("Beginning Evaluation")
|
||||||
|
for question_type in question_types:
|
||||||
|
compiled_baleen = modules[question_type]
|
||||||
|
|
||||||
|
# Evaluation Procedure: Calculate the F1 Score for a randomly drawn batch of 50 questions 5 times and average the F1 Scores
|
||||||
|
batch_size = 50
|
||||||
|
num_batches = 5
|
||||||
|
|
||||||
|
assert len(datasets[question_type]["test"]) >= batch_size * num_batches
|
||||||
|
test = datasets[question_type]["test"][: batch_size * num_batches]
|
||||||
|
test_sets = [test[i * batch_size : (i + 1) * batch_size] for i in range(num_batches)]  # non-overlapping batches
|
||||||
|
|
||||||
|
uncompiled_f1_scores = []
|
||||||
|
compiled_f1_scores = []
|
||||||
|
|
||||||
|
for test in test_sets:
|
||||||
|
# Set up the `evaluate_on_opentom` function.
|
||||||
|
evaluate_on_opentom = Evaluate(
|
||||||
|
devset=test, num_threads=1, display_progress=True, display_table=0
|
||||||
|
)
|
||||||
|
|
||||||
|
uncompiled_baleen_evaluator = OpenToMEvaluatorDspy(
|
||||||
|
model_name="uncompiled_baleen"
|
||||||
|
)
|
||||||
|
evaluate_on_opentom(
|
||||||
|
uncompiled_baleen,
|
||||||
|
metric=uncompiled_baleen_evaluator.dspy_metric,
|
||||||
|
display=True,
|
||||||
|
)
|
||||||
|
uncompiled_f1_scores.append(
|
||||||
|
uncompiled_baleen_evaluator.f1_score()[question_type]["macro_averaged"]
|
||||||
|
)
|
||||||
|
|
||||||
|
compiled_baleen_evaluator = OpenToMEvaluatorDspy(
|
||||||
|
model_name="compiled_baleen"
|
||||||
|
)
|
||||||
|
evaluate_on_opentom(
|
||||||
|
compiled_baleen,
|
||||||
|
metric=compiled_baleen_evaluator.dspy_metric,
|
||||||
|
display=True,
|
||||||
|
)
|
||||||
|
compiled_f1_scores.append(
|
||||||
|
compiled_baleen_evaluator.f1_score()[question_type]["macro_averaged"]
|
||||||
|
)
|
||||||
|
|
||||||
|
# overall f1 scores
|
||||||
|
uncompiled_mean_f1 = np.mean(uncompiled_f1_scores)
|
||||||
|
uncompiled_std_f1 = np.std(uncompiled_f1_scores)
|
||||||
|
|
||||||
|
compiled_mean_f1 = np.mean(compiled_f1_scores)
|
||||||
|
compiled_std_f1 = np.std(compiled_f1_scores)
|
||||||
|
|
||||||
|
run[f"evaluation/{question_type}/uncompiled/mean_macro_averaged_f1"] = (
|
||||||
|
uncompiled_mean_f1
|
||||||
|
)
|
||||||
|
run[f"evaluation/{question_type}/uncompiled/std_macro_averaged_f1"] = (
|
||||||
|
uncompiled_std_f1
|
||||||
|
)
|
||||||
|
run[f"evaluation/{question_type}/compiled/mean_macro_averaged_f1"] = (
|
||||||
|
compiled_mean_f1
|
||||||
|
)
|
||||||
|
run[f"evaluation/{question_type}/compiled/std_macro_averaged_f1"] = (
|
||||||
|
compiled_std_f1
|
||||||
|
)
|
||||||
|
|
||||||
|
print(
|
||||||
|
f"Mean Macro Averaged F1 Scores (± std dev.) - {question_type} - Aggregated from {num_batches} batches of {batch_size} questions"
|
||||||
|
)
|
||||||
|
print(f"uncompiled: {uncompiled_mean_f1:.3f} ± {uncompiled_std_f1:.3f}")
|
||||||
|
print(f"compiled: {compiled_mean_f1:.3f} ± {compiled_std_f1:.3f}")
|
||||||
|
|
||||||
|
dump_state(modules, "cot_modules.pkl")
|
||||||
|
run["cot_modules"].upload("cot_modules.pkl")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
parser = argparse.ArgumentParser(description="Run DSPY method.")
|
||||||
|
|
||||||
|
# dspy arguments
|
||||||
|
parser.add_argument("experiment_title", type=str, help="Title of new experiment")
|
||||||
|
parser.add_argument("dspy_method", type=str, help="The DSPY method to run")
|
||||||
|
parser.add_argument("dspy_optimizer", type=str, help="The DSPY optimizer to use")
|
||||||
|
parser.add_argument(
|
||||||
|
"--student",
|
||||||
|
default="gpt-3.5-turbo",
|
||||||
|
type=str,
|
||||||
|
help="The LLM to optimize prompts for",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--teacher",
|
||||||
|
default=None,
|
||||||
|
type=str,
|
||||||
|
help="Teacher LLM for optimizing prompts. Defaults to Student LLM",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--train_size",
|
||||||
|
default=50,
|
||||||
|
type=int,
|
||||||
|
help="Number of training examples to use for optimization",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
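"--download_dataset", default=True, type=lambda s: s.lower() in ("true", "1", "yes"), help="Download dataset (true/false)"  # type=bool would treat any non-empty string, including "False", as True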
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--question_types",
|
||||||
|
default=EVAL_QUESTION_TYPES,
|
||||||
|
nargs="*",
|
||||||
|
help="Question types. Defaults to all",
|
||||||
|
)
|
||||||
|
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
# setup LLMs
|
||||||
|
student_lm = dspy.LM(model=args.student, max_tokens=1000)
|
||||||
|
args.teacher = args.student if args.teacher is None else args.teacher
|
||||||
|
teacher_lm = dspy.LM(model=args.teacher, max_tokens=1000)
|
||||||
|
dspy.settings.configure(lm=student_lm)
|
||||||
|
|
||||||
|
# validate question types
|
||||||
|
question_types = args.question_types
|
||||||
|
assert all(
|
||||||
|
[question_type in EVAL_QUESTION_TYPES for question_type in question_types]
|
||||||
|
)
|
||||||
|
args.question_types = ", ".join(
|
||||||
|
question_types
|
||||||
|
) # turn list into string for neptune logging
|
||||||
|
|
||||||
|
# log run parameters
|
||||||
|
run["parameters"] = args
|
||||||
|
run["sys/name"] = args.experiment_title
|
||||||
|
|
||||||
|
main(
|
||||||
|
args.dspy_method,
|
||||||
|
args.dspy_optimizer,
|
||||||
|
args.download_dataset,
|
||||||
|
question_types,
|
||||||
|
teacher_lm,
|
||||||
|
args.train_size,
|
||||||
|
)
|
||||||
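Note on the evaluation procedure in `main.py` above: scores are computed per batch and then summarized as mean ± standard deviation. The sketch below shows that procedure in isolation using only the standard library. The helper names `make_batches` and `summarize` are assumptions for illustration. `statistics.pstdev` is used because `np.std` computes the population standard deviation by default.

```python
import statistics


def make_batches(items, batch_size, num_batches):
    """Return `num_batches` consecutive, non-overlapping slices of `items`."""
    if len(items) < batch_size * num_batches:
        raise ValueError("not enough items for the requested batches")
    return [
        items[i * batch_size : (i + 1) * batch_size] for i in range(num_batches)
    ]


def summarize(scores):
    """Return (mean, population standard deviation) of per-batch scores."""
    return statistics.mean(scores), statistics.pstdev(scores)


if __name__ == "__main__":
    batches = make_batches(list(range(250)), batch_size=50, num_batches=5)
    print([batch[0] for batch in batches])  # first element of each batch
    mean, std = summarize([0.5, 0.7])
    print(f"{mean:.3f} ± {std:.3f}")
```

Because the test set is shuffled once in `get_data.py`, consecutive slices behave like random draws without replacement, and no example is scored in more than one batch.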
367
opentom_evaluator.py
Normal file
@@ -0,0 +1,367 @@
|
|||||||
|
# taken from https://github.com/seacowx/OpenToM/blob/main/src/evaluate/opentom_evaluator.py
|
||||||
|
# modified for usability
|
||||||
|
|
||||||
|
from collections import defaultdict
|
||||||
|
import json
|
||||||
|
import traceback
|
||||||
|
|
||||||
|
|
||||||
|
class OpenToMEvaluatorDspy:
|
||||||
|
def __init__(self, model_name="") -> None:
|
||||||
|
self.true_positives = defaultdict(lambda: 0)
|
||||||
|
self.false_positives = defaultdict(lambda: 0)
|
||||||
|
self.false_negatives = defaultdict(lambda: 0)
|
||||||
|
self.model_name = model_name
|
||||||
|
|
||||||
|
def dspy_metric(self, example, pred_answer, trace=None):
|
||||||
|
type = example.type
|
||||||
|
|
||||||
|
eval_result = self.check_answer(example, pred_answer.answer)
|
||||||
|
if (
|
||||||
|
eval_result is None
|
||||||
|
): # Hm what is the correct value to return as a dspy metric when there's an invalid example?
|
||||||
|
return None
|
||||||
|
gt, pred = eval_result # ground truth answer class, predicted answer class
|
||||||
|
|
||||||
|
# store positive/negative results by class so we can calculate the f1 scores later
|
||||||
|
if gt == pred:
|
||||||
|
self.true_positives[f"{type}_{pred}"] += 1
|
||||||
|
else:
|
||||||
|
self.false_positives[f"{type}_{pred}"] += 1
|
||||||
|
self.false_negatives[f"{type}_{gt}"] += 1
|
||||||
|
|
||||||
|
# print("done", example.type, gt, pred, example.answer, pred_answer.answer)
|
||||||
|
|
||||||
|
return gt == pred
|
||||||
|
|
||||||
|
# this method was added to make dspy evaluation easier
|
||||||
|
def check_answer(
|
||||||
|
self,
|
||||||
|
example,
|
||||||
|
pred_answer,
|
||||||
|
cot_flag=False,
|
||||||
|
perspective="all",
|
||||||
|
):
|
||||||
|
mover, affected_char, eoi, original_place, move_to_place = json.loads(
|
||||||
|
example.plot_info
|
||||||
|
).values()
|
||||||
|
|
||||||
|
cur_question_type = example.type
|
||||||
|
question_content = example.question
|
||||||
|
|
||||||
|
gt_answer = example.answer.strip()
|
||||||
|
pred_answer = pred_answer.strip()
|
||||||
|
|
||||||
|
# NOTE: evaluate based on the character
|
||||||
|
if perspective == "observer":
|
||||||
|
if mover in question_content and affected_char not in question_content:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if mover in question_content and affected_char in question_content:
|
||||||
|
question_tokens = (
|
||||||
|
question_content.replace("'s", "").replace(",", "").split()
|
||||||
|
)
|
||||||
|
|
||||||
|
mover_idx = question_tokens.index(mover)
|
||||||
|
affected_char_idx = question_tokens.index(affected_char)
|
||||||
|
|
||||||
|
if mover_idx < affected_char_idx:
|
||||||
|
return None
|
||||||
|
|
||||||
|
elif perspective == "mover":
|
||||||
|
if mover not in question_content and affected_char in question_content:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if mover in question_content and affected_char in question_content:
|
||||||
|
question_tokens = (
|
||||||
|
question_content.replace("'s", "").replace(",", "").split()
|
||||||
|
)
|
||||||
|
|
||||||
|
mover_idx = question_tokens.index(mover)
|
||||||
|
affected_char_idx = question_tokens.index(affected_char)
|
||||||
|
|
||||||
|
if mover_idx > affected_char_idx:
|
||||||
|
return None
|
||||||
|
|
||||||
|
if cot_flag:
|
||||||
|
pred_answer = self.parse_cot_answer(pred_answer)
|
||||||
|
|
||||||
|
if cur_question_type == "location-fo-coarse":
|
||||||
|
gt, pred = self.check_answer_for_cg_location(pred_answer, gt_answer)
|
||||||
|
return gt, pred
|
||||||
|
|
||||||
|
elif cur_question_type == "location-fo-fine":
|
||||||
|
gt, pred = self.check_answer_for_fg_location(
|
||||||
|
pred_answer, gt_answer, original_place, move_to_place
|
||||||
|
)
|
||||||
|
return gt, pred
|
||||||
|
|
||||||
|
elif cur_question_type == "location-so-coarse":
|
||||||
|
gt, pred = self.check_answer_for_cg_location(pred_answer, gt_answer)
|
||||||
|
return gt, pred
|
||||||
|
|
||||||
|
elif cur_question_type == "location-so-fine":
|
||||||
|
gt, pred = self.check_answer_for_fg_location(
|
||||||
|
pred_answer, gt_answer, original_place, move_to_place
|
||||||
|
)
|
||||||
|
return gt, pred
|
||||||
|
|
||||||
|
elif cur_question_type == "multihop-fo":
|
||||||
|
if "fullness" in question_content:
|
||||||
|
gt, pred = self.check_fullness_answer(pred_answer, gt_answer)
|
||||||
|
return gt, pred
|
||||||
|
|
||||||
|
elif "accessibility" in question_content:
|
||||||
|
if "|" in gt_answer:
|
||||||
|
gt_answer = "equally accessible"
|
||||||
|
|
||||||
|
if isinstance(gt_answer, list):
|
||||||
|
gt_answer = [ele for ele in gt_answer if ele != "corrupted"]
|
||||||
|
assert len(gt_answer) == 1
|
||||||
|
gt_answer = gt_answer[0]
|
||||||
|
|
||||||
|
gt, pred = self.check_accessibility_answer(pred_answer, gt_answer)
|
||||||
|
return gt, pred
|
||||||
|
|
||||||
|
elif cur_question_type == "multihop-so":
|
||||||
|
if "fullness" in question_content:
|
||||||
|
gt, pred = self.check_fullness_answer(pred_answer, gt_answer)
|
||||||
|
return gt, pred
|
||||||
|
|
||||||
|
elif "accessibility" in question_content:
|
||||||
|
if "|" in gt_answer:
|
||||||
|
gt_answer = "equally accessible"
|
||||||
|
|
||||||
|
if isinstance(gt_answer, list):
|
||||||
|
gt_answer = [ele for ele in gt_answer if ele != "corrupted"]
|
||||||
|
assert len(gt_answer) == 1
|
||||||
|
gt_answer = gt_answer[0]
|
||||||
|
|
||||||
|
gt, pred = self.check_accessibility_answer(pred_answer, gt_answer)
|
||||||
|
return gt, pred
|
||||||
|
|
||||||
|
elif cur_question_type == "attitude":
|
||||||
|
gt, pred = self.check_attitude_answer(pred_answer, gt_answer)
|
||||||
|
return gt, pred
|
||||||
|
|
||||||
|
def f1_score(self):
|
||||||
|
true_positives = self.true_positives
|
||||||
|
false_positives = self.false_positives
|
||||||
|
false_negatives = self.false_negatives
|
||||||
|
f1_scores = defaultdict(lambda: {"by_class": {}})
|
||||||
|
|
||||||
|
for _class in (
|
||||||
|
true_positives.keys() | false_positives.keys() | false_negatives.keys()
|
||||||
|
):
|
||||||
|
question_type, _ = _class.split("_")
|
||||||
|
class_true_positives = true_positives[_class]
|
||||||
|
class_false_positives = false_positives[_class]
|
||||||
|
class_false_negatives = false_negatives[_class]
|
||||||
|
class_precision = (
|
||||||
|
class_true_positives / (class_true_positives + class_false_positives)
|
||||||
|
if class_true_positives > 0.0
|
||||||
|
else 0.0
|
||||||
|
) # avoid dividing by zero
|
||||||
|
class_recall = (
|
||||||
|
class_true_positives / (class_true_positives + class_false_negatives)
|
||||||
|
if class_true_positives > 0.0
|
||||||
|
else 0.0
|
||||||
|
)
|
||||||
|
class_f1_score = (
|
||||||
|
(2 * class_precision * class_recall) / (class_precision + class_recall)
|
||||||
|
if class_precision > 0.0 or class_recall > 0.0
|
||||||
|
else 0.0
|
||||||
|
)
|
||||||
|
f1_scores[question_type]["by_class"][_class] = class_f1_score
|
||||||
|
|
||||||
|
for question_type, type_f1_scores in f1_scores.items():
|
||||||
|
type_f1_scores = type_f1_scores["by_class"]
|
||||||
|
macro_averaged_f1_score = sum(list(type_f1_scores.values())) / len(
|
||||||
|
type_f1_scores
|
||||||
|
)
|
||||||
|
f1_scores[question_type]["macro_averaged"] = macro_averaged_f1_score
|
||||||
|
|
||||||
|
return f1_scores
|
||||||
|
|
||||||
|
# pretty print macro averaged f1 scores for each question type
|
||||||
|
def print_f1_results(self, round_decimal=2, print_header=False):
|
||||||
|
f1_scores = self.f1_score()
|
||||||
|
if print_header:
|
||||||
|
print("Macro Averaged F1 Scores by question type")
|
||||||
|
|
||||||
|
print(self.model_name, end=" - ")
|
||||||
|
for question_type, type_f1_scores in f1_scores.items():
|
||||||
|
print(
|
||||||
|
f"{question_type}: {round(type_f1_scores['macro_averaged'], ndigits=round_decimal + 2) * 100}",
|
||||||
|
end="\t",
|
||||||
|
)
|
||||||
|
print()
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def remove_determinant(word: str) -> str:
|
||||||
|
determinants = ["a", "an", "the"]
|
||||||
|
for det in determinants:
|
||||||
|
if word.lower().startswith(det + " "):  # match whole determiners only, so "attic" is not truncated to "ttic"
|
||||||
|
return word[len(det) :].strip()
|
||||||
|
return word
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def compute_lexical_overlap(pred: str, location: str) -> float:
|
||||||
|
pred = pred.lower().replace("_", " ").replace("'s", "")
|
||||||
|
location = location.lower().replace("_", " ").replace("'s", "")
|
||||||
|
score = 0
|
||||||
|
pred = pred.replace(".", "").split()
|
||||||
|
location = location.split()
|
||||||
|
visited_word = []
|
||||||
|
|
||||||
|
for word in pred:
|
||||||
|
if word in location and word not in visited_word:
|
||||||
|
score += 1
|
||||||
|
visited_word.append(word)
|
||||||
|
|
||||||
|
return score / len(location)
|
||||||
|
|
||||||
|
def parse_cot_answer(self, answer: str) -> str:
|
||||||
|
# CoT typically generates the answer in the last sentence or paragraph
|
||||||
|
if "\n" in answer:
|
||||||
|
answer = answer.split("\n")[-1]
|
||||||
|
else:
|
||||||
|
answer = answer.split("Therefore")[-1]
|
||||||
|
return answer
|
||||||
|
|
||||||
|
def check_answer_for_fg_location(
|
||||||
|
self, prediction: str, answer: str, original_place: str, move_to_place: str
|
||||||
|
) -> list:
|
||||||
|
# truncate prediction as some of them contain explanations
|
||||||
|
answer = self.remove_determinant(answer).lower()
|
||||||
|
original_place = self.remove_determinant(original_place).lower()
|
||||||
|
move_to_place = self.remove_determinant(move_to_place).lower()
|
||||||
|
gt_label, pred_label = None, None
|
||||||
|
original_place_score = self.compute_lexical_overlap(prediction, original_place)
|
||||||
|
move_to_place_score = self.compute_lexical_overlap(prediction, move_to_place)
|
||||||
|
|
||||||
|
if original_place_score == move_to_place_score:
|
||||||
|
pred_label = 3
|
||||||
|
if original_place_score > move_to_place_score:
|
||||||
|
pred_label = 1
|
||||||
|
elif original_place_score < move_to_place_score:
|
||||||
|
pred_label = 2
|
||||||
|
|
||||||
|
if original_place == answer:
|
||||||
|
gt_label = 1
|
||||||
|
elif move_to_place == answer:
|
||||||
|
gt_label = 2
|
||||||
|
|
||||||
|
return [gt_label, pred_label]
|
||||||
|
|
||||||
|
def check_answer_for_cg_location(self, prediction: str, answer: str) -> list:
|
||||||
|
prediction = prediction.lower()
|
||||||
|
answer = answer.lower()
|
||||||
|
|
||||||
|
if "no" in prediction and "yes" not in prediction:
|
||||||
|
pred_label = 0
|
||||||
|
elif "yes" in prediction and "no" not in prediction:
|
||||||
|
pred_label = 1
|
||||||
|
else:
|
||||||
|
pred_label = -1
|
||||||
|
|
||||||
|
if "no" in answer:
|
||||||
|
gt_label = 0
|
||||||
|
elif "yes" in answer:
|
||||||
|
gt_label = 1
|
||||||
|
|
||||||
|
return [gt_label, pred_label]
|
||||||
|
|
||||||
|
def check_fullness_answer(self, prediction: str, answer: str) -> list:
|
||||||
|
prediction = prediction.replace(".", "").lower()
|
||||||
|
less_full_answer_list = ["less full", "emptier", "more empty"]
|
||||||
|
more_full_answer_list = ["more full", "fuller"]
|
||||||
|
pred_label, gt_label = None, None
|
||||||
|
for less_full_ans in less_full_answer_list:
|
||||||
|
if less_full_ans in prediction:
|
||||||
|
pred_label = 1
|
||||||
|
|
||||||
|
if not pred_label:
|
||||||
|
for more_full_ans in more_full_answer_list:
|
||||||
|
if more_full_ans in prediction:
|
||||||
|
pred_label = 2
|
||||||
|
|
||||||
|
if not pred_label:
|
||||||
|
if "equally full" in prediction:
|
||||||
|
pred_label = 3
|
||||||
|
|
||||||
|
if not pred_label:
|
||||||
|
pred_label = -1 # corrupted
|
||||||
|
|
||||||
|
if answer == "less full":
|
||||||
|
gt_label = 1
|
||||||
|
elif answer == "more full":
|
||||||
|
gt_label = 2
|
||||||
|
elif answer == "equally full":
|
||||||
|
gt_label = 3
|
||||||
|
|
||||||
|
return [gt_label, pred_label]
|
||||||
|
|
||||||
|
def check_accessibility_answer(self, prediction: str, answer: str) -> list:
|
||||||
|
prediction = prediction.replace(".", "").lower()
|
||||||
|
pred_label, gt_label = None, None
|
||||||
|
if "more accessible" in prediction:
|
||||||
|
pred_label = 1
|
||||||
|
elif "less accessible" in prediction:
|
||||||
|
pred_label = 2
|
||||||
|
elif "equally accessible" in prediction:
|
||||||
|
pred_label = 3
|
||||||
|
else:
|
||||||
|
pred_label = -1 # corrupted
|
||||||
|
|
||||||
|
if answer == "more accessible":
|
||||||
|
gt_label = 1
|
||||||
|
elif answer == "less accessible":
|
||||||
|
gt_label = 2
|
||||||
|
else:
|
||||||
|
gt_label = 3
|
||||||
|
|
||||||
|
return [gt_label, pred_label]
|
||||||
|
|
||||||
|
def check_attitude_answer(self, prediction: str, answer: str) -> list:
|
||||||
|
prediction = prediction.lower()
|
||||||
|
answer = answer.lower()
|
||||||
|
answer_map = {"a": "positive", "b": "neutral", "c": "negative"}
|
||||||
|
prediction_token = (
|
||||||
|
prediction.split("\n\n")[-1].split(":")[-1].split(".")[0].strip().lower()
|
||||||
|
)
|
||||||
|
gt_label, pred_label = None, None
|
||||||
|
|
||||||
|
if answer == "positive":
|
||||||
|
gt_label = 1
|
||||||
|
elif answer == "negative":
|
||||||
|
gt_label = 2
|
||||||
|
else:
|
||||||
|
gt_label = 3
|
||||||
|
|
||||||
|
try:
|
||||||
|
prediction = answer_map[prediction_token]
|
||||||
|
if prediction == "positive":
|
||||||
|
pred_label = 1
|
||||||
|
elif prediction == "negative":
|
||||||
|
pred_label = 2
|
||||||
|
else:
|
||||||
|
pred_label = 3
|
||||||
|
|
||||||
|
except KeyError:
|
||||||
|
if "positive" in prediction_token and "negative" in prediction_token:
|
||||||
|
pred_label = -1
|
||||||
|
elif "positive" in prediction_token and "neutral" in prediction_token:
|
||||||
|
pred_label = -1
|
||||||
|
elif "neutral" in prediction_token and "negative" in prediction_token:
|
||||||
|
pred_label = -1
|
||||||
|
elif "positive" in prediction_token:
|
||||||
|
pred_label = 1
|
||||||
|
elif "negative" in prediction_token:
|
||||||
|
pred_label = 2
|
||||||
|
elif "neutral" in prediction_token:
|
||||||
|
pred_label = 3
|
||||||
|
else:
|
||||||
|
pred_label = -1
|
||||||
|
|
||||||
|
return [gt_label, pred_label]
|
||||||
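Note on `f1_score` in `opentom_evaluator.py` above: each wrong prediction counts as a false positive for the predicted class and a false negative for the gold class, and the per-class F1 scores are then averaged without weighting. The following is a minimal standalone sketch of that computation. The function name `macro_f1` and the `(gold, predicted)` pair input are assumptions for illustration, not the evaluator's actual interface.

```python
from collections import Counter


def macro_f1(pairs):
    """Macro-averaged F1 over an iterable of (gold_label, predicted_label)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in pairs:
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted class receives a false positive
            fn[gold] += 1  # gold class receives a false negative

    classes = set(tp) | set(fp) | set(fn)
    if not classes:
        return 0.0

    f1_scores = []
    for c in classes:
        # Guard on true positives to avoid dividing by zero.
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    # Class 1: precision 1, recall 1/2, F1 = 2/3. Class 2: precision 2/3, recall 1, F1 = 4/5.
    print(macro_f1([(1, 1), (1, 2), (2, 2), (2, 2)]))
```

In the worked input, the macro average is (2/3 + 4/5) / 2 = 11/15. A corrupted prediction (label `-1` in the evaluator) becomes its own class with an F1 of 0, which lowers the macro average.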
7
pyproject.toml
Normal file
@@ -0,0 +1,7 @@
|
|||||||
|
[project]
|
||||||
|
name = "CoTWithThoughtSimplifiedBaleen-multihop-so"
|
||||||
|
version = "0.1.0"
|
||||||
|
description = "Add your description here"
|
||||||
|
readme = "README.md"
|
||||||
|
requires-python = ">=3.13"
|
||||||
|
dependencies = ["dspy>=3.0.4", "jupyter>=1.1.1", "modaic>=0.4.1", "neptune>=1.14.0"]
|
||||||
0
src/__init__.py
Normal file
34
src/cot.py
Normal file
@@ -0,0 +1,34 @@
|
|||||||
|
import dspy
|
||||||
|
from modaic import PrecompiledAgent, PrecompiledConfig
|
||||||
|
|
||||||
|
|
||||||
|
# DSPy code
|
||||||
|
class GenerateAnswer(dspy.Signature):
|
||||||
|
"""Generate answers to the questions"""
|
||||||
|
|
||||||
|
context = dspy.InputField(
|
||||||
|
desc="may contain relevant facts and psychological insights"
|
||||||
|
)
|
||||||
|
question = dspy.InputField()
|
||||||
|
answer_choices = dspy.InputField()
|
||||||
|
answer = dspy.OutputField(desc="often between 1 and 5 words")
|
||||||
|
|
||||||
|
|
||||||
|
class CoTSimplifiedBaleenConfig(PrecompiledConfig):
|
||||||
|
model: str = "gpt-3.5-turbo"
|
||||||
|
max_tokens: int = 1000
|
||||||
|
|
||||||
|
|
||||||
|
class CoTSimplifiedBaleen(PrecompiledAgent):
|
||||||
|
config: CoTSimplifiedBaleenConfig
|
||||||
|
|
||||||
|
def __init__(self, config: CoTSimplifiedBaleenConfig, **kwargs):
|
||||||
|
super().__init__(config, **kwargs)
|
||||||
|
self.generate_answer = dspy.ChainOfThought(GenerateAnswer)
|
||||||
|
self.generate_answer.set_lm(dspy.LM(model=config.model, max_tokens=config.max_tokens))
|
||||||
|
|
||||||
|
def forward(self, question, context, answer_choices):
|
||||||
|
pred = self.generate_answer(
|
||||||
|
context=context, question=question, answer_choices=answer_choices
|
||||||
|
)
|
||||||
|
return dspy.Prediction(context=context, answer=pred.answer)
|
||||||
51
src/cot_with_thought.py
Normal file
@@ -0,0 +1,51 @@
|
|||||||
|
import dspy
|
||||||
|
from modaic import PrecompiledAgent, PrecompiledConfig
|
||||||
|
|
||||||
|
|
||||||
|
# DSPy code
|
||||||
|
class GenerateAnswer(dspy.Signature):
|
||||||
|
"""Generate answers to the questions"""
|
||||||
|
|
||||||
|
context = dspy.InputField(
|
||||||
|
desc="may contain relevant facts and psychological insights"
|
||||||
|
)
|
||||||
|
question = dspy.InputField()
|
||||||
|
thought = dspy.InputField(desc="a thought that might help answer the question")
|
||||||
|
answer_choices = dspy.InputField()
|
||||||
|
answer = dspy.OutputField(desc="often between 1 and 5 words")
|
||||||
|
|
||||||
|
|
||||||
|
class GenerateThought(dspy.Signature):
|
||||||
|
"""Generate thoughts about questions"""
|
||||||
|
|
||||||
|
context = dspy.InputField(
|
||||||
|
desc="may contain relevant facts and psychological insights"
|
||||||
|
)
|
||||||
|
question = dspy.InputField()
|
||||||
|
thought = dspy.OutputField(desc="a thought that might help answer the question")
|
||||||
|
|
||||||
|
|
||||||
|
class CoTWithThoughtSimplifiedBaleenConfig(PrecompiledConfig):
|
||||||
|
model: str = "gpt-3.5-turbo"
|
||||||
|
max_tokens: int = 1000
|
||||||
|
|
||||||
|
|
||||||
|
class CoTWithThoughtSimplifiedBaleen(PrecompiledAgent):
|
||||||
|
config: CoTWithThoughtSimplifiedBaleenConfig
|
||||||
|
|
||||||
|
def __init__(self, config: CoTWithThoughtSimplifiedBaleenConfig, **kwargs):
|
||||||
|
super().__init__(config, **kwargs)
|
||||||
|
self.generate_thought = dspy.ChainOfThought(GenerateThought)
|
||||||
|
self.generate_answer = dspy.ChainOfThought(GenerateAnswer)
|
||||||
|
self.generate_thought.set_lm(dspy.LM(model=config.model, max_tokens=config.max_tokens))
|
||||||
|
self.generate_answer.set_lm(dspy.LM(model=config.model, max_tokens=config.max_tokens))
|
||||||
|
|
||||||
|
def forward(self, question, context, answer_choices):
|
||||||
|
pred_thought = self.generate_thought(context=context, question=question)
|
||||||
|
pred = self.generate_answer(
|
||||||
|
context=context,
|
||||||
|
question=question,
|
||||||
|
thought=pred_thought.thought,
|
||||||
|
answer_choices=answer_choices,
|
||||||
|
)
|
||||||
|
return dspy.Prediction(context=context, answer=pred.answer)
|
||||||