(no commit message)

This commit is contained in:
2026-01-27 22:37:59 -08:00
parent bb191efd1d
commit 6a57bde8f2
10 changed files with 1132 additions and 1 deletions

README.md

@@ -1,2 +1,158 @@
# minecraft-friend-rlm
# Minecraft MCP Friend (baseline “AI friend”)
This folder contains a working baseline agent that:
- Spawns the Minecraft MCP server (`@fundamentallabs/minecraft-mcp`) over **stdio**
- Joins your Minecraft world as a bot
- Polls `readChat` and decides what to do using **DSPy RLM** + **Groq via LiteLLM**
- Acts by calling MCP tools like `sendChat`, `mineResource`, `openInventory`, `dropItem`, etc.

If you just want the quick start: scroll to **Run the agent**.

---
## Requirements
### What you need installed
- **Java Minecraft** (the official launcher is fine)
- **Node.js** (so `npx` works)
- **uv** (for Python + dependencies)
### About Minecraft worlds and ports (important)
This agent joins a world via Mineflayer through the MCP server. Two common gotchas:
- **Open to LAN chooses a port**: even if you type `25565`, the real port is the one Minecraft prints in chat as “Local game hosted on port ####”.
- **Bots are clients**: you don't "reserve a bot port." The bot connects to your world's host/port like any other client.

---
## Security note (read this)
- **Never commit API keys.** This project expects your Groq key in `.env` (loaded at runtime).
- If you ever pasted a key into chat/screenshots, treat it as compromised and rotate it.

---
## Setup (uv + Python 3.12)
From the repo root:
```bash
# DSPy RLM + MCP SDK need a modern Python.
uv python install 3.12
uv venv --python 3.12
source .venv/bin/activate
uv pip install -r requirements.txt
cp .env.example .env
```
Now edit `.env` and set at least:
- `GROQ_API_KEY=...`
- (optional) `MAIN_MODEL` and `SUB_MODEL`
- (optional) `BOT_USERNAME`
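A filled-in `.env` might look like this (the model names and username are the defaults from `config.py`; the key value is a placeholder):

```bash
GROQ_API_KEY=your-groq-key-here
MAIN_MODEL=groq/openai/gpt-oss-120b
SUB_MODEL=groq/openai/gpt-oss-20b
BOT_USERNAME=Bot1
```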

---
## Run the agent (join your world)
### Step 1: start a world
Option A (easy): **Single-player → Open to LAN**
1. Launch Minecraft
2. Open your single-player world
3. Choose **Open to LAN**
4. In chat, copy the port from the message:
- “Local game hosted on port **#####**”

Option B (stable): run a dedicated server (recommended if you want a consistent port)
### Step 2: run the agent
In the same terminal (with the venv activated):
```bash
python agent.py --host 127.0.0.1 --mc-port <PORT_FROM_MINECRAFT_CHAT>
```
Notes:
- Use `--host 127.0.0.1` if the bot runs on the same machine as Minecraft.
- If the bot is on another machine, use your LAN IP (e.g. `192.168.x.y`) instead.
### Step 3: talk to it in Minecraft chat
Try:
- “hi can you get some wood?”
- “can you collect a stack of logs for me?”

---
## Validate connectivity (without joining)
This confirms the “MCP → list_tools → DSPy Tool conversion” pipeline:
```bash
python agent.py --validate-tools
```

---
## Troubleshooting
### 1) `ECONNREFUSED` (connection refused)
This almost always means **you're using the wrong port** or your world is no longer open.

Checklist:
- Re-open your world to LAN and re-check the port printed in chat.
- Verify the port is listening:
```bash
lsof -nP -iTCP:<PORT> -sTCP:LISTEN
nc -vz 127.0.0.1 <PORT>
```
### 2) `Unsupported protocol version 'XYZ' (attempted to use 'ABC' data)`
This is a **Minecraft version mismatch** between your client/server and the Mineflayer stack behind the MCP server.

Fastest fix:

- Run a Minecraft version that matches what the bot stack expects (the error's "attempted" number is the clue).

Alternative:
- Update the MCP server dependency stack (harder; can move the mismatch around).
### 3) "It keeps saying it delivered items, but I didn't get them"

Minecraft item transfer is tricky. In this baseline we treat the reliable mechanic as **drop items near the player** so they can be picked up. If you're testing "give" behaviors, prefer "drop-to-transfer" semantics.
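The baseline's helpers back this up by counting the inventory before dropping (so `deliver_drop` refuses to drop more than the bot holds). The counting step, mirroring `_parse_open_inventory` in `agent.py`, looks roughly like:

```python
def parse_inventory(text: str) -> dict[str, int]:
    """Turn an openInventory() observation into {item_name: count} (best-effort)."""
    if "contains:" not in text:
        return {}
    tail = text.split("contains:", 1)[1].strip().rstrip(".")
    items: dict[str, int] = {}
    for part in (p.strip() for p in tail.split(",") if p.strip()):
        tokens = part.split()
        # "2 oak log" -> items["oak_log"] = 2
        if tokens and tokens[0].isdigit():
            items[" ".join(tokens[1:]).lower().replace(" ", "_")] = int(tokens[0])
    return items

obs = "You just finished examining your inventory and it contains: 2 oak log, 1 oak sapling."
print(parse_inventory(obs))  # -> {'oak_log': 2, 'oak_sapling': 1}
```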

---
## What's in this folder
- `agent.py`: main loop; joins world; polls chat; calls DSPy RLM
- `config.py`: `.env` settings (models, poll rate, etc.)
- `host_interpreter.py`: host-based RLM interpreter (avoids some sandbox/runtime issues)
- `memory_fs.py`: local “memory filesystem” (stored under `.memory/`)
- `mcp_client.py`: thin MCP wrapper utilities (useful for debugging)
- `uv.lock`: Python deps (pinned to `dspy[mcp]==3.1.2`)

---
## References
- DSPy MCP tutorial: `https://dspy.ai/tutorials/mcp/?h=mcp`
- DSPy language models: `https://dspy.ai/learn/programming/language_models/`
- LiteLLM Groq provider: `https://docs.litellm.ai/docs/providers/groq`
- MCP filesystem server (shape inspiration): `https://www.npmjs.com/package/@modelcontextprotocol/server-filesystem`

agent.py

@@ -0,0 +1,481 @@
#!/usr/bin/env python3
import argparse
import asyncio
import inspect
import os
import time
from dataclasses import dataclass
from typing import Any, Callable
import dspy
from litellm.exceptions import RateLimitError
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from rich.console import Console
from rich.panel import Panel
from modaic import PrecompiledConfig, PrecompiledProgram
from config import SETTINGS
from host_interpreter import UnsafeHostInterpreter
from memory_fs import (
    mem_append_file,
    mem_create_directory,
    mem_directory_tree,
    mem_get_file_info,
    mem_list_directory,
    mem_move_file,
    mem_read_text_file,
    mem_search_files,
    mem_write_file,
)
console = Console()


class MinecraftFriendConfig(PrecompiledConfig):
    max_iterations: int = 12
    max_llm_calls: int = 18
    tools: dict[str, Callable[..., Any]] = {}
    lm: str = SETTINGS.main_model
    sub_lm: str = SETTINGS.sub_model
    verbose: bool = True


class MinecraftFriendProgram(PrecompiledProgram):
    config: MinecraftFriendConfig

    def __init__(self, config: MinecraftFriendConfig, **kwargs):
        super().__init__(config, **kwargs)
        config = self.config
        self.rlm = dspy.RLM(
            signature=MinecraftFriendRLM,
            max_iterations=config.max_iterations,
            max_llm_calls=config.max_llm_calls,
            tools=config.tools,
            sub_lm=dspy.LM(config.sub_lm),
            verbose=config.verbose,
            interpreter=UnsafeHostInterpreter(),
        )
        self.rlm.set_lm(dspy.LM(config.lm))

    def forward(self, chat, memory):
        return self.rlm(chat=chat, memory=memory)
@dataclass
class AgentState:
    last_chat_fingerprint: str = ""
    last_spoke_at: float = 0.0
    last_decide_at: float = 0.0


def extract_chat_lines(summary: str) -> list[str]:
    lines = [line.rstrip() for line in summary.splitlines()]
    if "==================" not in lines:
        return []
    idx = lines.index("==================")
    return [line for line in lines[idx + 1 :] if line.strip()]


def drop_own_messages(lines: list[str], bot_username: str) -> list[str]:
    # Server duplicates bot speech in both "[System] <Bot> ..." and "<Bot>: ..."
    needle = f"<{bot_username}>"
    return [line for line in lines if needle not in line]


def fingerprint(lines: list[str]) -> str:
    return "\n".join(lines[-30:])


def _extract_retry_after_seconds(err: Exception) -> float | None:
    # Groq/LiteLLM error strings often include: "Please try again in 16.0575s."
    s = str(err)
    marker = "try again in "
    if marker not in s:
        return None
    tail = s.split(marker, 1)[1]
    num = ""
    for ch in tail:
        if ch.isdigit() or ch == ".":
            num += ch
            continue
        break
    try:
        return float(num) if num else None
    except Exception:
        return None


def _calltool_text(call_tool_result) -> str:
    # Compatible with MCP SDK TextContent blocks.
    out: list[str] = []
    for block in getattr(call_tool_result, "content", []) or []:
        if getattr(block, "type", None) == "text":
            out.append(getattr(block, "text", ""))
    return "\n".join([t for t in out if t]).strip()
class MinecraftFriendRLM(dspy.Signature):
    """
    You are a friendly AI companion playing Minecraft with Paul.

    Your ONLY way to talk is by calling MCP tools (especially `sendChat`).
    Use tools like `readChat`, `mineResource`, `lookAround`, etc. when useful.

    The `response` output is only a short internal note about what you did.
    """

    chat = dspy.InputField(desc="Recent Minecraft chat lines (most recent last).")
    memory = dspy.InputField(desc="Short memory about Paul and the current goal.")
    response = dspy.OutputField(desc="Short internal note (not sent to chat).")


def _tool_default_from_schema(schema: dict[str, Any]) -> Any:
    # JSON schema defaults are best-effort; they may be missing.
    return schema.get("default", inspect._empty)
def _make_sync_mcp_tool(
    *,
    tool: dspy.Tool,
    loop: asyncio.AbstractEventLoop,
    on_call: Callable[[], None] | None = None,
) -> Callable[..., Any]:
    """
    Wrap an async MCP-backed `dspy.Tool` into a sync callable that can safely be used
    inside RLM code execution, even while the main asyncio loop is running.
    """
    arg_order = list((tool.args or {}).keys())

    async def _acall(**kwargs: Any) -> Any:
        return await tool.acall(**kwargs)

    def _sync(*args: Any, **kwargs: Any) -> Any:
        # Support common calling styles:
        # - tool(message="hi")
        # - tool("hi", delay=0) -> maps positional args in schema order
        # - tool({"message": "hi"}) -> dict-only positional
        if args:
            if len(args) == 1 and isinstance(args[0], dict) and not kwargs:
                kwargs = dict(args[0])
            else:
                for idx, value in enumerate(args):
                    if idx >= len(arg_order):
                        raise TypeError(
                            f"{tool.name} got too many positional arguments"
                        )
                    kwargs.setdefault(arg_order[idx], value)
        fut = asyncio.run_coroutine_threadsafe(_acall(**kwargs), loop)
        result = fut.result()
        if on_call is not None:
            on_call()
        return result

    _sync.__name__ = tool.name
    _sync.__doc__ = tool.desc or ""
    # Give the LLM a nice signature in the RLM instructions.
    params: list[inspect.Parameter] = []
    for arg_name, schema in (tool.args or {}).items():
        default = _tool_default_from_schema(schema)
        params.append(
            inspect.Parameter(
                arg_name,
                kind=inspect.Parameter.POSITIONAL_OR_KEYWORD,
                default=default,
            )
        )
    _sync.__signature__ = inspect.Signature(parameters=params)  # type: ignore[attr-defined]
    return _sync
def _parse_open_inventory(text: str) -> dict[str, int]:
    """
    Parse an openInventory() observation into a {item_name: count} dict.

    Example input:
        "You just finished examining your inventory and it contains: 2 oak log, 2 birch log, 1 oak sapling."
    """
    if "contains:" not in text:
        return {}
    tail = text.split("contains:", 1)[1].strip().rstrip(".")
    if not tail:
        return {}
    items: dict[str, int] = {}
    parts = [p.strip() for p in tail.split(",") if p.strip()]
    for p in parts:
        # "2 oak log" -> (2, "oak log")
        tokens = p.split()
        if not tokens:
            continue
        try:
            n = int(tokens[0])
        except ValueError:
            continue
        name = " ".join(tokens[1:]).strip().lower()
        if not name:
            continue
        items[name.replace(" ", "_")] = n
    return items
async def main_async() -> None:
    p = argparse.ArgumentParser()
    p.add_argument("--host", default=SETTINGS.mcp_minecraft_host)
    p.add_argument("--mc-port", type=int, default=SETTINGS.mcp_minecraft_port)
    p.add_argument("--bot-username", default=SETTINGS.bot_username)
    p.add_argument(
        "--validate-tools",
        action="store_true",
        help="Connect to the MCP server, list tools, convert them to dspy.Tool, then exit.",
    )
    args = p.parse_args()

    if not SETTINGS.groq_api_key:
        raise RuntimeError(
            "GROQ_API_KEY is not set. Copy .env.example to .env and fill it in."
        )
    os.environ.setdefault("GROQ_API_KEY", SETTINGS.groq_api_key)

    # DSPy MCP tutorial requires dspy[mcp] and converts MCP tools via dspy.Tool.from_mcp_tool.
    # Important: DSPy's default RLM sandbox (Deno/Pyodide) cannot currently call tools in some
    # runtimes due to missing WASM stack switching. We use a host interpreter + sync tool wrappers.
    console.print(Panel(SETTINGS.main_model, title="DSPy model", border_style="cyan"))

    server_params = StdioServerParameters(
        command="npx",
        args=[
            "-y",
            "--",
            "@fundamentallabs/minecraft-mcp",
            "-h",
            args.host,
            "-p",
            str(args.mc_port),
        ],
        env=None,
    )
    state = AgentState()
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Gather MCP tools and convert to DSPy tools (official DSPy MCP tutorial pattern).
            tools = await session.list_tools()
            dspy_tools = [dspy.Tool.from_mcp_tool(session, t) for t in tools.tools]

            # Add local "memory filesystem" tools (DSPy Tool wrappers).
            #
            # This follows DSPy's tool guidance: wrap functions in dspy.Tool and pass them via tools=...
            # https://dspy.ai/learn/programming/tools/
            memory_tools = [
                dspy.Tool(mem_list_directory),
                dspy.Tool(mem_read_text_file),
                dspy.Tool(mem_write_file),
                dspy.Tool(mem_append_file),
                dspy.Tool(mem_create_directory),
                dspy.Tool(mem_move_file),
                dspy.Tool(mem_search_files),
                dspy.Tool(mem_get_file_info),
                dspy.Tool(mem_directory_tree),
            ]
            all_tools = [*dspy_tools, *memory_tools]

            if args.validate_tools:
                console.print(
                    Panel(
                        "\n".join([t.name for t in all_tools]),
                        title=f"OK: ready {len(dspy_tools)} MCP tools + {len(memory_tools)} memory tools",
                        border_style="green",
                    )
                )
                return

            # Build sync wrappers for MCP tools so the agent can call them inside RLM execution.
            loop = asyncio.get_running_loop()
            sync_mcp_tools: dict[str, Callable[..., Any]] = {}
            for t in dspy_tools:
                if not t.name.isidentifier():
                    continue
                sync_mcp_tools[t.name] = _make_sync_mcp_tool(
                    tool=t,
                    loop=loop,
                    on_call=(lambda: setattr(state, "last_spoke_at", time.time()))
                    if t.name == "sendChat"
                    else None,
                )

            # Memory tools are already sync python callables.
            sync_memory_tools: dict[str, Callable[..., Any]] = {}
            for t in memory_tools:
                if not t.name.isidentifier():
                    continue
                sync_memory_tools[t.name] = t

            # High-level "agent guardrails" tools to reduce LLM confusion / regressions.
            def inv_counts() -> dict[str, int]:
                """Return parsed inventory counts as a JSON-like dict."""
                text = sync_mcp_tools["openInventory"]()
                return _parse_open_inventory(str(text))

            def have(item_name: str) -> int:
                """Return how many of an item the bot currently has (best-effort)."""
                counts = inv_counts()
                return int(counts.get(item_name.lower(), 0))

            def deliver_drop(user_name: str, item_name: str, count: int) -> str:
                """Drop items near a player so they can pick them up (preferred transfer)."""
                if have(item_name) < count:
                    return f"[ERROR] Not enough {item_name}. Have {have(item_name)}."
                return str(sync_mcp_tools["dropItem"](item_name, count, user_name))

            def gather_to(
                item_name: str, target_count: int, batch: int = 8, max_rounds: int = 12
            ) -> str:
                """Iteratively mine until we have at least target_count of item_name (timeboxed)."""
                # Normalize common user phrasing.
                norm = item_name.strip().lower().replace(" ", "_")
                if norm in {"wood", "logs", "log"}:
                    norm = "oak_log"
                for _ in range(max_rounds):
                    cur = have(norm)
                    if cur >= target_count:
                        return f"OK: have {cur} {norm} (>= {target_count})"
                    try:
                        # Mine in small batches to reduce timeouts.
                        sync_mcp_tools["mineResource"](
                            norm, min(batch, max(target_count - cur, 1))
                        )
                    except Exception as e:
                        return f"[ERROR] mineResource failed: {e}"
                return f"[WARN] Could not reach target. Have {have(norm)} {norm}."

            helper_tools: dict[str, Callable[..., Any]] = {
                "inv_counts": inv_counts,
                "have": have,
                "gather_to": gather_to,
                "deliver_drop": deliver_drop,
            }

            # Remove misleading tools that caused regressions in the logs.
            # (We can re-add later if needed.)
            sync_mcp_tools.pop("giveItemToSomeone", None)

            rlm_tools: dict[str, Callable[..., Any]] = {
                **sync_mcp_tools,
                **sync_memory_tools,
                **helper_tools,
            }
            rlm = MinecraftFriendProgram(MinecraftFriendConfig(tools=rlm_tools))

            # Join once up-front so the bot is in-world.
            join_res = await session.call_tool(
                "joinGame",
                arguments={
                    "username": args.bot_username,
                    "host": args.host,
                    "port": args.mc_port,
                },
            )
            console.print(
                Panel(_calltool_text(join_res), title="joinGame", border_style="green")
            )

            # Greet once.
            await session.call_tool(
                "sendChat",
                arguments={
                    "message": f"Hey! I'm {SETTINGS.persona_name}. I'm here! Want to explore or build something?"
                },
            )
            state.last_spoke_at = time.time()

            while True:
                read_res = await session.call_tool(
                    "readChat",
                    arguments={"count": 40, "filterType": "all", "timeLimit": 120},
                )
                summary = _calltool_text(read_res)
                lines = drop_own_messages(
                    extract_chat_lines(summary), args.bot_username
                )
                fp = fingerprint(lines)
                new_chat = fp != state.last_chat_fingerprint
                state.last_chat_fingerprint = fp

                now = time.time()
                should_initiate = (
                    now - state.last_spoke_at
                ) > SETTINGS.idle_chitchat_seconds
                can_decide = (now - state.last_decide_at) > max(
                    SETTINGS.poll_seconds, 4.0
                )

                if (new_chat or should_initiate) and can_decide:
                    state.last_decide_at = now
                    chat_context = "\n".join(lines[-30:])
                    memory = (
                        "You have a persistent memory filesystem under `.memory/`.\n"
                        "Use these tools to store/recall information:\n"
                        "- mem_list_directory(path)\n"
                        "- mem_read_text_file(path, head=None, tail=None)\n"
                        "- mem_write_file(path, content)\n"
                        "- mem_append_file(path, content)\n"
                        "- mem_search_files(path='', pattern='*', contains=None, limit=50)\n"
                        "- mem_directory_tree(path='', max_depth=6)\n"
                        "- mem_get_file_info(path)\n"
                        "\n"
                        "Suggested files:\n"
                        "- profile/paul.md (stable preferences)\n"
                        "- world/status.md (current world + tasks)\n"
                        "- notes/log.md (timestamped scratchpad)\n"
                        "\n"
                        "Gameplay facts (IMPORTANT):\n"
                        "- To give items to Paul, prefer `deliver_drop(user_name, item_name, count)`.\n"
                        "- `giveItemToSomeone` is unreliable here; do NOT use it.\n"
                        "- To gather a stack, use `gather_to('oak_log', 64)` then `deliver_drop('pmlockett', 'oak_log', 64)`.\n"
                    )
                    try:
                        # Run RLM in a worker thread so sync tool calls can safely
                        # schedule async MCP operations onto this running event loop.
                        result = await asyncio.to_thread(
                            rlm,
                            chat=chat_context,
                            memory=memory,
                        )
                        resp = getattr(result, "response", None)
                        if resp:
                            console.print(
                                Panel(
                                    str(resp),
                                    title="RLM response",
                                    border_style="green",
                                )
                            )
                    except RateLimitError as e:
                        wait_s = _extract_retry_after_seconds(e) or 10.0
                        console.print(
                            Panel(
                                f"Rate limited. Sleeping {wait_s:.1f}s.\n\n{e}",
                                title="Rate limit",
                                border_style="yellow",
                            )
                        )
                        await asyncio.sleep(wait_s)

                await asyncio.sleep(SETTINGS.poll_seconds)
def main() -> None:
    asyncio.run(main_async())


if __name__ == "__main__":
    main()

auto_classes.json

@@ -0,0 +1,4 @@
{
  "AutoConfig": "agent.MinecraftFriendConfig",
  "AutoProgram": "agent.MinecraftFriendProgram"
}

config.json

@@ -0,0 +1,9 @@
{
  "model": null,
  "max_iterations": 12,
  "max_llm_calls": 18,
  "tools": {},
  "lm": "groq/openai/gpt-oss-120b",
  "sub_lm": "groq/openai/gpt-oss-20b",
  "verbose": true
}

config.py

@@ -0,0 +1,28 @@
"""Shared configuration loaded from .env for the Minecraft friend agent."""
from __future__ import annotations
import os
from dataclasses import dataclass
from dotenv import load_dotenv
load_dotenv()
@dataclass(frozen=True)
class Settings:
groq_api_key: str | None = os.getenv("GROQ_API_KEY")
main_model: str = os.getenv("MAIN_MODEL", "groq/openai/gpt-oss-120b")
sub_model: str = os.getenv("SUB_MODEL", "groq/openai/gpt-oss-20b")
persona_name: str = os.getenv("PERSONA_NAME", "Spruce")
poll_seconds: float = float(os.getenv("POLL_SECONDS", "2"))
idle_chitchat_seconds: float = float(os.getenv("IDLE_CHITCHAT_SECONDS", "90"))
mcp_minecraft_host: str = os.getenv("MCP_MINECRAFT_HOST", "127.0.0.1")
mcp_minecraft_port: int = int(os.getenv("MCP_MINECRAFT_PORT", "25565"))
bot_username: str = os.getenv("BOT_USERNAME", "Bot1")
SETTINGS = Settings()

host_interpreter.py

@@ -0,0 +1,145 @@
from __future__ import annotations

import builtins
import io
import sys
import traceback
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Any, Callable

from dspy.primitives.code_interpreter import CodeInterpreterError, FinalOutput


@dataclass
class UnsafeHostInterpreter:
    """
    A minimal CodeInterpreter implementation that executes code in the host Python process.

    Why this exists:
    - DSPy's default RLM interpreter (Deno/Pyodide) currently relies on pyodide.ffi.run_sync
      to bridge async tool calls, which fails on runtimes without WASM stack switching support.

    Tradeoff:
    - This is NOT a security sandbox. It will execute arbitrary Python code produced by the LLM.
      Use only in trusted/local environments.
    """

    tools: dict[str, Callable[..., str]] = field(default_factory=dict)
    # If RLM injects this attribute, we can map SUBMIT() to output fields.
    output_fields: list[dict] | None = None
    _started: bool = False
    _globals: dict[str, Any] = field(default_factory=dict)

    def start(self) -> None:
        if self._started:
            return
        # Start with a constrained global namespace. This is not a real sandbox.
        self._globals = {
            "__name__": "__rlm_host__",
            "__builtins__": MappingProxyType(
                {
                    # Allow common harmless builtins needed for analysis.
                    "print": builtins.print,
                    "len": builtins.len,
                    "type": builtins.type,
                    "range": builtins.range,
                    "reversed": builtins.reversed,
                    "min": builtins.min,
                    "max": builtins.max,
                    "sum": builtins.sum,
                    "sorted": builtins.sorted,
                    "enumerate": builtins.enumerate,
                    "str": builtins.str,
                    "int": builtins.int,
                    "float": builtins.float,
                    "bool": builtins.bool,
                    "dict": builtins.dict,
                    "list": builtins.list,
                    "set": builtins.set,
                    "tuple": builtins.tuple,
                    "abs": builtins.abs,
                    "all": builtins.all,
                    "any": builtins.any,
                    "zip": builtins.zip,
                }
            ),
        }
        # Provide a few commonly-used stdlib modules without enabling arbitrary imports.
        # (The host interpreter is already unsafe, but keeping imports closed reduces footguns.)
        import json as _json
        import math as _math
        import re as _re

        self._globals.update({"re": _re, "json": _json, "math": _math})
        self._started = True
    def execute(self, code: str, variables: dict[str, Any] | None = None) -> Any:
        if not self._started:
            self.start()
        # Inject variables and tools into the exec namespace.
        if variables:
            self._globals.update(variables)
        self._globals.update(self.tools)

        # Provide SUBMIT for early termination.
        class _SubmitSignal(BaseException):
            def __init__(self, payload: dict[str, Any]):
                super().__init__()
                self.payload = payload

        def SUBMIT(*args: Any, **kwargs: Any) -> None:  # noqa: N802 - matches DSPy contract
            # RLM expects interpreter.execute() to RETURN a FinalOutput instance,
            # not raise it as an exception. We raise a private control-flow signal
            # and convert it into FinalOutput below.
            if not kwargs:
                # Support SUBMIT("...") for single-output signatures.
                if (
                    len(args) == 1
                    and self.output_fields
                    and len(self.output_fields) == 1
                ):
                    name = self.output_fields[0]["name"]
                    kwargs = {name: args[0]}
                # Support SUBMIT() if the user assigned output variables in globals.
                elif len(args) == 0 and self.output_fields:
                    payload: dict[str, Any] = {}
                    for f in self.output_fields:
                        fname = f["name"]
                        if fname in self._globals:
                            payload[fname] = self._globals[fname]
                    if payload:
                        kwargs = payload
                else:
                    raise _SubmitSignal(
                        {
                            "error": "SUBMIT called without outputs; provide kwargs or set output variables."
                        }
                    )
            raise _SubmitSignal(kwargs)

        self._globals["SUBMIT"] = SUBMIT

        buf = io.StringIO()
        old_stdout, old_stderr = sys.stdout, sys.stderr
        sys.stdout, sys.stderr = buf, buf
        try:
            exec(code, self._globals, self._globals)
        except _SubmitSignal as sig:
            return FinalOutput(sig.payload)
        except SyntaxError:
            raise
        except Exception as e:
            tb = traceback.format_exc()
            raise CodeInterpreterError(f"{e}\n\n{tb}")
        finally:
            sys.stdout, sys.stderr = old_stdout, old_stderr

        out = buf.getvalue()
        return out.strip() if out.strip() else None

    def shutdown(self) -> None:
        self._globals.clear()
        self._started = False

memory_fs.py

@@ -0,0 +1,214 @@
from __future__ import annotations

import json
import os
from dataclasses import dataclass
from fnmatch import fnmatch
from pathlib import Path
from typing import Any


def _default_root() -> Path:
    # Keep memory local to this project folder.
    return Path(__file__).resolve().parent / ".memory"


@dataclass(frozen=True)
class MemoryFS:
    """
    A tiny, sandboxed "memory filesystem" for agents.

    This intentionally mirrors the *shape* of common filesystem MCP servers:
    list/read/write/move/search/info/tree, but is implemented locally as Python tools.
    """

    root: Path = _default_root()

    def _ensure_root(self) -> None:
        self.root.mkdir(parents=True, exist_ok=True)

    def _resolve(self, rel_path: str) -> Path:
        """
        Resolve a user-provided path against the memory root, preventing traversal.

        The path is interpreted as relative to `root`. Leading slashes are ignored.
        """
        self._ensure_root()
        rel = rel_path.lstrip("/").strip()
        target = (self.root / rel).resolve()
        root = self.root.resolve()
        if target == root:
            return target
        if root not in target.parents:
            raise ValueError("Path escapes memory root; refusing.")
        return target


_MEM = MemoryFS()
def mem_list_directory(path: str = "") -> str:
    """List directory contents under memory root. Returns lines like: [DIR] foo, [FILE] bar.txt."""
    p = _MEM._resolve(path)
    if not p.exists():
        return f"Not found: {path}"
    if not p.is_dir():
        return f"Not a directory: {path}"
    entries = []
    for child in sorted(p.iterdir(), key=lambda c: (not c.is_dir(), c.name.lower())):
        tag = "[DIR]" if child.is_dir() else "[FILE]"
        entries.append(f"{tag} {child.name}")
    return "\n".join(entries) if entries else "(empty)"


def mem_create_directory(path: str) -> str:
    """Create a directory under memory root (parents created)."""
    p = _MEM._resolve(path)
    p.mkdir(parents=True, exist_ok=True)
    return f"OK: created {path}"


def mem_read_text_file(
    path: str, head: int | None = None, tail: int | None = None
) -> str:
    """Read a UTF-8 text file under memory root. Optionally return first `head` or last `tail` lines."""
    if head is not None and tail is not None:
        return "Error: cannot specify both head and tail."
    p = _MEM._resolve(path)
    if not p.exists():
        return f"Not found: {path}"
    if not p.is_file():
        return f"Not a file: {path}"
    text = p.read_text(encoding="utf-8", errors="replace")
    lines = text.splitlines()
    if head is not None:
        return "\n".join(lines[: max(head, 0)])
    if tail is not None:
        # Guard against tail <= 0: lines[-0:] would return the whole file.
        tail = max(tail, 0)
        return "\n".join(lines[-tail:]) if tail else ""
    return text


def mem_write_file(path: str, content: str) -> str:
    """Write (overwrite) a UTF-8 text file under memory root."""
    p = _MEM._resolve(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    p.write_text(content, encoding="utf-8")
    return f"OK: wrote {path} ({len(content)} chars)"


def mem_append_file(path: str, content: str) -> str:
    """Append UTF-8 text to a file under memory root (creates if missing)."""
    p = _MEM._resolve(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    with p.open("a", encoding="utf-8") as f:
        f.write(content)
    return f"OK: appended {path} ({len(content)} chars)"


def mem_move_file(source: str, destination: str) -> str:
    """Move/rename a file or directory under memory root. Fails if destination exists."""
    src = _MEM._resolve(source)
    dst = _MEM._resolve(destination)
    if not src.exists():
        return f"Not found: {source}"
    if dst.exists():
        return f"Error: destination exists: {destination}"
    dst.parent.mkdir(parents=True, exist_ok=True)
    os.replace(src, dst)
    return f"OK: moved {source} -> {destination}"


def mem_get_file_info(path: str) -> str:
    """Return basic metadata (JSON) for a path under memory root."""
    p = _MEM._resolve(path)
    if not p.exists():
        return json.dumps({"path": path, "exists": False})
    st = p.stat()
    info: dict[str, Any] = {
        "path": path,
        "exists": True,
        "type": "directory" if p.is_dir() else "file",
        "size": st.st_size,
        "mtime": st.st_mtime,
    }
    return json.dumps(info, indent=2, ensure_ascii=False)
def mem_search_files(
    path: str = "", pattern: str = "*", contains: str | None = None, limit: int = 50
) -> str:
    """
    Recursively search for files under memory root.

    - `pattern`: glob-style match on relative path (e.g. "*.md", "profile/*")
    - `contains`: if set, only include text files that contain this substring
    """
    base = _MEM._resolve(path)
    if not base.exists():
        return f"Not found: {path}"
    if not base.is_dir():
        return f"Not a directory: {path}"
    results: list[str] = []
    root = _MEM.root.resolve()
    for p in base.rglob("*"):
        if len(results) >= max(limit, 0):
            break
        if not p.is_file():
            continue
        rel = str(p.resolve().relative_to(root)).replace(os.sep, "/")
        if not fnmatch(rel, pattern):
            continue
        if contains is not None:
            try:
                text = p.read_text(encoding="utf-8", errors="ignore")
            except Exception:
                continue
            if contains not in text:
                continue
        results.append(rel)
    return "\n".join(results) if results else "(no matches)"


def mem_directory_tree(path: str = "", max_depth: int = 6) -> str:
    """Return a JSON directory tree rooted at `path`."""
    base = _MEM._resolve(path)
    if not base.exists():
        return json.dumps({"error": "not_found", "path": path})
    if not base.is_dir():
        return json.dumps({"error": "not_directory", "path": path})
    root = _MEM.root.resolve()

    def node(p: Path, depth: int) -> dict[str, Any]:
        rel = (
            str(p.resolve().relative_to(root)).replace(os.sep, "/")
            if p != _MEM.root
            else ""
        )
        if p.is_dir():
            if depth >= max_depth:
                return {
                    "name": p.name or "/",
                    "path": rel,
                    "type": "directory",
                    # Depth limit reached; children elided.
                    "children": ["..."],
                }
            children = [
                node(c, depth + 1)
                for c in sorted(
                    p.iterdir(), key=lambda c: (not c.is_dir(), c.name.lower())
                )
            ]
            return {
                "name": p.name or "/",
                "path": rel,
                "type": "directory",
                "children": children,
            }
        return {"name": p.name, "path": rel, "type": "file"}

    return json.dumps(node(base, 0), indent=2, ensure_ascii=False)

program.json

@@ -0,0 +1,83 @@
{
  "rlm.generate_action": {
    "traces": [],
    "train": [],
    "demos": [],
    "signature": {
      "instructions": "You are a friendly AI companion playing Minecraft with Paul.\n\nYour ONLY way to talk is by calling MCP tools (especially `sendChat`).\nUse tools like `readChat`, `mineResource`, `lookAround`, etc. when useful.\n\nThe `response` output is only a short internal note about what you did.\n\nYou are tasked with producing the following outputs given the inputs `chat`, `memory`:\n- {response}\n\nYou have access to a Python REPL environment. Write Python code and it will be executed. You will see the output, then write more code based on what you learned. This is an iterative process.\n\nAvailable:\n- Variables: `chat`, `memory` (your input data)\n- `llm_query(prompt)` - query a sub-LLM (~500K char capacity) for semantic analysis\n- `llm_query_batched(prompts)` - query multiple prompts concurrently (much faster for multiple queries)\n- `print()` - ALWAYS print to see results\n- `SUBMIT(response)` - submit final output when done\n- Standard libraries: re, json, collections, math, etc.\n\nIMPORTANT: This is ITERATIVE. Each code block you write will execute, you'll see the output, then you decide what to do next. Do NOT try to solve everything in one step.\n\n1. EXPLORE FIRST - Look at your data before processing it. Print samples, check types/lengths, understand the structure.\n2. ITERATE - Write small code snippets, observe outputs, then decide next steps. State persists between iterations.\n3. VERIFY BEFORE SUBMITTING - If results seem wrong (zeros, empty, unexpected), reconsider your approach.\n4. USE llm_query FOR SEMANTICS - String matching finds WHERE things are; llm_query understands WHAT things mean.\n5. MINIMIZE RETYPING (INPUTS & OUTPUTS) - When values are long, precise, or error-prone (IDs, numbers, code, quotes), re-access them via variables and parse/compute in code instead of retyping. Use small, targeted prints to sanity-check, but avoid manual copying when variables can carry the exact value.\n6. SUBMIT ONLY AFTER SEEING OUTPUTS - SUBMIT ends the current run immediately. If you need to inspect printed output, run it in one step, review the result, then call SUBMIT in a later step.\n\nYou have max 18 sub-LLM calls. When done, call SUBMIT() with your output.",
"fields": [
{
"prefix": "Variables Info:",
"description": "Metadata about the variables available in the REPL"
},
{
"prefix": "Repl History:",
"description": "Previous REPL code executions and their outputs"
},
{
"prefix": "Iteration:",
"description": "Current iteration number (1-indexed) out of max_iterations"
},
{
"prefix": "Reasoning:",
"description": "Think step-by-step: what do you know? What remains? Plan your next action."
},
{
"prefix": "Code:",
"description": "Python code to execute."
}
]
},
"lm": {
"model": "groq/openai/gpt-oss-120b",
"model_type": "chat",
"cache": true,
"num_retries": 3,
"finetuning_model": null,
"launch_kwargs": {},
"train_kwargs": {},
"temperature": null,
"max_tokens": null
}
},
"rlm.extract": {
"traces": [],
"train": [],
"demos": [],
"signature": {
"instructions": "The trajectory was generated with the following objective: \nYou are a friendly AI companion playing Minecraft with Paul.\n\nYour ONLY way to talk is by calling MCP tools (especially `sendChat`).\nUse tools like `readChat`, `mineResource`, `lookAround`, etc. when useful.\n\nThe `response` output is only a short internal note about what you did.\n\n\nBased on the REPL trajectory, extract the final outputs now.\n\n Review your trajectory to see what information you gathered and what values you computed, then provide the final outputs.",
"fields": [
{
"prefix": "Variables Info:",
"description": "Metadata about the variables available in the REPL"
},
{
"prefix": "Repl History:",
"description": "Your REPL interactions so far"
},
{
"prefix": "Response:",
"description": "Short internal note (not sent to chat)."
}
]
},
"lm": {
"model": "groq/openai/gpt-oss-120b",
"model_type": "chat",
"cache": true,
"num_retries": 3,
"finetuning_model": null,
"launch_kwargs": {},
"train_kwargs": {},
"temperature": null,
"max_tokens": null
}
},
"metadata": {
"dependency_versions": {
"python": "3.12",
"dspy": "3.1.2",
"cloudpickle": "3.1"
}
}
}

push.py

@@ -0,0 +1,4 @@
from agent import MinecraftFriendProgram, MinecraftFriendConfig
program = MinecraftFriendProgram(MinecraftFriendConfig())
program.push_to_hub("plockettpl/minecraft-friend-rlm", with_code=True)

pyproject.toml

@@ -0,0 +1,7 @@
[project]
name = "minecraft-friend-rlm"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.12"
dependencies = ["anyio>=4.12.1", "dspy[mcp]==3.1.2", "litellm>=1.80.0", "mcp>=1.26.0", "modaic>=0.12.7", "python-dotenv>=1.2.1", "rich>=14.3.1"]