{"data":{"kind":"file","path":"README.md","version_id":"q3s8fffg01clddgnsgokk9jl","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":3961,"modified_at":"2026-05-29T18:29:38.602000","content_hash":"4c975eb73f793a146d35d23e1a27ed45ddd2b992be1d7591d099d5a22f9ba6f2"},"entries":[],"content":"# zero_verifiers_env\n\nPrime `verifiers` environment package for the Zero RL dataset. Exposes\n`load_environment(family, split, harness=...)`.\n\n## Self-contained\n\nThe native grader is **vendored** into this package\n(`zero_grader.py`, `zero_runner.py`, `zero_reward.py`) so it installs and runs\nstandalone on the Prime Hub. The grader shells out to the `zero` binary, so the\nhost/sandbox must have zerolang installed (`curl -fsSL https://zerolang.ai/install.sh | bash`).\n\n## Three harnesses\n\n`load_environment(..., harness=...)`:\n\n| harness | execution | sandbox | verified |\n|---|---|---|---|\n| `deterministic` (default) | single-turn: model emits Zero source → graded by `zero check/run/test` | no | ✅ local + online |\n| `tools` | multi-turn tool use with the **Roder zero-coder tool surface** (local `zero`); final source graded | no | ✅ local + online |\n| `roder-zero` | agentic: the **Roder** agent binary edits the project in a sandbox; scored by `score.sh` → `reward.json` | yes (linux/amd64) | ⚠️ needs Prime + amd64 |\n\n### Zero-coder tools (`harness=\"tools\"`)\n\nMirrors `roder-ext-zerolang` exactly (same names/semantics, backed by local\n`zero`), so a policy trained here transfers to the Roder harness:\n\n`zerolang_skills_get`, `zerolang_check`, `zerolang_graph_dump`,\n`zerolang_graph_view`, `zerolang_fix_plan`, `zerolang_edit` (checked\nProgramGraph patch via graphHash + node/`expect`/`value` operations),\n`zerolang_graph_roundtrip`. The system prompt teaches Roder's checked edit loop\n(dump → edit with `expect` precondition → check). 
Adaptation: tool `input` is\nsource TEXT (stateless) rather than a workspace path.\n\nOnline proof (gpt-4.1-mini, zero_repair val): the model issued 4.2 tool\ncalls/rollout (graph_dump 1.9, edit 1.7) — see eval\n`zjlz90huing3mbb604pubmbs`.\n\n### Families\n`zero_native`, `zero_repair`, `zero_package_edit` (graded by the Zero toolchain),\nand `harbor_env` (delegated to Prime `HarborTaskset`; no tasks in the pilot).\n\n## The sandbox (`sandbox/`)\n\nProvisions a Prime training/eval sandbox with **both** `zero` and `roder`:\n\n- `install.sh` — installs the zerolang toolchain (zerolang.ai) **and** Roder\n  (dl.roder.sh) + base deps. Used by the Dockerfile and by the Roder harness `setup`.\n- `Dockerfile` — bakes `zero-roder-sandbox` (linux/amd64) for Prime's `docker_image`.\n  `docker build --platform=linux/amd64 -t zero-roder-sandbox sandbox`\n- `score.sh` — in-sandbox verifier: runs `zero check/test/run` after the agent\n  edits and writes `/logs/verifier/reward.json` (+ `reward.txt`), the Harbor\n  reward contract. Mirrors `zero_reward.py` shaping.\n\n`roder_zero.py` is the `CLIHarness` (adapted from graphlanger's\n`harbor_terminal_bench` roder-zero, **extended to install `zero`**). It runs:\n\n```\nroder exec --json --profile eval --mode bypass --skip-git-repo-check --task-ledger-required -\n```\n\nwith an API-intercepted config (`OPENAI_BASE_URL`). 
Roder is amd64-only.\n\n## load_environment args\n\n| arg | default | notes |\n|---|---|---|\n| `family` | `zero_native` | task family / env id |\n| `split` | `train` | `train`/`val`/`test` |\n| `max_examples` | `None` | cap for smoke runs |\n| `harness` | `deterministic` | or `tools` (zero-coder tools) or `roder-zero` |\n| `max_turns` | `6` | `tools` harness only |\n| `dataset_root` | repo `datasets/cir` | CIR location |\n| `docker_image` | `zero-roder-sandbox:latest` | roder-zero only |\n| `roder_binary_url` / `roder_config_url` | `dl.roder.sh` defaults | roder-zero only |\n| `roder_soft_timeout_sec` | `900` | roder-zero only |\n\n## Validate\n\n```sh\n# deterministic path (works anywhere with `zero`):\npython -m zero_verifiers_env.harness zero_package_edit val\n\n# Prime (deterministic):\nprime env install ./envs/zero_verifiers_env\nprime eval run prime/eval.zero.toml\n\n# Prime (roder-zero sandbox, amd64):\ndocker build --platform=linux/amd64 -t zero-roder-sandbox envs/zero_verifiers_env/sandbox\nprime eval run prime/train.roder.toml   # smoke before rl run\n```\n","encoding":"utf-8","truncated":false,"total_bytes":3961},"status":null}