{"data":{"kind":"file","path":"README.md","version_id":"p5qqzk87k1iym8ny4vvqkal9","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":3645,"modified_at":"2026-04-04T00:17:41.897000","content_hash":"b1c3409aeb21980bedd2ecf88793569f394c90cd57a3a3bd4c4725630f80ce34"},"entries":[],"content":"# slitherlink-env\n\n### Overview\n- **Environment ID**: `slitherlink-env`\n- **Short description**: Multi-turn Slitherlink environment with exact rule-based verification\n- **Tags**: slitherlink, puzzle, reasoning, constraints, grid, multi-turn, train, eval\n\n### Datasets\n- **Primary dataset(s)**: Small fixed 5x5 Slitherlink boards stored as package data\n- **Source links**: Local JSON fixtures in `slitherlink_env/data`\n- **Split sizes**: 15 train examples, 10 eval examples, 25 total boards\n- **Split policy**: train and eval clue grids are intentionally disjoint\n\n### Task\n- **Type**: multi-turn\n- **Output format**: exactly one strict `submit_move` tool call with `orientation`, `row`, `col`, and `action`\n- **Rubric overview**: sparse solve reward with penalties for invalid/no-op moves and a tiny turn cost\n\n### Quickstart\nInstall from the Prime Environments Hub:\n\n```bash\nprime env install cssavi/slitherlink-env\n```\n\nRun an evaluation with default settings:\n\n```bash\nprime eval run cssavi/slitherlink-env\n```\n\nChoose a provider and model explicitly:\n\n```bash\nprime eval run cssavi/slitherlink-env --provider prime -m gpt-5-nano\nprime eval run cssavi/slitherlink-env --provider openai -m gpt-4.1-mini\n```\n\nNotes:\n- The environment itself is provider-agnostic; it works with any model/provider supported by `prime eval run`.\n- Use `-a` / `--env-args` to pass environment-specific configuration as a JSON object.\n- Eval logs now include the board id, difficulty, turn number, and last move result to make stuck rollouts easier to diagnose.\n\n### Required Environment Variables\n- **For the environment package itself**: none\n- **For evaluation**: credentials depend on your chosen provider. `prime eval run` supports Prime Inference plus multiple OpenAI-compatible and non-OpenAI providers, including `openai`, `anthropic`, `openrouter`, `local`, and `vllm`.\n\n### Local Development\nFrom a source checkout, run the validation and test suite:\n\n```bash\nuv run python scripts/audit_board_data.py\nuv run --extra dev python -m pytest\nuv build\n```\n\nRepository-only board-maintenance helpers:\n\n```bash\nuv run python scripts/add_board.py --split eval --record-file /path/to/board.json\n```\n\n### Environment Arguments\n\n| Arg | Type | Default | Description |\n| --- | ---- | ------- | ----------- |\n| `max_train_examples` | int | `-1` | Limit the train split size (`-1` uses all shipped rows) |\n| `max_eval_examples` | int | `-1` | Limit the eval split size (`-1` uses all shipped rows) |\n| `max_turns` | int | `100` | Max attempted moves per puzzle |\n\n### Metrics\n\n| Metric | Meaning |\n| ------ | ------- |\n| `solved_score` | `1.0` when the puzzle is solved |\n| `invalid_move_count` | Count of malformed, out-of-bounds, or rule-breaking moves |\n| `no_op_count` | Count of moves that leave an edge unchanged |\n| `turn_count_metric` | Number of attempted moves |\n| `oscillation_count` | Count of accepted changes to an already-decided edge |\n\n### Notes\n- The shipped JSON boards are exact-verification fixtures bundled with the package.\n- Runtime only exposes visible clues; hidden solutions stay in `info` for tests and debugging.\n- The system prompt loads its packaged heuristic guide from `slitherlink_env/heuristics/slitherlink_techniques.md` so you can iterate on solver strategy without hard-coding every pattern into Python.\n- The add-board script, audit script, and difficulty-rubric design doc are repository-only development tools; they are intentionally not part of the installed hub package.\n- The environment exposes one strict tool, `submit_move`, so eval-time models are constrained to the canonical move schema instead of free-text move tags.\n","encoding":"utf-8","truncated":false,"total_bytes":3645},"status":null}