{"data":{"kind":"file","path":"README.md","version_id":"tnraz9f99b92b21ew4ea19vt","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":2775,"modified_at":"2025-12-24T00:30:06.683000","content_hash":"8be65ae001415528064415a247f42f9c11582cb21be6b230c76de4d71f32cfcd"},"entries":[],"content":"# Queens\n\nQueens is a placement puzzle where the agent must place exactly one queen in every row, column, and colored region. Queens cannot touch each other—not horizontally, vertically, or diagonally (8-neighborhood).\n\nBased on LinkedIn's daily Queens puzzle.\n\n## Game Mechanics\n\n- **Place queens** at positions you believe are correct\n- **Mark X** on cells to eliminate impossible positions\n- **Auto-marking**: When a queen is correctly placed, all cells in the same row, column, region, and adjacent cells are automatically marked as X\n- **Strict validation**: Placing a queen incorrectly OR marking X where a queen should go ends the game immediately\n- **No removal**: Queens cannot be removed once placed\n\n## Quickstart\n\n```bash\nuv run vf-install queens\nuv run vf-eval queens\n```\n\nConfigure model and sampling:\n\n```bash\nuv run vf-eval queens \\\n  -m gpt-4.1 \\\n  -n 10 -r 3 \\\n  -a '{\"dataset_name\": \"djdumpling/spatial_reasoning\", \"dataset_file\": \"queens.json\", \"max_turns\": 50}'\n```\n\n## Environment Arguments\n\n| Arg | Type | Default | Description |\n| --- | ---- | ------- | ----------- |\n| `dataset_name` | str | `\"djdumpling/spatial_reasoning\"` | HuggingFace dataset name |\n| `dataset_file` | str | `\"queens.json\"` | Data file within the dataset |\n| `max_turns` | int | `50` | Maximum moves per episode |\n\n## Observation Format\n\n### ASCII\n```\nQUEENS 7x7\n0 1 1 1 1 1 3\n0 1 2 Q X X 3\n0 X X X X 3 3\n0 1 1 1 1 1 5\n0 0 4 4 4 1 5\n6 1 4 4 4 1 5\n6 1 1 1 1 1 5\n```\n\n- Numbers/letters = region IDs (available cells)\n- `Q` = placed queen\n- `X` = marked as invalid\n\n### JSON\n```json\n{\n  \"game\": \"queens\",\n  \"size\": {\"rows\": 7, \"cols\": 7},\n  \"regions\": [[0, 1, 1, ...], ...],\n  \"queens\": [{\"r\": 1, \"c\": 3}],\n  \"marks\": [{\"r\": 1, \"c\": 4}, {\"r\": 1, \"c\": 5}, ...],\n  \"queens_remaining\": 6\n}\n```\n\n## Action Format\n\nPlace a queen:\n```json\n{\"place\": [{\"r\": 0, \"c\": 3}]}\n```\n\nMark cells as invalid:\n```json\n{\"mark\": [{\"r\": 1, \"c\": 2}, {\"r\": 1, \"c\": 4}]}\n```\n\nCombined action:\n```json\n{\"place\": [{\"r\": 0, \"c\": 3}], \"mark\": [{\"r\": 2, \"c\": 1}]}\n```\n\n## Rules\n\n1. Exactly one queen per row\n2. Exactly one queen per column\n3. Exactly one queen per region\n4. No two queens can be adjacent (including diagonals - 8-neighborhood)\n5. **Incorrect queen placement ends the game**\n6. **Marking X where a queen belongs ends the game**\n7. **Queens cannot be removed once placed**\n\n## Metrics\n\n| Metric | Weight | Meaning |\n| ------ | ------ | ------- |\n| `reward_solved` | `1.0` | 1.0 if puzzle completely solved, else 0.0 |\n| `reward_progress` | `0.4` | Fraction of queens correctly placed (penalized if game over) |\n| `reward_no_errors` | `0.3` | 1.0 if no errors made, 0.0 if game ended due to error |\n\n## Data Source\n\nPuzzles are loaded from HuggingFace: `djdumpling/spatial_reasoning/queens.json`\n","encoding":"utf-8","truncated":false,"total_bytes":2775},"status":null}