{"data":{"kind":"file","path":"README.md","version_id":"gns6z2i1c1y7kvqdepskyp3t","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":1452,"modified_at":"2025-08-30T20:42:05.457000","content_hash":"154171d04ecfda4a49376ac804fbae4c5bd80ff933f6320c58b9ec65dd9913ed"},"entries":[],"content":"# vf-chessboard-evaluation\n\nThis is a static chessboard evaluation environment that prompts an LLM with a chess position and asks it to identify which side is winning, White or Black.\n(In the case of an exact draw, i.e. an evaluation of 0.0, White is the correct answer.)\n\n### Overview\n- **Environment ID**: `vf_chessboard_evaluation`\n- **Short description**: Static chess position environment\n- **Tags**: chess, logic, rlvr, game\n\n### Datasets\n- **Primary dataset(s)**: Chess_position_dataset, a custom dataset made for this task, containing rated Lichess games from January/February 2015\n- **Source links**: [Apokryphosx/chess-position-eval](https://huggingface.co/datasets/Apokryphosx/chess-position-eval/)\n\n### Task\n- **Type**: `single-turn`\n- **Parser**: `custom`\n- **Rubric overview**: stockfish_reward, a reward function that compares the agent's output with the side that open-source Stockfish evaluates as winning\n\n### Quickstart\nRun an evaluation with default settings:\n\n```bash\nuv run vf-eval vf-chessboard-evaluation\n```\n\nConfigure model and sampling:\n\n```bash\nuv run vf-eval vf-chessboard-evaluation   -m gpt-4.1-mini   -n 20 -r 3 -t 1024 -T 0.7   -a '{\"key\": \"value\"}'  # env-specific args as JSON\n```\n\nNotes:\n- Use `-a` / `--env-args` to pass environment-specific configuration as a JSON object.\n\n### Installation\n\nNOTE: You must install open-source Stockfish to use this environment and provide the path to its binary.\nYou can install it with a package manager, e.g. `sudo apt install stockfish`.","encoding":"utf-8","truncated":false,"total_bytes":1452},"status":null}