{"data":{"kind":"file","path":"README.md","version_id":"qhekp4uoh9rk186eow6g9v2u","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":1921,"modified_at":"2026-02-06T11:15:15.849000","content_hash":"4e22ed3080c71dc6cd080e761152db3a83b945a585b164a2cdab49e245c7a190"},"entries":[],"content":"# gem_wordle\n\n<a href=\"https://github.com/PrimeIntellect-ai/verifiers/tree/main/environments/gem_wordle\">\n<img src=\"https://img.shields.io/badge/GitHub-181717?style=for-the-badge&logo=github&logoColor=white\" alt=\"Source Code\">\n</a>\n\n### Overview\n- **Environment ID**: `gem_wordle`\n- **Short description**: Multi-turn Wordle game environment powered by the GEM framework. Models must guess a 5-letter word using `\\boxed{}` format actions.\n- **Tags**: games, multi-turn, wordle, gem, regex, feedback\n\n### Datasets\n- **Primary dataset(s)**: GEM `game:Wordle-v0` (environment auto-generates episodes)\n- **Source links**: [AxonRL GEM](https://github.com/axon-rl/gem)\n- **Split sizes**: Number of episodes controlled via args (auto-generated dummy dataset)\n\n### Task\n- **Type**: multi-turn (gym environment interaction)\n- **Rubric overview**: Sum of per-step rewards returned by GEM (includes shaping + terminal success reward, plus small negative penalties for format/invalid actions; commonly `-0.1`)\n\n### Quickstart\nRun an evaluation with default settings:\n\n```bash\nprime eval run gem_wordle\n```\n\nConfigure model and sampling (recommend higher `-t` so the model reliably emits the closing `}`):\n\n```bash\nexport OPENAI_API_KEY=EMPTY\nprime eval run gem_wordle \\\n  -b http://127.0.0.1:8000/v1 -k OPENAI_API_KEY \\\n  -m Qwen/Qwen3-30B-A3B-Instruct-2507 \\\n  -n 20 -r 3 -t 1024 \\\n  -a '{\"num_train_episodes\": 1000, \"num_eval_episodes\": 20}' \\\n  -s\n```\n\n### Environment Arguments\n| Arg | Type | Default | Description |\n| --- | ---- | ------- | ----------- |\n| `num_train_episodes` | int | `1000` | Number of training episodes (auto-generated) |\n| `num_eval_episodes` | int | `20` | Number of evaluation episodes (auto-generated) |\n\n### Metrics\n| Metric | Meaning |\n| ------ | ------- |\n| `sum_step_rewards` | Sum of GEM per-step rewards (training reward) |\n| `win_rate` | 1.0 if episode ends with “Congratulations!”, else 0.0 |\n","encoding":"utf-8","truncated":false,"total_bytes":1921},"status":null}