{"data":{"kind":"file","path":"README.md","version_id":"j034gzvwktg4jh0f06qzsn6s","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":1845,"modified_at":"2026-02-06T11:15:15.432000","content_hash":"52e0f02fdd4445ecdc7abcfff2ea01ec598c52a1bb7d254db88f9d93ea502e9b"},"entries":[],"content":"# math-group\n\n<a href=\"https://github.com/PrimeIntellect-ai/verifiers/tree/main/environments/gsm8k\">\n<img src=\"https://img.shields.io/badge/GitHub-181717?style=for-the-badge&logo=github&logoColor=white\" alt=\"Source Code\">\n</a>\n\n### Overview\n- **Environment ID**: `math-group`\n- **Short description**: Groups GSM8K and MATH single-turn sub-environments into one evaluation with a shared math reasoning-and-answer format.\n- **Tags**: math, group, single-turn, gsm8k, math, think\n\n### Datasets\n- **Primary dataset(s)**: `gsm8k` (subset of 1000 train examples) and `math` (subset of 1000 train examples)\n- **Source links**: Uses example loader in `verifiers.utils.data_utils`\n- **Split sizes**: 1000 per sub-environment (train-only in this group)\n\n### Task\n- **Type**: single-turn (EnvGroup of two SingleTurnEnv instances)\n- **Rubric overview**: Exact numeric/equivalent match per sub-environment plus optional format check\n\n### Quickstart\nRun an evaluation with default settings:\n\n```bash\nprime eval run math-group\n```\n\nConfigure model and sampling:\n\n```bash\nprime eval run math-group \\\n  -m gpt-4.1-mini \\\n  -n 20 -r 3 -t 1024 -T 0.7\n```\n\nNotes:\n- This environment bundles two math datasets; reporting aggregates across both.\n- Reports are written under `./environments/math_group/reports/` and auto-embedded below.\n\n### Environment Arguments\nThis loader does not expose custom arguments.\n\n### Metrics\n| Metric | Meaning |\n| ------ | ------- |\n| `reward` | 1.0 if parsed boxed answer equals target (per sub-env), else 0.0 |\n| `format_reward` | Adherence to `<think>` + boxed `\\boxed{...}` format |\n\n## Evaluation Reports\n\n<!-- Do not edit below this line. Content is auto-generated. -->\n<!-- vf:begin:reports -->\n<p>No reports found. Run <code>prime eval run vf-math-group -a '{\"key\": \"value\"}'</code> to generate one.</p>\n<!-- vf:end:reports -->\n","encoding":"utf-8","truncated":false,"total_bytes":1845},"status":null}