{"data":{"kind":"file","path":"README.md","version_id":"o9ogkwm2pi60n8qs9sdc2i55","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":1798,"modified_at":"2026-04-29T18:37:05.967000","content_hash":"fb623eacef9a8cf7b152c83e2175ba3e983d03517aae597610ad3374cf582c75"},"entries":[],"content":"# gsm8k-last-number\n\nGSM8K single-turn math environment with a permissive numeric verifier.\n\nThis environment is derived from `will/gsm8k`, but the reward does not require\nboxed answers. It extracts the last numeric expression in the model completion\nand compares it numerically to the dataset answer.\n\nThe default instruction is prepended to the user question instead of being sent\nas a system message:\n\n```text\nReason through the problem step by step. Write plainly. Do not use special mathematical notation. Put the final answer as an ordinary number in the last sentence.\n```\n\nThis avoids LaTeX, boxed-answer instructions, system-role dependence, and other\nnotation that can be unhelpful for older-style language models.\n\nAccepted completion styles include:\n\n```text\nHe must jog at the rate of 5.2 miles per hour.\nThe total reduction was 15%.\nSo the answer is $44.\n```\n\nThe verifier supports integers, decimals, commas, optional currency symbols,\noptional percent signs, trailing punctuation, and simple fractions.\n\n## Usage\n\n```python\nfrom verifiers import load_environment\n\nenv = load_environment(\"gsm8k-last-number\")\n```\n\nWith the Talkie GRPO smoke script:\n\n```bash\npython examples/grpo_lewtun_hf_smoke.py \\\n  --vf-env your-owner/gsm8k-last-number \\\n  --vf-dataset-size 64\n```\n\n## Environment Arguments\n\n| Arg | Type | Default | Description |\n| --- | --- | --- | --- |\n| `num_train_examples` | int | `-1` | Limit training set size (`-1` for all) |\n| `num_eval_examples` | int | `-1` | Limit eval set size (`-1` for all) |\n| `system_prompt` | str or null | `null` | Optional system prompt; by default the instruction is in the user message |\n| `rel_tol` | float | `1e-4` | Relative tolerance for numeric comparison |\n| `abs_tol` | float | `1e-4` | Absolute tolerance for numeric comparison |\n","encoding":"utf-8","truncated":false,"total_bytes":1798},"status":null}