{"data":{"kind":"file","path":"README.md","version_id":"agp90rbl6jy5m5m8g8lj0tmu","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":5941,"modified_at":"2025-08-22T15:12:31.466000","content_hash":"f5664a663ce1c61c973e91c36dd4ee60c4184e243925a2a0603c05157f6503c5"},"entries":[],"content":"# syllabify-en\r\n\r\n### Overview\r\n\r\n- **Environment ID**: `syllabify-en`\r\n- **Short description**: Proto environment for syllable breaking of English. Uses sonority sequencing.\r\n- **Tags**: syllabification, tokenization, NLP, sonority, label-free\r\n\r\n### Motivation\r\n\r\nThis may seem like a toy task, but it exploits a weakness in current tokenizer-based models: to the model, the token orthography (spelling)\r\nis opaque, and there's generally no reason to expect token boundaries to align with syllable boundaries.\r\n\r\nSo for example, if it gets \"bullseye\" as input corresponding to token 123 and needs to generate \"bulls.eye\" corresponding to tokens [456, 78, 910],\r\nit needs to somehow know:\r\n\r\n    - 123 has eight letters\r\n    - 123 has two syllables\r\n    - the first syllable of 123 is five letters long\r\n    - the second syllable of 123 is three letters long\r\n    - 456 has five letters\r\n    - 456 corresponds to the first five letters of 123\r\n    - 910 has three letters\r\n    - 910 corresponds to the last three letters of 123\r\n\r\nBut none of this information is explicitly available to the model.\r\n\r\n### Datasets\r\n\r\n- **Primary dataset(s)**: Default hardcoded samples or any HuggingFace dataset via configuration\r\n- **Source links**: Configurable via `hf_dataset_id` parameter\r\n- **Split sizes**: Depends on selected dataset (uses first available split)\r\n\r\n### Task\r\n\r\n- **Type**: single-turn\r\n- **Parser**: custom\r\n- **Rubric overview**: Single Criterion: Symmetric Damerau-Levenshtein Similarity\r\n\r\n### Quickstart\r\n\r\nRun an evaluation with default settings:\r\n\r\n```bash\r\nuv run vf-eval syllabify-en\r\n```\r\n\r\nConfigure model and sampling:\r\n\r\n```bash\r\nuv run vf-eval syllabify-en -m gpt-4.1-mini -n 4 -r 3 -t 1024 -T 0.7\r\n```\r\n\r\nUse with a HuggingFace dataset:\r\n\r\n```bash\r\nuv run vf-eval syllabify-en -a '{\"hf_dataset_id\": \"google/frames-benchmark\", \"column\": \"Prompt\", \"sep\": \".\"}'\r\n```\r\n\r\nNotes:\r\n\r\n- Use `-a` / `--env-args` to pass environment-specific configuration as a JSON object.\r\n- Reports are written under `./environments/syllabify_en/reports/` and auto-embedded below.\r\n- When using HuggingFace datasets, the environment will automatically syllabify text from the specified column using NLTK's sonority sequencing algorithm.\r\n\r\n### Environment Arguments\r\n\r\n- `hf_dataset_id` (optional): HuggingFace dataset identifier to load instead of using default samples\r\n- `column` (optional): Column name in the dataset containing text to syllabify (required if hf_dataset_id is provided)\r\n- `sep` (optional): Separator to use between syllables (default: \".\")\r\n\r\n### Metrics\r\n\r\n| Metric | Meaning |\r\n| ------ | ------- |\r\n| `indel_accuracy` | Main scalar reward, Damerau-Levenshtein Similarity |\r\n\r\n## Evaluation Reports\r\n\r\n<!-- Do not edit below this line. Content is auto-generated. -->\r\n<!-- vf:begin:reports -->\r\n<details><summary>Reports</summary>\r\n<details><summary>syllabify-en--v0.1.0--model=gpt-4.1-mini--n=4--r=3--args=noargs</summary>\r\n<p><a href=\"reports/syllabify-en--v0.1.0--model=gpt-4.1-mini--n=4--r=3--args=noargs.html\" target=\"_blank\">Open full report</a></p>\r\n<h3>syllabify-en: gpt-4.1-mini (n=4, r=3)</h3>\r\n<div class=\"meta\">\r\n<div><b>Environment</b>: syllabify-en (v0.1.0)</div>\r\n<div><b>Model</b>: <span class=\"code\">gpt-4.1-mini</span></div>\r\n<div><b>Provider</b>: https://api.openai.com/v1</div>\r\n<div><b>Samples</b>: n=4, r=3</div>\r\n<div><b>Date</b>: 2025-08-22</div>\r\n<div><b>Time</b>: 11:12:31</div>\r\n<div><b>Sampling</b>: max_tokens=1024, temperature=0.7</div>\r\n</div>\r\n\r\n<h2>Reward</h2>\r\n<table>\r\n<tr><th>mean</th><th>std</th><th>n</th><th>p5</th><th>p25</th><th>p50</th><th>p75</th><th>p95</th></tr>\r\n<tr>\r\n<td>0.9737</td>\r\n<td>0.0461</td>\r\n<td>12</td>\r\n<td>0.8868</td>\r\n<td>0.9781</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n</tr>\r\n</table>\r\n\r\n\r\n<h2>Metrics</h2>\r\n<table>\r\n<tr>\r\n<th>metric</th><th>mean</th><th>std</th><th>n</th><th>p5</th><th>p25</th><th>p50</th><th>p75</th><th>p95</th>\r\n</tr>\r\n\r\n<tr>\r\n<td>indel_accuracy</td>\r\n<td>0.9737</td>\r\n<td>0.0461</td>\r\n<td>12</td>\r\n<td>0.8868</td>\r\n<td>0.9781</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n</tr>\r\n\r\n</table>\r\n\r\n\r\n<h2>Examples <span class=\"muted\">(showing up to 12 of 12)</span></h2>\r\n<div class=\"examples\">\r\n<table>\r\n<tr><th>#</th><th>reward</th><th>indel_accuracy</th><th>completion</th></tr>\r\n\r\n<tr>\r\n<td>0</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n<td><pre>co-quet-tish, flam-boy-ant and chur-lish</pre></td>\r\n</tr>\r\n\r\n<tr>\r\n<td>1</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n<td><pre>the quick brown fox jumps o.ver the la.zy dog</pre></td>\r\n</tr>\r\n\r\n<tr>\r\n<td>2</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n<td><pre>the Rain in spain Stays main&lt;SEP&gt;ly in the plain</pre></td>\r\n</tr>\r\n\r\n<tr>\r\n<td>3</td>\r\n<td>0.8947</td>\r\n<td>0.8947</td>\r\n<td><pre>gre🪲gor sams🪲a a🪲woke one morn🪲ing from un🪲e🪲a🪲sy dreams</pre></td>\r\n</tr>\r\n\r\n<tr>\r\n<td>4</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n<td><pre>co-quet-tish, flam-boy-ant and chur-lish</pre></td>\r\n</tr>\r\n\r\n<tr>\r\n<td>5</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n<td><pre>the quick brown fox jumps o.ver the la.zy dog</pre></td>\r\n</tr>\r\n\r\n<tr>\r\n<td>6</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n<td><pre>the Rain in spain Stays main&lt;SEP&gt;ly in the plain</pre></td>\r\n</tr>\r\n\r\n<tr>\r\n<td>7</td>\r\n<td>0.9123</td>\r\n<td>0.9123</td>\r\n<td><pre>gre🪲gor samsa a🪲woke one morn🪲ing from un🪲ea🪲sy dreams</pre></td>\r\n</tr>\r\n\r\n<tr>\r\n<td>8</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n<td><pre>co-quet-tish, flam-boy-ant and chur-lish</pre></td>\r\n</tr>\r\n\r\n<tr>\r\n<td>9</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n<td><pre>the quick brown fox jumps o.ver the la.zy dog</pre></td>\r\n</tr>\r\n\r\n<tr>\r\n<td>10</td>\r\n<td>1.0</td>\r\n<td>1.0</td>\r\n<td><pre>the Rain in spain Stays main&lt;SEP&gt;ly in the plain</pre></td>\r\n</tr>\r\n\r\n<tr>\r\n<td>11</td>\r\n<td>0.8772</td>\r\n<td>0.8772</td>\r\n<td><pre>gre🪲gor samsa a🪲woke one morn🪲ing from un🪲ease🪲y dreams</pre></td>\r\n</tr>\r\n\r\n</table>\r\n</div>\r\n</details>\r\n</details>\r\n<!-- vf:end:reports -->\r\n","encoding":"utf-8","truncated":false,"total_bytes":5941},"status":null}