{"data":{"kind":"file","path":"README.md","version_id":"wtf7suezyegdpwlkl5o8utex","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":2067,"modified_at":"2026-06-20T21:09:33.083000","content_hash":"007876ce45332c42ee39c65d5f891b7d1afa4a6cf1024fa821f302feaea1edc1"},"entries":[],"content":"# ru-knowledge\n\nSelf-contained Russian-language multiple-choice **knowledge & reasoning** evaluation\nfor the [verifiers](https://github.com/PrimeIntellect-ai/verifiers) framework.\n\n## Overview\n- **Environment ID:** `ru-knowledge`\n- **Type:** single-turn (`SingleTurnEnv`)\n- **Task:** the model is shown a Russian question with four numbered options and must\n  reply with the number of the correct option (1–4).\n- **Reward:** `correct_choice` — `1.0` if the resolved choice equals the gold option,\n  else `0.0`.\n- **Tags:** `eval`, `russian`, `multiple-choice`, `knowledge`\n\n## Datasets\n- **Primary dataset:** embedded in the module — 24 hand-written, verified items spanning\n  general knowledge, language, and arithmetic/logic.\n- **Source:** none (self-contained). Established Russian MCQ corpora\n  (`NLPCoreTeam/mmlu_ru`, `RussianNLP/russian_super_glue`) ship as legacy dataset\n  *scripts*, which `datasets>=4` refuses to load; a compact, dependency-free set keeps\n  the eval reproducible and download-free.\n- **Splits:** 24 items, used for eval (`dataset` and `eval_dataset` are the same set).\n\n## Task\n- **Type:** single-turn.\n- **Output format:** a bare option number (`1`–`4`).\n- **Answer parsing (`extract_choice`):** robust to realistic output — takes the **last\n  boundary-anchored** digit 1–4 (so `\"не 1, а 2\"` → `2`, and multi-digit / decimal\n  tokens like `10`, `366`, `0,901` are not mis-read), and **falls back to option-text\n  matching** when the model answers with the option value instead of its number.\n- **Rubric:** a single `correct_choice` exact-match reward (weight 1.0); the framework\n  additionally reports a weight-0 `num_turns` metric.\n\n## Quickstart\n```bash\nprime env install ru-knowledge\nprime eval run ru-knowledge -n 24 -r 3 -m <model>\n```\n\n## Environment Arguments\nNone. `load_environment` accepts no environment-specific arguments.\n\n## Metrics\n| Metric | Meaning |\n| ------ | ------- |\n| `correct_choice` | `1.0` exact-match of the chosen option, else `0.0` (weight 1.0) |\n| `num_turns` | framework metric (weight 0) |\n","encoding":"utf-8","truncated":false,"total_bytes":2067},"status":null}