{"data":{"kind":"file","path":"README.md","version_id":"supuskjyms2ahpwdtj5ci70p","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":2637,"modified_at":"2025-08-24T14:31:45.733000","content_hash":"35ed8b482c44516688e36a28afa84218a8e907122eb8bacf200f12c8869fadb6"},"entries":[],"content":"# OpenMed_MedQA\n\n### Overview\n- **Environment ID**: `OpenMed_MedQA`\n- **Short description**: Single-turn medical multiple-choice QA (USMLE-style, 4 options) on MedQA with chain-of-thought and a final decision as a boxed letter `\\boxed{A|B|C|D}`.\n- **Tags**: openmed, medqa, usmle, multiple-choice, single-turn, think, boxed-letter, train, eval\n\n### Dataset\n- **Source**: `GBaker/MedQA-USMLE-4-options`\n- **Splits**: Uses `test` or `validation` if present; otherwise creates a train/eval holdout from `train` via `train_test_split` (seed=42).\n- **Fields used**:\n  - `question`: the stem.\n  - `options`: list of four strings (fallbacks: `choices`/`choice`, or A–D keys).\n  - `answer_idx` / `answer` / `label`: correct option (commonly a letter in `{A,B,C,D}`), standardized to letter and stored as `answer`.\n\n### Prompting & Schema\n- **System message**: Instructs to reason inside `<think>...</think>` and put the final choice letter in `\\boxed{...}` using exactly one token from `{A,B,C,D}`.\n- **User message**: Contains the `Question:` and an `Options` list formatted as `A) ...` to `D) ...`.\n- **Example schema per example**:\n  - `prompt`: list of messages `[{\"role\":\"system\",...}, {\"role\":\"user\",...}]`\n  - `options`: list of 4 option strings\n  - `answer_letter`: one of `A|B|C|D`\n  - `answer_idx`: integer index (0–3), derived from the letter\n  - `answer`: letter (e.g., `\"C\"`)\n\n### Parser & Rewards\n- **Parser**: `ThinkParser` with `extract_boxed_answer` to read the final letter from `\\boxed{...}`.\n- **Rewards**:\n  - `correct_index_reward_func` (weight 1.0): 1.0 if parsed letter equals `answer_letter` (numeric 0–3 is also accepted and mapped), else 0.0.\n  - `parser.get_format_reward_func()` (weight 0.0): optional format adherence (not counted).\n\n### Environment Arguments\n| Arg | Type | Default | Description |\n| --- | ---- | ------- | ----------- |\n| `num_train_examples` | int | `-1` | Limit training set size (`-1` for all) |\n| `num_eval_examples` | int | `-1` | Limit eval set size (`-1` for all) |\n\n### Quickstart\n\nEvaluate with defaults (uses the env’s internal dataset handling):\n\n```bash\nuv run vf-eval OpenMed_MedQA \\\n  -a '{\"num_train_examples\":-1, \"num_eval_examples\":-1}'\n```\n\nNotes:\n- Use `-a` / `--env-args` to pass environment-specific configuration as a JSON object.\n- Reports (if produced) will be placed under `./environments/OpenMed_MedQA/reports/`.\n\n## Evaluation Reports\n\n<!-- Do not edit below this line. Content is auto-generated. -->\n<!-- vf:begin:reports -->\n<p>No reports found. Run <code>uv run vf-eval OpenMed_MedQA -a '{\"key\": \"value\"}'</code> to generate one.</p>\n<!-- vf:end:reports -->\n\n","encoding":"utf-8","truncated":false,"total_bytes":2637},"status":null}