{"data":{"kind":"file","path":"README.md","version_id":"iwwbpmd1ug95abxd6pes2v93","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":1419,"modified_at":"2026-02-20T23:11:44.205000","content_hash":"82bce66b032ec295b18a0fb6ef9f1c8779e749f37711d4b67fa437b3ab8634da"},"entries":[],"content":"# MedQA\n\nEvaluation environment for the MedQA (USMLE) dataset.\n\n## Overview\n- **Environment ID**: `medqa`\n- **Short description**: Single-turn USMLE-style medical multiple-choice QA\n- **Tags**: medical, single-turn, multiple-choice, usmle, train, eval\n\n## Datasets\n- **Primary dataset(s)**: MedQA USMLE 4-options\n- **Source links**: [GBaker/MedQA-USMLE-4-options](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)\n- **Split sizes**: Uses provided train and test splits\n\n## Task\n- **Type**: single-turn\n- **Rubric overview**: Binary scoring (1.0 / 0.0) based on correct letter match\n\n## Quickstart\nRun an evaluation with default settings:\n\n```bash\nprime eval run medqa -m \"openai/gpt-5-mini\" -n 5 -s\n```\n\nConfigure model and sampling:\n\n```bash\nmedarc-eval medqa -m \"openai/gpt-5-mini\" -n 5 -s --answer-format boxed\n```\n\n## Authors\nThis environment has been put together by:\n\nAhmed Essouaied - ([@ahmedessouaied](https://github.com/ahmedessouaied))\n\n## Credits\nDataset:\n\n```bibtex\n@misc{jin2020diseasedoespatienthave,\n      title={What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams},\n      author={Di Jin and Eileen Pan and Nassim Oufattole and Wei-Hung Weng and Hanyi Fang and Peter Szolovits},\n      year={2020},\n      eprint={2009.13081},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL},\n      url={https://arxiv.org/abs/2009.13081},\n}\n```\n","encoding":"utf-8","truncated":false,"total_bytes":1419},"status":null}