{"data":{"kind":"file","path":"README.md","version_id":"nsvlxp08wugwf04atif6t2wf","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":2928,"modified_at":"2026-02-01T15:07:00.523000","content_hash":"6695eeb420a70470416a3ef621f22537dc2ce22b01bdb54def75993bdd4e3c1b"},"entries":[],"content":"# OpenMed MedQA USMLE Environment\n\nUSMLE-style medical question answering environment for RL fine-tuning using the MedQA dataset - professional-level medical licensing exam questions.\n\n## Task Description\n\nGiven a medical question with 4 answer choices from the United States Medical Licensing Examination (USMLE), select the correct answer. These questions require professional-level medical knowledge.\n\n## Exam Coverage\n\n| Step | Focus | Topics |\n|------|-------|--------|\n| Step 1 | Basic Sciences | Anatomy, Biochemistry, Pathology, Pharmacology, Physiology, Microbiology |\n| Steps 2&3 | Clinical Knowledge | Diagnosis, Patient Management, Clinical Decision Making |\n\n## Dataset\n\n- **Source**: [GBaker/MedQA-USMLE-4-options](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)\n- **Original**: [MedQA (Jin et al., 2020)](https://github.com/jind11/MedQA)\n- **Train**: 10,178 questions\n- **Test**: 1,273 questions\n- **Format**: 4-option multiple choice (A, B, C, D)\n\n## Reward Structure\n\n| Component | Weight | Description |\n|-----------|--------|-------------|\n| Accuracy | 80% | Exact match on correct answer (A-D) |\n| Thinking | 15% | Quality of medical reasoning in `<think>` tags |\n| Format | 5% | Proper `\\boxed{}` answer format |\n\n## Example\n\n**Input:**\n```\n[USMLE Step 1 (Basic Sciences)]\n\nQuestion:\nA 45-year-old man presents to the emergency department with severe substernal\nchest pain radiating to his left arm. ECG shows ST-segment elevation in leads\nV1-V4. Which of the following is the most likely diagnosis?\n\nOptions:\nA. Acute anterior wall myocardial infarction\nB. Pericarditis\nC. Pulmonary embolism\nD. Aortic dissection\n```\n\n**Expected Output:**\n```\n<think>\nThis patient presents with classic signs of acute coronary syndrome:\n- Substernal chest pain radiating to the left arm\n- ST-segment elevation in V1-V4 (anterior leads)\n\nLet me analyze each option:\nA. Acute anterior wall MI - ST elevation in V1-V4 indicates anterior wall\n   involvement (LAD territory). This matches the presentation.\nB. Pericarditis - Would show diffuse ST elevation, not localized to anterior leads\nC. Pulmonary embolism - Typically shows right heart strain pattern, not ST elevation\nD. Aortic dissection - Would present with tearing pain, unequal blood pressures\n\nThe localized ST elevation in anterior leads V1-V4 is diagnostic for anterior STEMI.\n</think>\n\\boxed{A}\n```\n\n## Usage\n\n```python\nfrom OpenMed_MedQA_USMLE import load_environment\n\nenv = load_environment()\n```\n\n## Citation\n\n```bibtex\n@article{jin2020disease,\n  title={What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams},\n  author={Jin, Di and Pan, Eileen and Oufattole, Nassim and Weng, Wei-Hung and Fang, Hanyi and Szolovits, Peter},\n  journal={Applied Sciences},\n  volume={11},\n  number={14},\n  pages={6421},\n  year={2021},\n  publisher={MDPI}\n}\n```\n\n## License\n\nMIT License (following MedQA dataset)\n","encoding":"utf-8","truncated":false,"total_bytes":2928},"status":null}