{"data":{"kind":"file","path":"README.md","version_id":"wfga9u3ekr4uiu71nlhj8off","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":3725,"modified_at":"2026-02-05T21:24:08.475000","content_hash":"cfbba6596f2501e4ed03d34cf1d12561a555d4d0e32a64bee69a3051a5adfb09"},"entries":[],"content":"# OpenMed PubMedQA Environment\n\nBiomedical research question answering environment using [PubMedQA](https://huggingface.co/datasets/qiaojin/PubMedQA) - requiring yes/no/maybe answers based on scientific evidence from PubMed abstracts.\n\n## Task Description\n\nGiven a research question and context from a PubMed abstract, determine whether the answer is **yes**, **no**, or **maybe**. This tests scientific reasoning and evidence-based decision making.\n\n## Dataset\n\n- **Source**: [qiaojin/PubMedQA](https://huggingface.co/datasets/qiaojin/PubMedQA)\n- **Training**: 211K questions (pqa_artificial - machine-generated)\n- **Evaluation**: 1K questions (pqa_labeled - expert-validated)\n- **Classes**: 3 (yes, no, maybe)\n- **Domain**: Biomedical research literature\n\n## Answer Distribution\n\n| Answer | Description |\n|--------|-------------|\n| **Yes** | Evidence supports affirmative answer |\n| **No** | Evidence supports negative answer |\n| **Maybe** | Evidence is inconclusive or insufficient |\n\n## Reward Structure\n\n| Component | Weight | Description |\n|-----------|--------|-------------|\n| Accuracy | 50% | Exact match on yes/no/maybe |\n| Partial Match | 10% | Credit for valid answer (even if wrong) |\n| Format | 15% | Proper `\\boxed{}` and `<think>` usage |\n| Thinking | 15% | Quality of scientific reasoning |\n| Evidence Analysis | 10% | Mentions evidence patterns (p-values, results, etc.) |\n\n### Why These Rewards Work\n\n- **Accuracy (50%)**: Verifiable 3-class classification\n- **Partial Match (10%)**: Encourages learning the answer space\n- **Format (15%)**: Structured output for reliable parsing\n- **Thinking (15%)**: Rewards scientific reasoning process\n- **Evidence Analysis (10%)**: Encourages citing statistical/research evidence\n\n## Example\n\n**Input:**\n```\nResearch Question:\nAre group 2 innate lymphoid cells (ILC2s) increased in chronic rhinosinusitis\nwith nasal polyps or eosinophilia?\n\nAbstract Context:\nChronic rhinosinusitis (CRS) is a common heterogeneous disease characterized\nby inflammation of the sinonasal mucosa. ILC2s were markedly increased in\nCRSwNP patients compared to controls (p<0.001). Patients with eosinophilia\nshowed 3-fold higher ILC2 levels. These cells correlated with disease severity\nscores (r=0.72).\n\nBased on the evidence in the abstract, is the answer to this research question\nyes, no, or maybe?\n```\n\n**Expected Output:**\n```\n<think>\nThe abstract presents clear evidence about ILC2 levels in CRS patients.\n\nKey findings:\n- ILC2s were significantly increased in CRSwNP patients (p<0.001)\n- Eosinophilic patients showed 3-fold higher ILC2 levels\n- Strong correlation with disease severity (r=0.72)\n\nThe statistical significance (p<0.001) and effect size (3-fold increase)\nprovide strong evidence supporting an affirmative answer. The correlation\nwith severity further strengthens this conclusion.\n</think>\n\\boxed{yes}\n```\n\n## Usage\n\n```python\nfrom OpenMed_PubMedQA import load_environment\n\nenv = load_environment()\n```\n\n## Why PubMedQA for Medical RL?\n\n1. **Scientific reasoning**: Tests understanding of research evidence\n2. **3-class classification**: Handles uncertainty (maybe) unlike binary QA\n3. **Large scale**: 211K training examples for robust learning\n4. **Expert validation**: 1K expert-labeled examples for reliable evaluation\n5. **Real-world relevance**: Based on actual PubMed research questions\n\n## Citation\n\n```bibtex\n@inproceedings{jin2019pubmedqa,\n  title={PubMedQA: A Dataset for Biomedical Research Question Answering},\n  author={Jin, Qiao and Dhingra, Bhuwan and Liu, Zhengping and Cohen, William and Lu, Xinghua},\n  booktitle={Proceedings of EMNLP-IJCNLP 2019},\n  pages={2567--2577},\n  year={2019}\n}\n```\n\n## License\n\nMIT License (following qiaojin/PubMedQA terms)\n","encoding":"utf-8","truncated":false,"total_bytes":3725},"status":null}