{"data":{"kind":"file","path":"README.md","version_id":"m9m0y885xbrync3y5w5sy2fg","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":2816,"modified_at":"2026-02-01T11:30:50.494000","content_hash":"3a4f2d7f903d5b550ad462fd81156a7ad582e62d90cf0878dd989b993a9d9115"},"entries":[],"content":"# OpenMed BioRED Environment\n\nBiomedical Relation Extraction environment for RL fine-tuning using the BioRED corpus from NCBI.\n\n## Task Description\n\nGiven a biomedical abstract and a pair of entities (genes, diseases, chemicals, variants), classify the type of relation between them:\n\n| Class | Label | Description |\n|-------|-------|-------------|\n| A | Positive_Correlation | Entities are positively correlated |\n| B | Negative_Correlation | Entities are negatively correlated |\n| C | Association | General association without specific direction |\n| D | Bind | Physical binding interaction (chemical-protein) |\n| E | Drug_Interaction | Pharmacological interaction between drugs |\n| F | Cotreatment | Co-administration or combined treatment |\n| G | Comparison | Comparative relationship |\n| H | Conversion | Metabolic or chemical conversion |\n| I | No_Relation | No meaningful biological relation |\n\n## Dataset\n\n- **Source**: [YufeiHFUT/bioRED](https://huggingface.co/datasets/YufeiHFUT/bioRED)\n- **Original**: [NCBI BioRED](https://ftp.ncbi.nlm.nih.gov/pub/lu/BioRED/)\n- **Train**: 3,830 examples\n- **Validation**: 1,110 examples\n- **Test**: 990 examples\n- **Entity Types**: GeneOrGeneProduct, DiseaseOrPhenotypicFeature, ChemicalEntity, SequenceVariant\n\n## Reward Structure\n\n| Component | Weight | Description |\n|-----------|--------|-------------|\n| Accuracy | 75% | Exact match on relation type (A-I) |\n| Thinking | 20% | Quality of biomedical reasoning in `<think>` tags |\n| Format | 5% | Proper `\\boxed{}` answer format |\n\n## Example\n\n**Input:**\n```\nTitle: PCSK9 mutations and cardiovascular disease\n\nAbstract:\nWe identified a novel mutation D374Y in the PCSK9 gene in patients with\nautosomal dominant hypercholesterolemia. Functional studies showed that\nthis mutation leads to gain-of-function, resulting in increased LDL-C levels...\n\nEntity 1: PCSK9 (GeneOrGeneProduct)\nEntity 2: hypercholesterolemia (DiseaseOrPhenotypicFeature)\n```\n\n**Expected Output:**\n```\n<think>\nThe abstract discusses a mutation in the PCSK9 gene (Entity 1) found in patients\nwith autosomal dominant hypercholesterolemia (Entity 2). It states that the D374Y\nmutation causes a \"gain-of-function\" leading to \"increased LDL-C levels\". This\nindicates a positive correlation - the mutated gene causes/increases disease risk.\n</think>\n\\boxed{A}\n```\n\n## Usage\n\n```python\nfrom OpenMed_BioRED import load_environment\n\nenv = load_environment()\n```\n\n## Citation\n\n```bibtex\n@article{luo2022biored,\n  title={BioRED: A Rich Biomedical Relation Extraction Dataset},\n  author={Luo, Ling and Lai, Po-Ting and Wei, Chih-Hsuan and Arighi, Cecilia N and Lu, Zhiyong},\n  journal={Briefings in Bioinformatics},\n  volume={23},\n  number={5},\n  pages={bbac282},\n  year={2022},\n  publisher={Oxford University Press}\n}\n```\n\n## License\n\nPublic Domain (NCBI/NIH)\n","encoding":"utf-8","truncated":false,"total_bytes":2816},"status":null}