{"data":{"kind":"file","path":"README.md","version_id":"xrmcu18kc1ajof0jfz56npwv","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":4009,"modified_at":"2026-02-02T15:27:01.941000","content_hash":"dce2e8e01b8b24b8bfb59e467d5def2a7df9c66fab659355be166906d1dd0c37"},"entries":[],"content":"# OpenMed SOAP Environment\n\nMedical dialogue to SOAP note environment for RL fine-tuning using [omi-health/medical-dialogue-to-soap-summary](https://huggingface.co/datasets/omi-health/medical-dialogue-to-soap-summary) - converting doctor-patient conversations to structured clinical documentation.\n\n## Task Description\n\nGiven a doctor-patient conversation transcript, generate a structured SOAP note that captures the clinical encounter in standard medical documentation format.\n\n## What is SOAP?\n\nSOAP is a standardized format for clinical documentation:\n\n| Section | Name | Content |\n|---------|------|---------|\n| **S** | Subjective | Patient's reported symptoms, complaints, history |\n| **O** | Objective | Measurable findings - vitals, exam, labs, imaging |\n| **A** | Assessment | Clinical diagnosis and interpretation |\n| **P** | Plan | Treatment, medications, referrals, follow-up |\n\n## Dataset\n\n- **Source**: [omi-health/medical-dialogue-to-soap-summary](https://huggingface.co/datasets/omi-health/medical-dialogue-to-soap-summary)\n- **Train**: 9,250 dialogue-SOAP pairs\n- **Validation**: 500 pairs\n- **Test**: 250 pairs\n- **Format**: Dialogue → Structured SOAP note\n\n## Reward Structure\n\n| Component | Weight | Description |\n|-----------|--------|-------------|\n| Structure | 35% | All 4 SOAP sections present (S, O, A, P) |\n| Content Quality | 30% | Each section has meaningful content |\n| Length | 20% | Similar length to reference SOAP |\n| Thinking | 15% | Clinical reasoning in `<think>` tags |\n\n### Structure Reward\n- 1.0: All 4 sections present\n- 0.75: 3 sections\n- 0.5: 2 sections\n- 0.25: 1 section\n- 0.0: No sections\n\n### Content Quality\nRewards sections with adequate word count (30+ words = 1.0, 15+ = 0.7, etc.)\n\n## Example\n\n**Input (Dialogue):**\n```\nDoctor: What brings you in today?\nPatient: I've been having chest pain for the past 3 days.\nDoctor: Can you describe the pain?\nPatient: It's a sharp pain on the left side, gets worse when I breathe deeply.\nDoctor: Any shortness of breath or fever?\nPatient: Yes, mild fever and some cough.\nDoctor: Let me examine you... Your vitals are BP 130/85, HR 88, Temp 100.4°F.\nI hear some crackles in your left lower lung. We'll need a chest X-ray.\n```\n\n**Expected Output:**\n```\n<think>\nChief complaint: chest pain for 3 days, pleuritic (worse with breathing)\nAssociated symptoms: fever, cough - suggests respiratory infection\nExam findings: low-grade fever, crackles in left lung\nLikely diagnosis: pneumonia with pleurisy vs simple pleuritis\nPlan: chest X-ray to confirm, likely antibiotics\n</think>\n\nS: 45-year-old patient presents with 3-day history of sharp left-sided chest pain, worsening with deep inspiration. Associated with mild fever and cough. No prior cardiac history.\n\nO: BP 130/85 mmHg, HR 88 bpm, Temp 100.4°F. Lung exam reveals crackles in left lower lobe. Cardiac exam unremarkable. Chest X-ray pending.\n\nA: Pleuritic chest pain with signs of lower respiratory infection. Differential includes community-acquired pneumonia with pleurisy, viral pleuritis. Low concern for cardiac etiology given pleuritic nature.\n\nP: Obtain chest X-ray. If infiltrate present, start amoxicillin-clavulanate 875mg BID x 7 days. Ibuprofen 400mg TID for pleuritic pain. Return if dyspnea worsens or fever exceeds 102°F.\n```\n\n## Usage\n\n```python\nfrom OpenMed_SOAP import load_environment\n\nenv = load_environment()\n```\n\n## Why SOAP for Medical RL?\n\n1. **Structured output**: Clear format compliance is verifiable\n2. **Clinical relevance**: Real-world medical documentation task\n3. **Multi-component**: Tests information extraction across 4 domains\n4. **Measurable quality**: Can evaluate section presence and content\n\n## Citation\n\n```bibtex\n@misc{omi_health_soap,\n  title={Medical Dialogue to SOAP Summary Dataset},\n  author={{Omi Health}},\n  year={2024},\n  howpublished={HuggingFace Datasets},\n  url={https://huggingface.co/datasets/omi-health/medical-dialogue-to-soap-summary}\n}\n```\n\n## License\n\nApache 2.0 (following omi-health dataset license)\n","encoding":"utf-8","truncated":false,"total_bytes":4009},"status":null}