{"data":{"kind":"file","path":"README.md","version_id":"rp40ygq4tomv009jge5p2ltk","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":3663,"modified_at":"2026-06-07T21:29:40.964000","content_hash":"ec8a5132408b94bb2961ad487a4a3b21d5cf1241a7bf01f88a354928c3fb7719"},"entries":[],"content":"# piano-omr-musicxml-lite\n\nSynthetic piano OMR environment for VLM training. It generates simple piano grand-staff images, pairs them with a MusicXML-lite JSON AST, converts that AST to canonical MusicXML, and scores model completions deterministically.\n\nThe v0 task is intentionally narrow: difficulty 1 uses C major, 4/4, one measure, quarter notes only, no rests, no chords, and two staves.\n\n## Install\n\n```bash\npython -m pip install -e \".[dev]\"\n```\n\n## Smoke Commands\n\n```bash\npytest\npython -m piano_omr_musicxml_lite.preview_dataset --seed 0 --difficulty 1 --out sample.png --print-json\npython -m piano_omr_musicxml_lite.mock_eval --n 100 --difficulty 1\npython -m piano_omr_musicxml_lite.generate_sft_dataset --difficulty 1 --train-n 1000 --val-n 100 --out data/sft_d1\npython -m piano_omr_musicxml_lite.eval_model --model Qwen/Qwen3-VL-2B-Instruct --data data/sft_d1/val --limit 20\npython -m piano_omr_musicxml_lite.train_lora_smoke --model Qwen/Qwen3-VL-2B-Instruct --data data/sft_d1 --max-steps 100 --output runs/qwen3vl2b-d1-smoke\npython -m piano_omr_musicxml_lite.eval_model --model runs/qwen3vl2b-d1-smoke --data data/sft_d1/val --limit 50\n```\n\n`eval_model` and `train_lora_smoke` run a deterministic local fallback when Transformers, PEFT, bitsandbytes, a GPU, or the requested model are unavailable. With the training extras installed, they attempt real local VLM inference/training first.\n\n## RLVR Smoke Training\n\nAfter SFT produces parseable JSON, run reward-based fine-tuning from an SFT adapter:\n\n```bash\npython -m piano_omr_musicxml_lite.train_rlvr \\\n  --model runs/qwen3vl2b-dense-rotzoom-d1-d3-5000 \\\n  --data data/sft_dense_rotzoom_d1_d3_aug_9k \\\n  --max-steps 100 \\\n  --sample-limit 256 \\\n  --output runs/qwen3vl2b-rlvr-smoke \\\n  --log-every 5 \\\n  --save-every 25\n```\n\nThis is a compact RLVR smoke loop, not full Prime RLVR yet. It samples completions, scores them with `score_completion`, computes an advantage against a moving reward baseline, and updates the LoRA adapter on sampled tokens. Metrics are written to `rlvr_metrics.jsonl`.\n\n## Output Schema\n\nModels must return JSON only. The AST has `metadata`, `score`, and `measures`. Notes use MusicXML spelling:\n\n```json\n{\n  \"kind\": \"note\",\n  \"staff\": 1,\n  \"voice\": 1,\n  \"pitch\": {\"step\": \"C\", \"alter\": 0, \"octave\": 5},\n  \"onset\": 0,\n  \"duration\": 4,\n  \"type\": \"quarter\"\n}\n```\n\nThe converter owns MusicXML mechanics such as `<backup>` between the treble and bass staves.\n\n## Reward\n\nInvalid JSON receives `0.0`. Otherwise:\n\n```text\n0.10 json_valid\n+ 0.10 schema_valid\n+ 0.15 musicxml_convertible\n+ 0.45 note_event_f1\n+ 0.10 rhythm_valid\n+ 0.10 notation_fidelity\n```\n\nCaps are applied for invalid schema and non-convertible MusicXML.\n\n## Prime Environment Hub\n\nThe package exposes:\n\n```python\nfrom piano_omr_musicxml_lite import load_environment\n```\n\nExpected Prime workflow:\n\n```bash\nuv tool install prime\nprime login\nprime env init piano-omr-musicxml-lite\nuv pip install -e .\npytest\nprime env push\n```\n\nThe environment wrapper is model-agnostic and delegates generation, rendering, prompting, and scoring to pure local modules.\n\n## Recommended Models\n\nPrimary: `Qwen/Qwen3-VL-2B-Instruct`\n\nFallback: `Qwen/Qwen2.5-VL-3B-Instruct`\n\nFor vLLM:\n\n```bash\nVLLM_WORKER_MULTIPROC_METHOD=spawn \\\nvllm serve Qwen/Qwen3-VL-2B-Instruct \\\n  --trust-remote-code \\\n  --max-model-len 8192 \\\n  --limit-mm-per-prompt image=1\n```\n\n## Known Limitations\n\nThe fallback renderer prioritizes readability over engraving quality. Difficulty 1-3 are generated; chords, accidentals beyond C major, multi-page scores, handwritten OMR, dynamics, slurs, and Prime RLVR training are out of scope for this v0 scaffold.\n","encoding":"utf-8","truncated":false,"total_bytes":3663},"status":null}