{"data":{"kind":"file","path":"README.md","version_id":"hkeyjfe35sfr18654cornlii","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":1903,"modified_at":"2025-09-07T14:44:19.343000","content_hash":"ee052a82b58c15bfa7f523ed8828c39331efb00e466197ae484037d4f077d473"},"entries":[],"content":"# Clevr-CoGenT-TrainA-R1\n\n### Overview\n- **Environment ID**: `Clevr-CoGenT-TrainA-R1`\n- **Short description**: Environment built on the R1-V team's training dataset of the same name. See their work: [R1-V](https://github.com/StarsfieldAI/R1-V)\n- **Tags**: R1-V, train, reasoning, vision\n\n### Datasets\n- **Primary dataset(s)**: MMInstruction/Clevr_CoGenT_TrainA_R1, for vision-related reasoning.\n- **Source links**: [MMInstruction/Clevr_CoGenT_TrainA_R1](https://huggingface.co/datasets/MMInstruction/Clevr_CoGenT_TrainA_R1)\n```\n@misc{chen2025r1v,\n  author       = {Chen, Liang and Li, Lei and Zhao, Haozhe and Song, Yifan and Vinci},\n  title        = {R1-V: Reinforcing Super Generalization Ability in Vision-Language Models with Less Than \\$3},\n  howpublished = {\\url{https://github.com/Deep-Agent/R1-V}},\n  note         = {Accessed: 2025-02-02},\n  year         = {2025}\n}\n```\n- **Split sizes**: 37.8k\n\n### Task\n- **Type**: Custom\n- **Parser**: ThinkParser\n- **Rubric overview**: accuracy, think format\n\n### Quickstart\nRun an evaluation with default settings:\n\n```bash\nuv run vf-eval Clevr-CoGenT-TrainA-R1\n```\n\nConfigure model and sampling:\n\n```bash\nuv run vf-eval Clevr-CoGenT-TrainA-R1 -m gpt-4.1-mini -n 20 -r 3 -t 1024 -T 0.7 -a '{\"num_examples\": 10}'\n```\n\nNotes:\n- Use `-a` / `--env-args` to pass environment-specific configuration as a JSON object.\n\n### Environment Arguments\nSupported environment arguments:\n\n| Arg | Type | Default | Description |\n| --- | ---- | ------- | ----------- |\n| `num_examples` | int | `None` | Limit on dataset size (use -1 for all) |\n\n### Metrics\nKey metrics emitted by the rubric:\n\n| Metric | Meaning |\n| ------ | ------- |\n| `reward` | Main scalar reward (weighted sum of criteria) |\n| `accuracy` | Exact match on target answer |\n| `think format` | Adherence to thinking format |\n","encoding":"utf-8","truncated":false,"total_bytes":1903},"status":null}