{"data":{"kind":"file","path":"README.md","version_id":"fbzfnn1542clqc97hcn51u7z","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":2360,"modified_at":"2026-03-17T19:49:56.261000","content_hash":"bf1ca817eea8fa0bcadbe6687b87b95ef52df25ec8d26943bcca063d58910ce6"},"entries":[],"content":"# aim-labs-env\n\nMultimodal aim training environment for vision-language model evaluation and RL training.\n\n## Overview\n\n| | |\n|---|---|\n| **Environment ID** | `aim-labs-env` |\n| **Tags** | `multi-turn`, `multimodal`, `vision`, `tool-use`, `train`, `eval` |\n| **Screen Resolution** | 1024x768 pixels |\n| **Default Turns** | 20 per game |\n\n## Task\n\nEach rollout is a game where the agent must click on red target circles:\n\n1. Agent sees a 1024x768 image with a red target at a random position\n2. Agent calls `click(x=..., y=...)` to click on the target\n3. Hit = click within target radius, Miss = click outside\n4. Reward = hits / attempts (accuracy from 0.0 to 1.0)\n\n## Tool\n\n```python\nclick(x: int, y: int)\n```\n\n| Parameter | Type | Description |\n|-----------|------|-------------|\n| `x` | int | Horizontal position (0 = left edge, 1024 = right edge) |\n| `y` | int | Vertical position (0 = top edge, 768 = bottom edge) |\n\n## Quickstart\n\n```bash\n# Basic evaluation\nprime eval run aim-labs-env -m qwen/qwen3-vl-235b-a22b-instruct\n\n# With options\nprime eval run aim-labs-env \\\n  -m openai/gpt-4o \\\n  -n 10 -r 3 \\\n  -a '{\"difficulty\": \"easy\", \"max_turns\": 10}'\n\n# Demo mode (saves click visualizations)\nprime eval run aim-labs-env \\\n  -m qwen/qwen3-vl-235b-a22b-instruct \\\n  -n 1 -r 1 \\\n  -a '{\"demo\": true, \"max_turns\": 5}'\n```\n\n## Environment Arguments\n\n| Argument | Type | Default | Description |\n|----------|------|---------|-------------|\n| `difficulty` | str | `\"medium\"` | Target size preset |\n| `max_turns` | int | `20` | Targets per game |\n| `num_examples` | int | `100` | Number of game sessions |\n| `seed` | int | `None` | Random seed |\n| `demo` | bool | `False` | Save click visualization images |\n| `demo_output` | str | `\"./demo_output\"` | Directory for demo images |\n\n## Difficulty Levels\n\n| Difficulty | Target Radius | Use Case |\n|------------|---------------|----------|\n| `easy` | 100px | Basic multimodal capability testing |\n| `medium` | 60px | Standard difficulty |\n| `hard` | 35px | Precise coordinate estimation |\n\n## Metrics\n\n| Metric | Description |\n|--------|-------------|\n| `reward` | Accuracy (hits / attempts) |\n| `total_hits` | Successful clicks |\n| `total_attempts` | Total clicks made |\n| `average_distance` | Mean distance from click to target center (px) |\n\n## Example Interaction\n\n```\nUser: Turn 1/20 - Click the target: [image]","encoding":"utf-8","truncated":false,"total_bytes":2360},"status":null}