{"data":{"kind":"file","path":"README.md","version_id":"uwvrlgq38yycpk171oaoqwjh","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":3715,"modified_at":"2025-09-21T04:49:06.894000","content_hash":"9f56ca124b193256dd15320f4144f0e10e3baf9c012d278f5d7a9e8fd5608687"},"entries":[],"content":"# browsergym\n\n### Overview\n- **Environment ID**: `browser-miniwob`\n- **Short description**: Browser automation environment using BrowserGym framework with MiniWoB++ dataset for web interaction tasks. The environment provides tools for browser automation and task completion using various action sets.\n- **Tags**: `browser-automation`, `web-interaction`, `miniwob`, `llm-evaluation`\n- **Notes**: Requires MiniWoB++ dataset to be set up. Set the `MINIWOB_URL` environment variable to point to the MiniWoB HTML files.\n\n### Setup\n1. Install dependencies:\n```bash\nuv pip install -e .\nplaywright install chromium\n```\n\n2. Download and set up MiniWoB++:\n```bash\ngit clone https://github.com/Farama-Foundation/miniwob-plusplus.git\ngit -C \"./miniwob-plusplus\" reset --hard 7fd85d71a4b60325c6585396ec4f48377d049838\nexport MINIWOB_URL=\"file://<PATH_TO_MINIWOB_PLUSPLUS_CLONED_REPO>/miniwob/html/miniwob/\"\n```\n\n### Datasets\n- **Primary dataset(s)**: [MiniWoB++](https://miniwob.farama.org)\n- **Source links**: [MiniWoB++ GitHub](https://github.com/Farama-Foundation/miniwob-plusplus)\n\n### Task\n- **Type**: tool-use\n- **Parser**: `vf.ThinkParser`\n- **Rubric overview**: Evaluates the agent's ability to complete web-based tasks using browser automation. Success is determined by task completion and adherence to task requirements.\n\n### Quickstart\nRun an evaluation with default settings:\n\n```bash\nuv run vf-eval browser-miniwob -m \"x-ai/grok-4-fast:free\" -b \"https://openrouter.ai/api/v1\" -k \"OPENROUTER_API_KEY\" -a '{\"miniwob_url\" : \"file://<PATH_TO_MINIWOB_PLUSPLUS_CLONED_REPO>/miniwob/html/miniwob/\"}'\n```\n\nNotes:\n- Use `-a` / `--env-args` to pass environment-specific configuration as a JSON object.\n\n### Environment Arguments\nDocument any supported environment arguments and their meaning:\n\n| Arg                 | Type                         | Default                          | Description                                    |\n| ------------------- | ---------------------------- | -------------------------------- | ---------------------------------------------- |\n| `miniwob_url`       | str                          | `None`                           | URL for MiniWoB tasks                          |\n| `use_html`          | bool                         | `false`                          | Whether to include HTML in observations        |\n| `use_axtree`        | bool                         | `true`                           | Whether to include accessibility tree in observations |\n| `use_screenshot`    | bool                         | `false`                          | Whether to include screenshots in observations |\n| `max_turns`         | int                          | `10`                             | Maximum number of turns                        |\n\n### Metrics\nSummarize key metrics your rubric emits and how they're interpreted:\n\n| Metric         | Meaning                                                                              |\n| -------------- | ------------------------------------------------------------------------------------ |\n| `reward`       | Main scalar reward (based on task completion success)                               |\n| `judge_score`  | 1 if the task is completed successfully, 0 otherwise                                |\n\n### Agent Implementation\nThe environment uses a BrowserGym agent that can interact with web pages through:\n- High-level actions (click, type, scroll, etc.)\n- Accessibility tree navigation\n- DOM manipulation\n- Screenshot analysis\n\n### Credits\nImplementation based on BrowserGym framework and MiniWoB++ dataset.\nOriginal BrowserGym: https://github.com/ServiceNow/BrowserGym\nOriginal MiniWoB++: https://github.com/Farama-Foundation/miniwob-plusplus\n","encoding":"utf-8","truncated":false,"total_bytes":3715},"status":null}