{"data":{"kind":"file","path":"README.md","version_id":"qzu05u8ni6ubzx9m1eea3mw3","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":1610,"modified_at":"2026-01-23T03:12:40.812000","content_hash":"ce11489b98f50c1144360eff010a0e66e7d25667b6ddd5c87ce792d0de783fcd"},"entries":[],"content":"# tool-test\n\n<a href=\"https://github.com/PrimeIntellect-ai/verifiers/tree/main/environments/tool_test\">\n<img src=\"https://img.shields.io/badge/GitHub-181717?style=for-the-badge&logo=github&logoColor=white\" alt=\"Source Code\">\n</a>\n\n### Overview\n- **Environment ID**: `tool-test`\n- **Short description**: Sanity-check tool-calling environment that asks models to invoke a random subset of dummy tools.\n- **Tags**: tools, single-turn, function-calling, sanity\n\n### Datasets\n- **Primary dataset(s)**: Synthetic HF dataset generated in-memory with prompts specifying required tools\n- **Source links**: N/A (programmatically generated)\n- **Split sizes**: Controlled by `num_train_examples` and `num_eval_examples`\n\n### Task\n- **Type**: tool use (single-turn ToolEnv)\n- **Parser**: default `Parser`\n- **Rubric overview**: ToolRubric checks tool execution and adds exact match on the required tool set\n\n### Quickstart\nRun an evaluation with default settings:\n\n```bash\nprime eval run tool-test\n```\n\nConfigure model and sampling:\n\n```bash\nprime eval run tool-test \\\n  -m gpt-4.1-mini \\\n  -n 20 -r 3 -t 1024 -T 0.7 \\\n  -a '{\"num_train_examples\": 1000, \"num_eval_examples\": 100}'\n```\n\n\n### Environment Arguments\n| Arg | Type | Default | Description |\n| --- | ---- | ------- | ----------- |\n| `num_train_examples` | int | `1000` | Number of training examples |\n| `num_eval_examples` | int | `100` | Number of evaluation examples |\n\n### Metrics\n| Metric | Meaning |\n| ------ | ------- |\n| `reward` | 1.0 if called tool set equals required set, else 0.0 |\n| ToolRubric metrics | Tool execution success and format adherence |\n","encoding":"utf-8","truncated":false,"total_bytes":1610},"status":null}