{"data":{"kind":"file","path":"README.md","version_id":"sexxt5c26s42i87y7zzoagur","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":1649,"modified_at":"2026-05-26T10:29:10.742000","content_hash":"f9d43c3de7b374e11bcc3d7c0d5815854dc7becead5a15ade96ccc53ebb2be6f"},"entries":[],"content":"# frontierscience\n\nFrontierScience single-turn evaluation environment using the OpenAI `openai/frontierscience` test split and a fixed Pinference judge (`openai/gpt-5.4-mini`).\n\n## Quickstart\n\n```bash\nuv pip install -e ./environments/frontierscience\nuv run vf-eval frontierscience -n 1 -r 1 -m openai/gpt-5.4-mini\n```\n\n## Taskset Configuration\n\nPass these under `config.taskset`.\nExisting scripts may also pass `subject_filter` and `judge_model` as top-level `load_environment` or `load_taskset` arguments.\n\n```python\nimport verifiers as vf\n\nenv = vf.load_environment(\n    \"frontierscience\",\n    config={\n        \"taskset\": {\n            \"subject_filter\": \"physics\",\n            \"judge_model\": \"openai/gpt-5.4-mini\",\n        }\n    },\n)\n```\n\n```bash\nuv run vf-eval frontierscience -a '{\"config\":{\"taskset\":{\"subject_filter\":\"physics\",\"judge_model\":\"openai/gpt-5.4-mini\"}}}'\n```\n\n```toml\n[config.taskset]\nsubject_filter = \"physics\"\njudge_model = \"openai/gpt-5.4-mini\"\n```\n\n| Field | Type | Default | Description |\n| --- | --- | --- | --- |\n| `subject_filter` | `\"physics\" \\| \"chemistry\" \\| \"biology\" \\| None` | `None` | Optional subject filter. |\n| `judge_model` | `str` | `\"openai/gpt-5.4-mini\"` | Pinference model used as judge. |\n\nRequires `PRIME_API_KEY` for the Pinference judge. Team billing uses `PRIME_TEAM_ID` or the Prime CLI config.\n\n### Changelog\n\n#### v0.1.3\n- Use Verifiers client setup for the Pinference judge so Prime team billing context is included.\n- Load via the current Verifiers V1 taskset/config shape while preserving top-level `subject_filter` and `judge_model` loader/taskset arguments.\n- Require `verifiers>=0.1.15.dev11`.\n","encoding":"utf-8","truncated":false,"total_bytes":1649},"status":null}