{"data":{"kind":"file","path":"README.md","version_id":"bh6ss1goqyda4u0n9bceg18v","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":2128,"modified_at":"2026-05-29T19:55:44.859000","content_hash":"3821fea121c9d09518911671660b01fed2b5dff5058a8425e9602edce1d1d845"},"entries":[],"content":"# seta\n\n### Overview\n- **Environment ID**: `seta`\n- **Short description**: v1 Harbor-style environment for selected CAMEL-AI SETA terminal-agent tasks.\n- **Tags**: seta, harbor, terminal-agent, v1, eval\n\n### Datasets\n- **Primary dataset(s)**: `camel-ai/SETA-Env`\n- **Source links**: https://huggingface.co/datasets/camel-ai/SETA-Env\n- **Size**: 4567 Harbor-style task directories in the full release. This env defaults to `SETA_Synth/ask_ubuntu__0` for smoke tests.\n\n### Task\n- **Type**: terminal-agent / sandbox task\n- **Harness**: v1 `OpenCode`\n- **Rubric overview**: uploads each Harbor task's `tests/` and `solution/` assets into the sandbox and reads the Harbor verifier reward.\n\n### Quickstart\nRun an evaluation with default settings:\n\n```bash\nprime eval run seta -n 1 -r 1 -c 1 -t 512 -S '{\"enable_thinking\": false}'\n```\n\nConfigure model and sampling:\n\n```bash\nprime eval run seta \\\n  -m poolside/laguna-xs.2 \\\n  -n 1 -r 1 -c 1 -t 512 \\\n  -S '{\"enable_thinking\": false}'\n```\n\nNotes:\n- Put task-owned settings under `[env.taskset]` and harness-owned settings under `[env.harness]` in TOML configs.\n- SETA is large. The loader requires an explicit `task_names` list and downloads only those task directories from Hugging Face.\n- Override the task list from the CLI with `-a '{\"config\":{\"taskset\":{\"task_names\":[\"ask_ubuntu__1\"]}}}'`.\n\n### Taskset Config\n| Field | Type | Default | Description |\n| --- | ---- | ------- | ----------- |\n| `repo_id` | str | `\"camel-ai/SETA-Env\"` | Hugging Face dataset repo. |\n| `subset` | str | `\"SETA_Synth\"` | Top-level SETA subset to read. |\n| `task_names` | list[str] | `[\"ask_ubuntu__0\"]` | Harbor task directory names to download and run. |\n| `cache_dir` | str \\| null | `null` | Optional Hugging Face cache directory. |\n\n### Harness Config\nUses bundled v1 `OpenCodeConfig`. Override fields under `[env.harness]` in TOML.\n\n### Metrics\n| Metric | Meaning |\n| ------ | ------- |\n| `reward` | Main scalar reward from the Harbor verifier |\n| `harbor_reward` | Raw Harbor verifier reward read from `/logs/verifier/reward.*` |\n| `num_turns` | Number of agent turns completed by the harness |\n","encoding":"utf-8","truncated":false,"total_bytes":2128},"status":null}