{"data":{"kind":"file","path":"README.md","version_id":"t7ps9ktl0ow7nnpckho6pnnn","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":2237,"modified_at":"2026-04-04T18:34:27.560000","content_hash":"6ca957d18b2d27c99837ec3acc76711fb7c129b5e11f9a7a212884817d55eee3"},"entries":[],"content":"# opencode-continual-learning\n\n### Overview\n\n- **Environment ID**: `opencode-continual-learning`\n- **Short description**: Continual learning environment that resumes OpenCode agent sessions from prior rollouts and evaluates whether the agent successfully completes the task.\n- **Tags**: continual-learning, opencode, agent, multi-turn\n\n### Task\n\n- **Type**: multi-turn, agent-based\n- **Output format expectations**: Plain text (agent completion log)\n- **Rubric overview**: Deterministic reward: 1.0 if the agent completes successfully; 0.0 otherwise.\n\n### Quickstart\n\nRun an evaluation with default settings:\n\n```bash\nprime eval run opencode-continual-learning\n```\n\nNotes:\n\n- Use `-a` / `--env-args` to pass environment-specific configuration as a JSON object.\n\n### Environment Arguments\n\n| Arg       | Type | Default                             | Description                                                      |\n| --------- | ---- | ----------------------------------- | ---------------------------------------------------------------- |\n| `dataset` | str  | `\"13point5/opencode-rollouts-test\"` | Hugging Face dataset repo ID containing rollouts (`train.jsonl`) |\n\n### Metrics\n\n| Metric            | Meaning                                                              |\n| ----------------- | -------------------------------------------------------------------- |\n| `reward`          | Deterministic reward: 1.0 if agent succeeded, else 0.0               |\n| `agent_succeeded` | 1.0 if agent completed (exit code 0, no timeout, no error), else 0.0 |\n\n### Dataset Format\n\nThe environment expects a Hugging Face dataset with `train.jsonl` containing rows with:\n\n```json\n{\n  \"session_id\": \"...\",\n  \"agent\": \"...\",\n  \"exported_at\": \"...\",\n  \"metadata\": { \"remote_url\": \"...\" },\n  \"session\": {\n    \"messages\": [\n      {\n        \"info\": { \"role\": \"user\" },\n        \"parts\": [{ \"type\": \"text\", \"text\": \"...\" }]\n      },\n      {\n        \"info\": { \"role\": \"assistant\" },\n        \"parts\": [\n          { \"type\": \"reasoning\" },\n          { \"type\": \"text\" },\n          { \"type\": \"tool\" }\n        ]\n      }\n    ]\n  }\n}\n```\n\nEach row is transformed to extract the prompt (up to and including the last user message) and session state for resumption.\n","encoding":"utf-8","truncated":false,"total_bytes":2237},"status":null}