{"data":{"kind":"file","path":"README.md","version_id":"fw1k1198sm2llnnfh34wqrqj","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":3231,"modified_at":"2026-05-30T04:44:25.878000","content_hash":"5e71033aca5ebf3ec5c5e4f30612826aaf96b075514cf9be5ce4231095545d27"},"entries":[],"content":"# toolathlon-harbor-khalid\n\nKhalid-owned fork of the Toolathlon Harbor environment.\n\nCurrent local repair:\n- `0.1.12` packages the refreshed Khalid task set and keeps debug artifact capture out of the agent prompt path.\n- The package name remains `toolathlon-harbor-khalid`, so pushing this path updates the separate fork, not the shared `toolathlon-harbor` environment.\n\n## Capturing Agent-Created Files\n\nHosted evaluations normally show the transcript and reward, but they do not directly download files created inside `/app` in the sandbox. This fork accepts an optional `debug_artifacts` environment argument and attempts to append selected `/app` files to the captured OpenCode log after the agent exits.\n\nDo not add artifact-emission instructions to task prompts. That approach was removed in `0.1.10` because it could make OpenCode crash before the first LLM call.\n\nThis is intended for debugging future runs. 
It will not recover files from runs that already completed before this feature was enabled.\n\n### Enable Artifact Capture\n\nTo capture every common artifact file in `/app` (`.xlsx`, `.csv`, `.docx`, `.pptx`, `.pdf`, `.json`, `.txt`, `.md`), pass `debug_artifacts:true` in `-a`:\n\n```bash\nprime --plain eval run khalid-altalib/toolathlon-harbor-khalid \\\n  --hosted \\\n  --follow \\\n  --allow-tunnel-access \\\n  --allow-sandbox-access \\\n  -m poolside/laguna-xs.2 \\\n  -n 1 \\\n  -r 1 \\\n  -t 4096 \\\n  -a '{\"tasks\":[\"yf-stock-overview-scripted\"],\"docker_image\":\"cmppzhull000z12k0rxk6680f/t2h-base:0.1\",\"debug_artifacts\":true}'\n```\n\nTo capture only specific files, pass a list:\n\n```bash\nprime --plain eval run khalid-altalib/toolathlon-harbor-khalid \\\n  --hosted \\\n  --follow \\\n  --allow-tunnel-access \\\n  --allow-sandbox-access \\\n  -m poolside/laguna-xs.2 \\\n  -n 1 \\\n  -r 1 \\\n  -t 4096 \\\n  -a '{\"tasks\":[\"yf-stock-overview-scripted\"],\"docker_image\":\"cmppzhull000z12k0rxk6680f/t2h-base:0.1\",\"debug_artifacts\":[\"YF_Stock_Overview.xlsx\"]}'\n```\n\nThe logs will contain markers like:\n\n```text\nAGENT_ARTIFACT_BEGIN YF_Stock_Overview.xlsx\n...base64 payload...\nAGENT_ARTIFACT_END YF_Stock_Overview.xlsx\n```\n\n### Recover a File From Logs\n\nAfter the run finishes, copy the Evaluation ID from the final `View:` URL, then run:\n\n```bash\nEVAL_ID=\"<evaluation_id>\"\nFILE_NAME=\"YF_Stock_Overview.xlsx\"\n\nprime --plain eval logs \"$EVAL_ID\" --tail 50000 > artifact_debug_logs.txt\n\n# Pass FILE_NAME into the Python process; the quoted heredoc is not expanded by the shell.\nFILE_NAME=\"$FILE_NAME\" python3 - <<'PY'\nimport base64\nimport os\nimport re\nfrom pathlib import Path\n\n# Artifact name comes from the FILE_NAME shell variable set above.\nname = os.environ[\"FILE_NAME\"]\ntext = Path(\"artifact_debug_logs.txt\").read_text(errors=\"ignore\")\nmatch = re.search(\n    rf\"AGENT_ARTIFACT_BEGIN {re.escape(name)}\\s*(.*?)\\s*AGENT_ARTIFACT_END {re.escape(name)}\",\n    text,\n    re.S,\n)\nif not match:\n    raise SystemExit(f\"No artifact block found for {name}\")\n\n# Drop whitespace and any other non-base64 characters before decoding.\npayload = re.sub(r\"[^A-Za-z0-9+/=]\", \"\", 
match.group(1))\nPath(f\"agent_{name}\").write_bytes(base64.b64decode(payload))\nprint(f\"wrote agent_{name}\")\nPY\n```\n\nIf no block is found, check the `AGENT_ARTIFACT_MANIFEST_BEGIN` section in the logs. It records whether each requested file existed, whether it was emitted, and its size.\n\nRollback:\n- Use the published previous version on Prime if needed.\n- Do not push `pi-envs/toolathlon_harbor/` unless intentionally updating the shared environment.\n","encoding":"utf-8","truncated":false,"total_bytes":3231},"status":null}