{"data":{"kind":"file","path":"README.md","version_id":"z19lazkakawfvkkdkwlhtc73","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":2279,"modified_at":"2026-05-21T09:33:22.630000","content_hash":"b6ea28d0705c517463bec277266c595ba4c2358fd116c2119d101898446cd3a9"},"entries":[],"content":"# arcee-drug-tool-rl\n\nDrug-discovery tool-use RL environment for Arcee biomedical agent post-training.\n\nThe environment loads tool-calling SFT traces from Hugging Face, extracts the initial user prompt, and turns the reference assistant tool calls into deterministic reward targets.\n\n## Harness\n\n- `vf.StatefulToolEnv`\n- Stateless retrieval tools enabled by default:\n  - `search_pubmed`\n  - `search_geo_datasets`\n  - `search_kegg_by_keyword`\n  - `search_kegg_by_id`\n  - `search_uniprot_by_name`\n  - `search_uniprot_by_id`\n  - `search_string_interactions`\n- Optional NVIDIA-backed tools with `include_nvidia_tools=true`:\n  - `fold_protein`\n  - `dock_ligand`\n  - `molmim_generate`\n\nNVIDIA tools use rollout-local artifacts. `fold_protein` returns an `artifact_id`; `dock_ligand` can consume either raw PDB text or that `artifact_id`. The artifact store lives only inside the rollout state and is not persisted to external storage.\n\n## Rewards\n\n- `tool_selection_reward` (`0.45`): reference tool multiset and order similarity.\n- `tool_argument_reward` (`0.25`): partial argument key/value similarity.\n- `final_answer_reward` (`0.15`): weak reference-answer token overlap.\n- `efficiency_reward` (`0.15`): avoids extra tool calls beyond the reference trace.\n\n## Defaults\n\n- Dataset: `lokahq/drug-tool-sft`\n- Dataset config: `openrouter_trinity_v1`\n- Split: `train`\n- NVIDIA tools: disabled by default.\n- Unsupported traces: dropped if they require tools outside the enabled tool set.\n- No API keys required unless `include_nvidia_tools=true`.\n\n## Required Secrets\n\nWhen NVIDIA tools are enabled, set one of:\n\n- `NVIDIA_API_KEY`\n- `NVCF_RUN_KEY`\n\nOptional endpoint overrides:\n\n- `ESMFOLD_API_URL`\n- `DIFFDOCK_API_URL`\n- `MOLMIM_API_URL`\n\nFor Hosted Training, pass secrets through the training config:\n\n```toml\nenv_file = [\"secrets.env\"]\n\n[[env]]\nid = \"lokahq/arcee-drug-tool-rl\"\nargs = { dataset_name = \"lokahq/drug-tool-sft\", dataset_config = \"openrouter_trinity_v1\", include_nvidia_tools = true }\n```\n\n`secrets.env`:\n\n```bash\nNVIDIA_API_KEY=...\n```\n\n## Local Install\n\n```bash\nprime env install arcee-drug-tool-rl\n```\n\nPass environment args with Prime when needed:\n\n```bash\nprime eval run arcee-drug-tool-rl -a '{\"num_examples\": 20, \"max_turns\": 8, \"include_nvidia_tools\": true}'\n```\n","encoding":"utf-8","truncated":false,"total_bytes":2279},"status":null}