{"data":{"kind":"file","path":"README.md","version_id":"e2oxl82svygfpylf0b1kyyay","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":3962,"modified_at":"2026-05-11T22:13:55.944000","content_hash":"3a7c69b243d12098734577ed45ce29853c53978e0c654bdc8010be9e7793a2a0"},"entries":[],"content":"# prior-art-search\n\n### Overview\n- **Environment ID**: `prior-art-search`\n- **Short description**: Tool-use environment for prior-art patent search over a local Chroma database.\n- **Tags**: patents, prior-art, search, tools, chroma\n\n### Datasets\n- **Primary dataset(s)**: HUPD patent records processed into a local Chroma collection.\n- **Source links**: `HUPD/hupd` on Hugging Face.\n- **Synthetic scenarios**: generated with `generate_synthetic_queries.py` and keyed by `publication_number`.\n- **Split sizes**: depends on generation limits.\n\nGenerate scenario rows with the Prime inference endpoint:\n\n```bash\nuv run python environments/prior_art_search/generate_synthetic_queries.py \\\n  --model qwen/qwen3.6-35b-a3b \\\n  --limit 20 \\\n  --concurrency 4\n```\n\nOutputs:\n\n```text\nenvironments/prior_art_search/data/synthetic_patent_queries.jsonl\n```\n\n### Dataset Caveat\nThe current synthetic dataset is a v1 retrieval task, not full real-world novelty\nassessment.\n\nFor each source patent, the generator creates a disguised invention disclosure\nand draft claim from that same patent. The gold answer is still the source\n`publication_number`. So the task is:\n\n```text\nGiven a paraphrased unpublished invention disclosure, retrieve the matching\npatent from the vector database.\n```\n\nThis is useful for training the first environment because it teaches tool-use\nretrieval: search, inspect candidates, and return the matching patent ID.\n\nReal novelty search is stricter. In production, a user usually has a new draft\ninvention and wants older prior-art patents that overlap with it. 
A stronger v2\ndataset should use gold answers such as cited patents, examiner prior-art\nreferences, or curated nearest older patents, rather than rewarding retrieval of\nthe same source patent used to create the scenario.\n\n### Task\n- **Type**: multi-turn tool use\n- **Tools**: `search_patents`, `lookup_patent`, `return_final_answer`\n- **Rubric overview**: binary correctness reward for returning the gold `publication_number` in `return_final_answer(patent_ids=...)`, plus a small speed bonus for correct answers (see Metrics).\n\n### Quickstart\nPrepare the Chroma database:\n\n```bash\ncd environments/prior_art_search\nuv run python prepare.py --limit 500\n```\n\nInstall the environment:\n\n```bash\nprime env install prior-art-search\n```\n\nRun a small evaluation:\n\n```bash\nprime eval run prior-art-search \\\n  -m qwen/qwen3.6-35b-a3b \\\n  -n 5 -r 1\n```\n\nConfigure model and sampling:\n\n```bash\nprime eval run prior-art-search \\\n  -m qwen/qwen3.6-35b-a3b \\\n  -n 20 -r 3\n```\n\nEvaluate only hard scenarios:\n\n```bash\nprime eval run prior-art-search \\\n  -m qwen/qwen3.6-35b-a3b \\\n  -n 20 \\\n  -r 4 \\\n  -a '{\"difficulty\": \"hard\"}'\n```\n\nNotes:\n- Use `-a` / `--env-args` to pass environment-specific configuration as a JSON object.\n\n### Environment Arguments\n| Arg | Type | Default | Description |\n| --- | ---- | ------- | ----------- |\n| `dataset_path` | str | `data/synthetic_patent_queries.jsonl` | Synthetic scenario JSONL keyed by `publication_number` |\n| `chroma_dir` | str | `.chroma_db` | Local Chroma persistence directory |\n| `collection_name` | str | `patent_collection` | Chroma collection to search |\n| `max_examples` | int | `-1` | Limit on dataset size (use -1 for all) |\n| `max_turns` | int | `6` | Maximum tool-use turns per rollout |\n| `difficulty` | str | `None` | Optional filter, e.g. 
`easy`, `medium`, or `hard` |\n\n### Metrics\n| Metric | Meaning |\n| ------ | ------- |\n| `correct_patent_returned` | `1.0` if final `patent_ids` include the gold publication number |\n| `speed_bonus` | Small bonus for correct answers that finish in fewer turns; wrong answers get `0.0` |\n| `returned_any_patent` | Diagnostic metric for whether the model used the final-answer tool with IDs |\n| `num_turns` | Number of model/tool turns |\n| `*_calls` | Tool call counts from Verifiers' tool monitor rubric |\n\nThe main reward is:\n\n```text\ncorrect_patent_returned + 0.1 * speed_bonus\n```\n\n`speed_bonus` is gated by correctness, so a fast wrong answer does not receive extra reward.\n","encoding":"utf-8","truncated":false,"total_bytes":3962},"status":null}