{"data":{"kind":"file","path":"README.md","version_id":"fw6be1qac6feum2igxynd43z","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":1445,"modified_at":"2026-02-18T00:44:14.907000","content_hash":"51967419ba11d51824c4ea67d30d70dd612febd04772e4757780daf0fe80bf36"},"entries":[],"content":"# Blind Navigator\n\nMiniGrid navigation with memory-wipe constraint for active context management.\n\n## Concept\n\nCombines two techniques:\n1. **Memory Wipe**: context wiped each turn, agent only sees its previous output + current observation\n2. **MiniGrid Navigation**: coordinate observations, multi-action generation, milestone shaping\n\nThe agent maintains a STATE block as its recurrent hidden state (map, inventory, goal, plan).\n\n## Physical AI Relevance\n\n- Object Permanence: remember objects no longer visible\n- Temporal Compositionality: find key, remember, navigate to door, use key\n- Mental Mapping: build spatial map from partial observations\n- Planning Under Uncertainty: route to previously-discovered locations\n\n## Quickstart\n\n```bash\npip install -e environments/blind_navigator\nprime env eval blind-navigator -m gpt-4o -n 5 -r 3\n```\n\n## Arguments\n\n| Argument | Default | Description |\n|---|---|---|\n| env_ids | LockedRoom | MiniGrid environment IDs |\n| max_turns | 30 | Max LLM generations per episode |\n| agent_view_size | 7 | Partial observation window (smaller = harder) |\n| use_milestone_rewards | True | Game-specific shaping rewards |\n| state_reward_weight | 0.3 | Weight for STATE block maintenance |\n\n## Reward Components\n\n- **Native** (7.5): MiniGrid completion reward\n- **Format** (0.2): Correct XML parsing\n- **State Maintenance** (0.3): Non-trivial STATE block present\n- **Milestone** (0.5): Key pickup, door unlock, etc.\n","encoding":"utf-8","truncated":false,"total_bytes":1445},"status":null}