{"data":{"kind":"file","path":"README.md","version_id":"rnuybn1bpg4oz7wkvno2il3u","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":3846,"modified_at":"2026-02-26T23:21:31.194000","content_hash":"f446346207959fbf6981d1294a20ff82a4b5b7afbbe92ff92bffe9c685729d8b"},"entries":[],"content":"# calendar-scheduling\n\nProcedural `StatefulToolEnv` for meeting-time negotiation under realistic calendar constraints.\n\n## Overview\n\n- **Environment ID**: `calendar-scheduling`\n- **Type**: multi-turn tool use (`StatefulToolEnv`)\n- **Objective**: submit one meeting window that maximizes weighted attendee utility while satisfying hard constraints\n- **Reward**:\n  - `0.0` if final submission is invalid or missing\n  - otherwise weighted average attendee utility in `[0, 1]`\n\nEach task contains:\n\n- attendees with busy calendars,\n- required and optional participants,\n- timezone offsets,\n- hard local-time bounds (for a subset of attendees),\n- soft preferences (day, early/late, back-to-back),\n- room availability,\n- fixed meeting duration.\n\nAttendee importance weights are normalized to sum to `1.0` per task.\n\n## Tools\n\n- `check_attendee_calendar(attendee_id, day_index)`\n- `view_attendee_constraints(attendee_id)`\n- `check_proposal(day_index, start_time_utc, duration_minutes, room_id)`\n- `submit_window(day_index, start_time_utc, duration_minutes, room_id)`\n\nAll tool responses include `remaining_turns`. `check_proposal` is limited by a per-task score-check budget to discourage brute-force probing.\n\n## Deterministic Oracle\n\nGeneration uses deterministic rejection sampling plus exhaustive search over candidate windows to compute:\n\n- `optimal_score`\n- `best_proposals`\n- valid candidate count and search diagnostics\n\nThis metadata is stored with each example and used for metrics (for example submission-to-optimal ratio).\n\n## Quickstart\n\nInstall local environment:\n\n```bash\nprime env install calendar-scheduling\n```\n\nRun a small eval:\n\n```bash\nprime eval run calendar-scheduling -n 5 -r 1 -m qwen3-30b-i -e configs/endpoints.toml --skip-upload\n```\n\nRun with custom environment args:\n\n```bash\nprime eval run calendar-scheduling -n 8 -r 1 -m qwen3-30b-i -e configs/endpoints.toml --skip-upload -a '{\"difficulty\": \"hard\", \"max_turns\": 12, \"generator_overrides\": {\"score_check_budget\": 3}}'\n```\n\n## Standalone TUI\n\nRender a generated problem in terminal:\n\n```bash\nuv run --project environments/calendar_scheduling calendar-scheduling-tui --difficulty medium --seed 42 --show-oracle\n```\n\nThe TUI shows attendees, room lanes, busy blocks, and highlights oracle windows when `--show-oracle` is enabled.\n\n## Environment Arguments\n\n| Arg | Type | Default | Description |\n| --- | ---- | ------- | ----------- |\n| `difficulty` | `str` | `\"medium\"` | High-level task preset (`easy`, `medium`, `hard`) |\n| `num_train` | `int` | `512` | Number of generated training tasks |\n| `num_eval` | `int` | `128` | Number of generated evaluation tasks |\n| `num_examples` | `int \\| None` | `None` | Backwards-compatible alias: sets both train and eval sizes |\n| `seed` | `int` | `7` | Base deterministic seed |\n| `max_turns` | `int \\| None` | `None` | Turn cap per rollout (preset default when unset) |\n| `generator_overrides` | `dict \\| None` | `None` | Fine-grained generation overrides for `GenerationConfig` fields |\n\nCommon `generator_overrides` keys:\n\n- `attendee_count_range`\n- `window_days_range`\n- `meeting_duration_choices`\n- `score_check_budget`\n- `min_valid_candidates`\n- `max_valid_ratio`\n- `max_random_baseline_score`\n- `max_generation_attempts`\n\n## Metrics\n\n| Metric | Meaning |\n| ------ | ------- |\n| `reward` | Final reward (submitted valid score or `0.0`) |\n| `submission_made` | Whether agent called `submit_window` |\n| `submission_valid` | Whether submitted window passed hard constraints |\n| `oracle_optimal_score` | Exhaustive best possible score for the task |\n| `submitted_to_optimal_ratio` | `submitted_score / oracle_optimal_score` (clamped) |\n| `optimality_gap` | `oracle_optimal_score - submitted_score` |\n| `score_checks_used` | Number of `check_proposal` calls consumed |\n| `score_checks_remaining` | Remaining proposal-check budget |\n","encoding":"utf-8","truncated":false,"total_bytes":3846},"status":null}