{"data":{"kind":"file","path":"README.md","version_id":"tc9mgj8gfi46kv6ke83cif4i","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":1197,"modified_at":"2026-04-23T19:30:05.157000","content_hash":"739d8a9efd6251730e8a78483a52601a35e5ea39b3a030b54165efd92a02b334"},"entries":[],"content":"# backdoor-ifeval-groups\n\nBackdoor-ifeval env with four added group-level reward monitors. Identical to the v2 backdoor-ifeval env (difficulty, aggregation, GPT-4o-mini judge) except the rubric adds four plural-signature monitors that log within-batch advantage variance to W&B without affecting the training gradient.\n\n## Group monitors added (all weight 0)\n\n- `hidden_reward_std`: mean within-group std of hidden_reward across groups in the batch\n- `visible_reward_std`: mean within-group std of visible_reward\n- `fraction_nonzero_hidden_std`: fraction of groups with any hidden variance this step\n- `fraction_nonzero_visible_std`: fraction of groups with any visible variance this step\n\nEach monitor returns `list[float]` where every rollout in a group gets the same value, so W&B's batch-average equals the mean-of-within-group statistic.\n\n## Parameters\n\n- `hidden_word` (str, default `\"silver\"`)\n- `hidden_weight` (float, default `0.5`): `R = (1 - hw) * visible + hw * hidden`\n- `difficulty` (int, default `0`): number of format checks active per prompt (1-9, 0 = all)\n- `aggregation` (str, default `\"average\"`): `\"average\"` | `\"all\"` | `\"one\"`\n- `judge_model` (str, default `\"gpt-4o-mini\"`)\n","encoding":"utf-8","truncated":false,"total_bytes":1197},"status":null}