{"data":{"kind":"file","path":"README.md","version_id":"mcu99ym5fsjbh93cf8sg0f64","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":12340,"modified_at":"2025-10-30T05:07:19.687000","content_hash":"13cecfa0f85bf84f2f9182e96e2f741f9d5e3143ddcd41e23663da19f57c953e"},"entries":[],"content":"# Security Configuration Verification (E2)\n\nA tool-using RL environment for training and evaluating models on infrastructure configuration auditing. Models analyze Kubernetes and Terraform configurations, detect security violations using real security tools, and generate patches to fix issues.\n\n## Overview\n\nThis environment implements end-to-end configuration security auditing with tool grounding, combining static analysis tools with intelligent patch generation and validation.\n\n- **Environment Type**: `ToolEnv` - Multi-turn environment with tool access\n- **Task**: Detect security violations and generate fixes for infrastructure configurations\n- **Tools**: OPA (Open Policy Agent), KubeLinter, Semgrep\n- **Reward Structure**: Severity-weighted detection accuracy + successful patch generation\n\n## Dataset Access\n\n**Public Metadata**: Browse sampling information, dataset composition, and tool versions at:\n\n- <https://huggingface.co/datasets/intertwine-ai/security-verifiers-e2-metadata>\n\n**Full Dataset**: Private to prevent training contamination. Request access via:\n\n- [GitHub Issues](https://github.com/intertwine/security-verifiers/issues) with title \"Dataset Access Request: E2\"\n- Include: name, affiliation, research purpose, HuggingFace username\n\nThe public metadata repo includes detailed model cards explaining the privacy rationale, tool versions (KubeLinter, Semgrep, OPA), and dataset composition. 
Multi-turn evaluation shows models achieve ~0.93 reward with tool calling vs ~0.62 without tools.\n\n### Dataset Loading Strategies\n\nThis environment supports **multi-tiered dataset loading** for flexibility across different deployment scenarios:\n\n1. **Local datasets** (built with `make data-e2-local`)\n2. **HuggingFace Hub** (with `HF_TOKEN` authentication)\n3. **Builtin fixtures** (for testing without data dependencies)\n\n#### Loading Modes\n\n```python\nimport verifiers as vf\n\n# Auto mode (default): Try local → hub → builtin\nenv = vf.load_environment(\"sv-env-config-verification\")\n\n# Local only: Require local dataset\nenv = vf.load_environment(\"sv-env-config-verification\", dataset_source=\"local\")\n\n# Hub only: Load from HuggingFace\nenv = vf.load_environment(\"sv-env-config-verification\", dataset_source=\"hub\")\n\n# Synthetic only: Use builtin fixtures (no data needed)\nenv = vf.load_environment(\"sv-env-config-verification\", dataset_source=\"synthetic\")\n\n# Select specific dataset\nenv = vf.load_environment(\n    \"sv-env-config-verification\",\n    dataset_name=\"k8s-labeled-v1.jsonl\",  # Kubernetes only\n    dataset_source=\"local\"\n)\n```\n\n#### Using Your Own HuggingFace Repository\n\nIf you've built and pushed datasets to your own HuggingFace repository:\n\n```python\nimport os\n\n# Configure custom repository\nos.environ[\"HF_TOKEN\"] = \"hf_your_token_here\"\nos.environ[\"E2_HF_REPO\"] = \"your-org/security-verifiers-e2-private\"\n\n# Load from your repository\nenv = vf.load_environment(\n    \"sv-env-config-verification\",\n    dataset_source=\"hub\",\n    max_examples=100\n)\n```\n\n**See [docs/user-dataset-guide.md](../../docs/user-dataset-guide.md) for instructions on building and pushing datasets to your own HuggingFace repository.**\n\n## Installation\n\nInstall the environment using the Prime CLI:\n\n```bash\nprime env install intertwine/sv-env-config-verification\n```\n\nOr using pip directly:\n\n```bash\npip install 
sv-env-config-verification\n```\n\n### Tool Dependencies\n\nThis environment requires security scanning tools. Install them based on your platform:\n\n**macOS:**\n\n```bash\n# Install kube-linter\nbrew install kube-linter\n\n# Install OPA\nbrew install opa\n\n# Install Semgrep\nbrew install semgrep\n```\n\n**Linux/Other:**\n\n```bash\n# Install kube-linter\nwget https://github.com/stackrox/kube-linter/releases/download/v0.6.8/kube-linter-linux.tar.gz\ntar xzf kube-linter-linux.tar.gz\nsudo mv kube-linter /usr/local/bin/\n\n# Install OPA\ncurl -L -o opa https://openpolicyagent.org/downloads/latest/opa_linux_amd64\nchmod 755 opa\nsudo mv opa /usr/local/bin/\n\n# Install Semgrep\npip install semgrep\n```\n\n## Setup\n\n### API Keys Configuration\n\nSet your API keys as environment variables:\n\n```bash\n# OpenAI API Key (required for OpenAI models)\nexport OPENAI_API_KEY=\"your-openai-api-key\"\n\n# For persistent configuration\necho 'export OPENAI_API_KEY=\"your-key\"' >> ~/.bashrc\nsource ~/.bashrc\n```\n\n## Usage\n\n### With Prime CLI (Recommended)\n\nThe easiest way to evaluate models on this environment is using the Prime CLI:\n\n```bash\n# Install the environment\nprime env install intertwine/sv-env-config-verification\n\n# Run evaluation with default dataset (combined K8s + Terraform)\nprime env eval sv-env-config-verification \\\n  -a '{\"dataset_name\":\"intertwine-ai/security-verifiers-e2\",\"max_examples\":50}'\n\n# Run with specific dataset (K8s only)\nprime env eval sv-env-config-verification \\\n  -a '{\"dataset_name\":\"k8s-labeled-v1.jsonl\",\"max_examples\":20}' \\\n  --num-examples 10\n\n# Use synthetic dataset for quick testing (no external dependencies)\nprime env eval sv-env-config-verification \\\n  -a '{\"dataset_source\":\"synthetic\",\"include_tools\":true}' \\\n  --num-examples 3\n```\n\n**Note**: By default, Prime uses meta-llama/llama-3.1-70b-instruct. 
Specify a different model with `--model`:\n\n```bash\nprime env eval sv-env-config-verification \\\n  -a '{\"dataset_name\":\"intertwine-ai/security-verifiers-e2\"}' \\\n  --model gpt-4o \\\n  --num-examples 10\n```\n\n### With Verifiers Library\n\n```python\nimport verifiers as vf\n\n# Load the environment with tools enabled\nenv = vf.load_environment(\"intertwine/sv-env-config-verification\", include_tools=True)\n\n# Evaluate a model\nresults = env.evaluate(\n    client=vf.OpenAIClient(),\n    model=\"gpt-5-mini\",\n    num_examples=10\n)\n\nprint(f\"Average reward: {results.stats['mean_reward']:.2%}\")\nprint(f\"Detection F1: {results.stats.get('detection_f1', 0):.2%}\")\n```\n\n### Quick Evaluation\n\nUse the verifiers CLI:\n\n```bash\n# Basic evaluation\nvf-eval intertwine/sv-env-config-verification \\\n  --model gpt-5-mini \\\n  --num-examples 10\n\n# Evaluation without tools (model must detect issues directly)\nvf-eval intertwine/sv-env-config-verification \\\n  --model gpt-5-mini \\\n  --num-examples 10 \\\n  --include-tools false\n```\n\n### Training with Prime RL\n\n```toml\n[environment]\nid = \"intertwine/sv-env-config-verification\"\nkwargs = {include_tools = true}\n```\n\n## Task Details\n\n### Input Format\n\nInfrastructure configuration files (Kubernetes YAML or Terraform HCL):\n\n```yaml\napiVersion: v1\nkind: Pod\nmetadata:\n  name: webapp\nspec:\n  containers:\n    - name: app\n      image: myapp:latest\n      securityContext:\n        runAsUser: 0 # Security violation: running as root\n```\n\n### Expected Output\n\nJSON object with detected violations and optional patches:\n\n```json\n{\n  \"violations\": [\n    {\n      \"type\": \"RunAsRoot\",\n      \"severity\": \"HIGH\",\n      \"location\": \"spec.containers[0].securityContext\",\n      \"description\": \"Container runs as root user\"\n    }\n  ],\n  \"patch\": \"spec:\\n  containers:\\n  - securityContext:\\n      runAsUser: 1000\",\n  \"confidence\": 0.95\n}\n```\n\n### 
Available Tools\n\nWhen `include_tools=True`, the model has access to:\n\n1. **run_kubelinter**: Analyze Kubernetes manifests for security issues\n2. **run_opa**: Policy-based security validation\n3. **run_semgrep**: Static analysis for configuration vulnerabilities\n\n### Scoring\n\nThe environment uses a multi-component rubric:\n\n- **Detection Accuracy**: Precision, recall, and F1 for finding violations\n- **Severity Weighting**: Higher rewards for catching critical issues\n- **Patch Success**: Bonus for generating fixes that resolve violations\n- **Format Compliance**: Valid JSON schema adherence\n\n## Weights & Biases Logging\n\nThis environment supports automatic Weave tracing:\n\n```python\nimport weave\nimport verifiers as vf\n\n# Initialize Weave\nweave.init(project=\"config-security\")\n\n# Load and evaluate\nenv = vf.load_environment(\"intertwine/sv-env-config-verification\", include_tools=True)\nresults = env.evaluate(\n    client=vf.OpenAIClient(),\n    model=\"gpt-5-mini\",\n    num_examples=50\n)\n\n# Results automatically traced to W&B\n```\n\nConfigure via environment variables:\n\n- `WEAVE_PROJECT`: Set project name\n- `WEAVE_DISABLED`: Set to 'true' to disable logging\n- `WANDB_API_KEY`: Your W&B API key\n\n## Evaluation Approach\n\n### Metrics Tracked\n\n- **Detection Precision**: Correct violation identification rate\n- **Detection Recall**: Coverage of actual violations\n- **Detection F1**: Harmonic mean of precision and recall\n- **Severity Accuracy**: Proper severity classification\n- **Patch Success Rate**: Percentage of successful fixes\n- **Tool Utilization**: Effective use of available security scanners\n\n### Example Evaluation Script\n\n```python\nimport verifiers as vf\nimport weave\n\nweave.init(project=\"config-audit-eval\")\n\nenv = vf.load_environment(\"intertwine/sv-env-config-verification\", include_tools=True)\n\n# Compare with and without tools\nfor use_tools in [True, False]:\n    results = env.evaluate(\n        
client=vf.OpenAIClient(),\n        model=\"gpt-5-mini\",\n        num_examples=100,\n        include_tools=use_tools,\n        seed=42\n    )\n\n    mode = \"with tools\" if use_tools else \"without tools\"\n    print(f\"\\nResults {mode}:\")\n    print(f\"  Mean Reward: {results.stats['mean_reward']:.2%}\")\n    print(f\"  Detection F1: {results.stats.get('detection_f1', 0):.2%}\")\n    print(f\"  Patch Success: {results.stats.get('patch_success', 0):.2%}\")\n```\n\n## Early Failure Detection\n\nAll E2 evaluation scripts support early stopping to prevent wasted API costs on misconfigured models or API issues:\n\n```bash\n# Multi-turn evaluation (default: stop after 3 consecutive errors)\npython scripts/eval_config_verification.py \\\n  --models \"gpt-5-mini\" \\\n  --num-examples 100 \\\n  --max-consecutive-errors 3\n\n# Disable early stopping (process all examples regardless of errors)\npython scripts/eval_config_verification.py \\\n  --models \"experimental-model\" \\\n  --num-examples 50 \\\n  --max-consecutive-errors 0\n\n# Single-turn evaluation with custom threshold\npython scripts/eval_config_verification_singleturn.py \\\n  --models \"gpt-5-mini\" \\\n  --num-examples 100 \\\n  --max-consecutive-errors 5\n```\n\n**Via Makefile:**\n\n```bash\n# Use default threshold (3 errors)\nmake eval-e2 MODELS=\"gpt-5-mini\" N=100\n\n# Custom threshold\nmake eval-e2 MODELS=\"gpt-5-mini\" N=100 MAX_CONSECUTIVE_ERRORS=5\n\n# Disable early stopping\nmake eval-e2 MODELS=\"test-model\" N=50 MAX_CONSECUTIVE_ERRORS=0\n```\n\nThe early stopping system tracks consecutive API/completion errors and halts evaluation when the threshold is reached, saving time and costs. 
Tool execution failures are not counted toward the error threshold - only API-level errors trigger early stopping.\n\n## Performance Benchmarks\n\n| Model       | Detection F1 | Patch Success | With Tools | Overall |\n| ----------- | ------------ | ------------- | ---------- | ------- |\n| GPT-4o-mini | 72%          | 45%           | Yes        | 68%     |\n| GPT-4o-mini | 51%          | 28%           | No         | 44%     |\n\n## Dataset\n\nThe environment includes:\n\n- **Kubernetes manifests**: Deployments, Services, ConfigMaps with security issues\n- **Terraform configurations**: AWS, GCP, Azure resources with misconfigurations\n- **Oracle labels**: Ground-truth violations from tool outputs for validation\n\n## Future Improvements\n\n- **Expanded Tool Suite**: Add Checkov, Terrascan, and Trivy\n- **Custom Policies**: Support for organization-specific security rules\n- **Multi-file Analysis**: Cross-file dependency and security analysis\n- **Incremental Patching**: Iterative refinement of fixes based on re-scanning\n- **Compliance Frameworks**: Map violations to CIS, NIST, PCI-DSS standards\n- **Explanation Generation**: Require detailed rationale for each violation\n\n## Requirements\n\n- Python 3.12+\n- `verifiers>=0.1.4`\n- Security scanning tools (kube-linter, opa, semgrep)\n- API key for model inference\n\n## About\n\nThis environment is part of the Open Security Verifiers suite - a collection of security and alignment RL environments using Prime Intellect's Verifiers framework. Each environment provides executable, programmatic rewards for training robust security-aware AI systems.\n\n## Support\n\nFor issues or questions:\n\n- Report issues on the [Prime Intellect Environments Hub](https://app.primeintellect.ai/dashboard/environments)\n- Check the [Security Verifiers GitHub repository](https://github.com/intertwine/security-verifiers)\n- Contact the Intertwine team\n","encoding":"utf-8","truncated":false,"total_bytes":12340},"status":null}