{"data":{"kind":"file","path":"README.md","version_id":"qgq93b7ddvaejwcib8v74nqg","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":4637,"modified_at":"2025-08-25T03:25:01.596000","content_hash":"821ce188935b164b8ac527860981b98c47a5e29a84f0a368df98413e81243e0b"},"entries":[],"content":"# Q Programming Language Environment\n\nA verifiers environment for evaluating and training models on Q programming language problems with real code execution and test case validation.\n\n### Overview\n- **Environment ID**: `q-programming-language`\n- **Short description**: Evaluate and train models on Q programming language problems with real code execution and test case validation\n- **Tags**: programming, q-language, code-generation, test-execution, finance\n\n### About Q Programming Language\n\nQ is a vector programming language developed by Kx Systems, designed for high-performance data analysis and financial modeling. It features:\n\n- **Concise, functional syntax** for rapid development\n- **Built-in vector operations** for efficient data processing\n- **Fast execution** on large datasets\n- **Wide adoption** in quantitative finance and time-series analysis\n\nQ is particularly valuable for financial modeling, real-time analytics, and time-series processing, but is underrepresented in general-purpose AI training data compared to mainstream languages like Python, Java, and C++.\n\nMore details can be found in the full technical report here: [Technical Report: Full-Stack Fine-Tuning for the Q Programming Language](https://arxiv.org/abs/2508.06813)\n\n### Datasets\n- **Primary dataset**: [SFT Python-Q Programming Problems](https://huggingface.co/datasets/morganstanley/sft-python-q-problems)\n- **Source**: LeetCode-style algorithmic problems with solutions in both Python and Q\n- **Split sizes**: 542 train / 136 test problems\n- **Features**: 678 unique programming problems with comprehensive test cases\n\n### Task\n- **Type**: Single-turn code generation\n- **Parser**: ThinkParser with custom Q code extraction (handles `<answer>` tags and ````q` code blocks)\n- **Rubric overview**: Test case execution reward (percentage passed + perfect bonus)\n\n### Quickstart\n\n**Prerequisites:**\n1. Install Q language interpreter from [code.kx.com/q/](https://code.kx.com/q/) (community license available)\n2. Set environment variable: `export Q_EXECUTABLE_PATH=\"/path/to/your/q/executable\"`\n\nRun an evaluation with default settings:\n\n```bash\nuv run vf-eval q-programming-language\n```\n\nConfigure model and sampling:\n\n```bash\nuv run vf-eval q-programming-language \\\n  -m gpt-4.1-mini \\\n  -n 20 -r 3 -t 1024 -T 0.7 \\\n  -a '{\"num_train_examples\": 50, \"perfect_bonus\": 1.0}'\n```\n\n### Environment Arguments\n\n| Arg | Type | Default | Description |\n| --- | ---- | ------- | ----------- |\n| `use_think` | bool | `true` | Enable reasoning format with `<answer>` tags |\n| `num_train_examples` | int | `-1` | Limit training examples (use -1 for all) |\n| `num_eval_examples` | int | `-1` | Limit evaluation examples (use -1 for all) |\n| `perfect_bonus` | float | `1.0` | Bonus reward for passing all test cases |\n\n### Metrics\n\n| Metric | Meaning |\n| ------ | ------- |\n| `reward` | Main scalar reward (test case pass rate + perfect bonus) |\n| `test_case_pass_rate` | Percentage of test cases passed |\n| `perfect_solutions` | Number of solutions passing all test cases |\n\n### How It Works\n\n1. **Problem Loading**: Loads Q programming problems from the Morgan Stanley dataset\n2. **Code Generation**: Models generate Q code solutions (with optional reasoning)\n3. **Code Extraction**: Parser extracts Q code from completions (handles reasoning format and code blocks)\n4. **Test Execution**: Generated code is executed against test cases using the Q interpreter\n5. **Reward Calculation**: Reward based on percentage of test cases passed + bonus for perfect solutions\n\n### Research Context\n\nThis environment is based on the comprehensive Q language training work described in:\n- **Paper**: [Technical Report: Full-Stack Fine-Tuning for the Q Programming Language](https://arxiv.org/abs/2508.06813)\n- **Dataset**: [SFT Python-Q Programming Problems](https://huggingface.co/datasets/morganstanley/sft-python-q-problems)\n\nThe research demonstrates that specialized fine-tuning can significantly improve model performance on niche programming languages, with the best model achieving 59% pass@1 accuracy on Q problems, surpassing Claude Opus-4 by 29.5%.\n\n### Setup Requirements\n\n1. **Q Language Interpreter**: Download from [code.kx.com/q/](https://code.kx.com/q/) (free community license available)\n2. **Environment Variable**: Set `Q_EXECUTABLE_PATH` to point to your Q executable\n3. **Dataset Access**: The environment automatically downloads the dataset from Hugging Face\n\nExample setup:\n```bash\n# Download and install Q from code.kx.com/q/\nexport Q_EXECUTABLE_PATH=\"/path/to/your/q/executable\"\n\n# Test the environment\nuv run vf-eval q-programming-language -n 5\n```\n\n","encoding":"utf-8","truncated":false,"total_bytes":4637},"status":null}