{"data":{"kind":"file","path":"README.md","version_id":"h7ll4v6y8xyozq7var4qne02","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":3136,"modified_at":"2025-11-29T16:11:42.068000","content_hash":"024dc6926a53acd5c15a95da964e167cd6bfb6a990dc80b16d206ce47bf1c2a0"},"entries":[],"content":"# VCBench: Benchmarking LLMs in Venture Capital\n\nA synthetic benchmark for evaluating large language models (LLMs) on founder success prediction in venture capital.\n\n## Overview\n\nVCBench is inspired by the research paper [\"VCBench: Benchmarking LLMs in Venture Capital\"](https://arxiv.org/pdf/2509.14448). This environment provides:\n\n- **Synthetic Dataset Generation**: Creates realistic founder profiles with education, work experience, and success labels\n- **Standardized Metrics**: Precision, Recall, and F₀.₅ score (emphasizing precision)\n- **Binary Classification Task**: Predict whether a founder will be successful (company achieves exit or IPO valued at over $500M)\n\n## Installation\n\n```bash\nprime env install <your-username>/vcbench\n```\n\nOr install locally for development:\n\n```bash\ncd environments/vcbench\nuv pip install -e .\n```\n\n## Usage\n\n```python\nimport verifiers as vf\nfrom vcbench import load_environment\n\n# Load the environment\nenv = load_environment(\n    num_profiles=9000,\n    target_success_rate=0.09,\n    seed=42\n)\n\n# Or load from a pre-generated dataset\nenv = load_environment(dataset_path=\"vcbench_dataset.json\")\n\n# Evaluate a model\nresults = vf.evaluate(env, model=\"your-model\")\n```\n\n## Configuration\n\nThe `load_environment` function accepts the following parameters:\n\n- `num_profiles` (int, default=9000): Number of profiles to generate\n- `target_success_rate` (float, default=0.09): Target success rate (9% default)\n- `seed` (int, default=42): Random seed for reproducibility\n- `dataset_path` (str, optional): Path to pre-generated dataset JSON file\n\n## Evaluation Metrics\n\n- **Precision**: 
Percentage of founders predicted successful who are actually successful\n- **Recall**: Percentage of actually successful founders that the model predicted as successful\n- **F₀.₅ Score**: Weighted harmonic mean of precision and recall that favors precision (formula: `(1.25 * precision * recall) / (0.25 * precision + recall)`)\n- **Accuracy**: Overall classification accuracy\n\nThe F₀.₅ score is the primary ranking metric because it weights precision more heavily than recall, which suits VC applications where false positives are costly.\n\n## Dataset Format\n\nEach test case contains a founder profile with:\n- Industry\n- Education history (degrees, fields, institutions, years)\n- Work experience (titles, companies, industries, dates)\n- Success label (ground truth)\n\nThe model must respond with \"YES\" or \"NO\" to indicate whether the founder will be successful.\n\n## Citation\n\nIf you use VCBench in your research, please cite the original paper:\n\n```bibtex\n@article{chen2025vcbench,\n  title={VCBench: Benchmarking LLMs in Venture Capital},\n  author={Chen, Rick and Ternasky, Joseph and Kwesi, Afriyie Samuel and others},\n  journal={arXiv preprint arXiv:2509.14448},\n  year={2025}\n}\n```\n\n## Notes\n\n- This is a **synthetic** benchmark using generated data, not real founder profiles\n- The dataset generation uses probabilistic rules to create realistic-looking profiles\n- Success labels are determined based on features that correlate with founder success\n- For production use, consider using real anonymized data following the paper's methodology\n\n\n","encoding":"utf-8","truncated":false,"total_bytes":3136},"status":null}