{"data":{"kind":"file","path":"README.md","version_id":"jg7r0oo4vmb6ye2buzi8bzem","entry":{"name":"README.md","path":"README.md","is_directory":false,"size":2467,"modified_at":"2025-12-18T01:24:33.543000","content_hash":"2f0333f6d117ed70e057fe84c086183065e38bde03520520ac46648ef2af7029"},"entries":[],"content":"# starvector\n\nThis benchmark is based on [StarVector](https://starvector.github.io/starvector/) and the follow-up work [Rendering-Aware Reinforcement Learning for Vector Graphics Generation](https://starvector.github.io/rlrf/).\n\n`libcairo2` and `libcairo2-dev` are required to run this environment.\n\n### Overview\n- **Environment ID**: `starvector`\n- **Short description**: Image-to-SVG generation benchmark\n- **Tags**: image-to-svg, svg-generation, multimodal, eval\n\n### Datasets\n- **Primary dataset(s)**: [svg-stack](https://huggingface.co/datasets/starvector/svg-stack)\n- **Split sizes**: 5.7k test split\n\n### Task\n- **Type**: `vf.SingleTurnEnv`\n- **Parser**: `vf.Parser` with a custom SVG-extraction function.\n- **Rubric overview**: `vf.Rubric` with a set of metrics measuring pixel-level reconstruction fidelity (MSE, SSIM), high-level visual fidelity (DINO, LPIPS), and vector code efficiency.\n\n### Quickstart\nRun an evaluation with default settings:\n\n```bash\nuv run vf-eval starvector\n```\n\nConfigure model and sampling:\n\n```bash\nuv run vf-eval starvector   -m gpt-4.1-mini   -n 20 -r 3 -t 1024 -T 0.7   -a '{\"key\": \"value\"}'  # env-specific args as JSON\n```\n\nNotes:\n- Use `-a` / `--env-args` to pass environment-specific configuration as a JSON object.\n\n### Environment Arguments\n\n| Arg | Type | Default | Description |\n| --- | ---- | ------- | ----------- |\n| `dataset_name` | str | `\"starvector/svg-stack\"` | Dataset id from Hugging Face |\n| `split` | str | `\"test\"` | Dataset split |\n| `svg_key` | str | `\"Svg\"` | Key for SVG data in dataset |\n| `num_examples` | int | `-1` | Number of examples to use (`-1` for all). A smaller value avoids pre-processing and rasterizing the whole dataset for a quick eval. |\n| `dino_model` | str | `\"facebook/dinov2-base\"` | DINO model name |\n| `seed` | Optional[int] | `None` | Random seed for shuffling the dataset. If `None`, the dataset is not shuffled. |\n\n### Metrics\n\n| Metric | Meaning |\n| ------ | ------- |\n| `reward` | Main scalar reward (same as `dino_metric`) |\n| `dino_metric` | Semantic alignment score based on DINOv2 image features. |\n| `lpips_metric` | Perceptual distance score measuring visual distortion (lower is better). |\n| `mse_metric` | Mean Squared Error; measures exact pixel-level reconstruction accuracy. |\n| `ssim_metric` | Structural Similarity Index; measures preservation of texture, contrast, and structure. |\n| `code_efficiency` | Measure of SVG compactness (token length) relative to the ground truth. |\n","encoding":"utf-8","truncated":false,"total_bytes":2467},"status":null}