vectrify

While LLMs are powerful, they still struggle to generate perfect vector images from reference raster images in one shot. That is where vectrify helps: it turns raster images into editable vector code by treating vectorization as a search problem. An LLM proposes candidate SVG/Graphviz/Typst code, a vision scorer ranks how close each candidate looks to the source, and an optimization algorithm iteratively refines the best candidates.

The results are quite good, and the output is human-readable code.

Features

Three output formats are supported out of the box: SVG (default), Graphviz DOT, and Typst (HTML and TikZ are planned). API keys for OpenAI, Anthropic, and Google Gemini are auto-detected from environment variables. Two search strategies are available: NSGA-II for diversity-preserving multi-objective optimization that also factors in code complexity, and beam search for a budget-friendly single solution. Perceptual scoring uses a local vision model with embeddings, with a pixel-level fallback or LLM-as-judge as alternatives. Runs are resumable, so you can pick up where you left off or fork from the top-N nodes of a previous run. A live dashboard shows pool stats, scoring, and convergence criteria.

Install

The recommended way to install a CLI tool is with pipx or uv tool, both of which put vectrify in its own isolated environment and on your PATH:

pipx install vectrify           # or: uv tool install vectrify

Plain pip works too, but it installs into whatever Python environment is active. With pip install --user, make sure ~/.local/bin is on your PATH.

The base install includes SVG output and the simple pixel-difference scorer. For everything else, pick the extras you need:

| Extra      | What it adds                                                  |
|------------|---------------------------------------------------------------|
| `vision`   | torch + transformers for the perceptual (CLIP/SigLIP) scorer  |
| `graphviz` | the graphviz Python bindings (system Graphviz still required) |
| `typst`    | the typst Python compiler                                     |
| `all`      | vision + graphviz + typst                                     |

pipx install "vectrify[vision]"          # recommended for best quality
pipx install "vectrify[all]"             # everything

System dependencies: SVG output needs Cairo (apt install libcairo2 or brew install cairo), and --format graphviz additionally needs the Graphviz binaries (apt install graphviz or brew install graphviz). A CUDA-capable GPU is optional; the vision scorer falls back to CPU/MPS.

Provider setup

Set one of the following environment variables:

export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...
export GEMINI_API_KEY=...

Override with --provider {openai,anthropic,gemini} if you have multiple keys set.
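Auto-detection presumably amounts to checking these variables in a fixed precedence order. A minimal sketch of that idea (the ordering and function name here are assumptions, not vectrify's documented internals):

```python
import os

def detect_provider(env=os.environ):
    """Return the first provider whose API key is set (hypothetical order)."""
    for provider, var in [("openai", "OPENAI_API_KEY"),
                          ("anthropic", "ANTHROPIC_API_KEY"),
                          ("gemini", "GEMINI_API_KEY")]:
        if env.get(var):
            return provider
    raise RuntimeError("No API key found; set one of the variables above.")

# Example: only an Anthropic key is present
print(detect_provider({"ANTHROPIC_API_KEY": "sk-..."}))  # anthropic
```

This is why `--provider` matters when several keys are set: whichever the tool checks first would otherwise win silently.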

Quickstart

vectrify input.png -o output.svg

That's it. The defaults run up to 4 NSGA-II epochs and stop early once the search stops finding new improvements (see Convergence below). Worst case, it runs for an hour and gives up.

A few useful variations:

# Bigger budget, longer runs
vectrify photo.jpg -o sketch.svg --epoch-patience 60 --max-wall-seconds 1800

# Steer the search with a goal
vectrify logo.png --goal "Use thick strokes only and avoid gradients"

# Output Graphviz DOT instead of SVG
vectrify diagram.png -o out.dot --format graphviz

# Resume from a previous run, keeping only the 20 best nodes
vectrify input.png --resume --resume-top 20

Run vectrify --help for the full flag reference, organized into LLM provider, scoring, search strategy, epoch control, resume, output artifacts, and runtime sections.

How it works

vectrify runs an evolutionary loop over a pool of candidate vector representations. The pool is seeded with a few LLM-generated candidates, then on each iteration a parent is sampled from the pool. With probability 1 − --llm-rate the parent is mutated locally (color tweaks, path nudges, crossover); otherwise the LLM is called to produce a refined edit. The new candidate is scored against the source image (perceptual via vision transformer embeddings, pixel-space, or LLM-as-judge) and either replaces a worse pool member or is dropped.
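In pseudocode terms, one iteration of that loop looks roughly like the sketch below (the function names and pool representation are illustrative, not vectrify's internals):

```python
import random

def evolve_step(pool, llm_rate, mutate, llm_edit, score):
    """One iteration of the evolutionary loop (simplified sketch).

    pool: list of (candidate, score) pairs; lower score = closer to source.
    """
    parent, _ = random.choice(pool)        # sample a parent from the pool
    if random.random() < llm_rate:
        child = llm_edit(parent)           # LLM-proposed refinement
    else:
        child = mutate(parent)             # cheap local mutation
    child_score = score(child)
    worst = max(range(len(pool)), key=lambda i: pool[i][1])
    if child_score < pool[worst][1]:       # replace a worse pool member,
        pool[worst] = (child, child_score)
    return pool                            # otherwise the child is dropped
```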

Two search strategies decide how the pool is managed and how parents are picked. The default NSGA-II strategy uses non-dominated sorting and crowding distance, which keeps diverse Pareto-optimal candidates around and shines when you have time for multiple epochs. Beam search instead runs --beams parallel hill-climbers, with --cull-keep controlling how aggressively low-ranked beams are pruned, and converges faster on a single good answer. NSGA-only flags are --epoch-diversity, --epoch-variance, and --epoch-seeds; beam-only flags are --beams and --cull-keep. The CLI rejects mixed usage.
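The culling step of beam search can be pictured as follows; this sketch assumes `--cull-keep` means "number of top-ranked beams retained", which is an interpretation, not a documented contract:

```python
def cull_beams(beams, cull_keep):
    """Keep the best `cull_keep` beams by score (lower is better); sketch only."""
    ranked = sorted(beams, key=lambda b: b["score"])
    return ranked[:cull_keep]

beams = [{"id": i, "score": s} for i, s in enumerate([0.4, 0.1, 0.9, 0.3])]
print([b["id"] for b in cull_beams(beams, 2)])  # [1, 3]
```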

NSGA-II minimizes two normalized objectives in parallel: visual error (scorer distance to the source) and content complexity (code size / token cost). The variant used here is constraint-first (Deb 2000): only candidates whose visual error is in the top 25% of the active pool are considered feasible and compete on the Pareto frontier of (error, complexity); everything else is automatically dominated. In practice that means visual quality is the primary objective; complexity acts as a tiebreaker among the quality-leaders, biasing the search toward small, clean renderings instead of accreting detail forever once the image is already close.
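The constraint-first filter and Pareto dominance described above can be sketched in a few lines (the dict-based candidate representation is illustrative):

```python
def feasible_set(pool):
    """Constraint-first filter: only the best 25% of the pool by visual
    error compete on the (error, complexity) Pareto front."""
    ranked = sorted(pool, key=lambda c: c["error"])
    cutoff = max(1, len(ranked) // 4)
    return ranked[:cutoff]

def dominates(a, b):
    """a Pareto-dominates b: no worse on both objectives, better on one."""
    return (a["error"] <= b["error"] and a["complexity"] <= b["complexity"]
            and (a["error"] < b["error"] or a["complexity"] < b["complexity"]))

pool = [{"error": e, "complexity": c}
        for e, c in [(0.1, 5), (0.2, 3), (0.5, 1), (0.9, 2)]]
print([c["error"] for c in feasible_set(pool)])  # [0.1]
```

Among the feasible quarter, a small candidate beats a large one at equal error, which is where the bias toward clean, compact output comes from.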

Convergence

Each epoch ends as soon as one of these triggers fires; the next epoch re-seeds from the current Pareto front. The search stops once --max-epochs is reached, --max-wall-seconds runs out, or the global --max-llm-calls cap (if set) is hit.

| Flag | Default | Triggers when… |
|------|---------|----------------|
| `--max-epochs` | 4 | hard cap on epoch count |
| `--epoch-patience` | 20 | this many LLM calls in a row produce no improvement |
| `--epoch-steps` | 50 | this many LLM calls have run in the current epoch |
| `--epoch-variance` | 0 | (NSGA-only) score std-dev in the active pool drops below this value |
| `--epoch-diversity` | 0 | (NSGA-only) mean pairwise genome diversity drops below this value |
| `--max-wall-seconds` | 3600 | global wall-clock budget; ends the run, not just the epoch |
| `--max-llm-calls` | 0 | global hard cap on total LLM calls; 0 disables |

Most tasks are cheap local mutations (controlled by --llm-rate, default 10% LLM). They run constantly and only rarely produce a new best score, so counting every task toward patience would burn through it in seconds. Patience and step counters therefore tick only on LLM-driven exploration tasks, which are what you actually pay for and what drives meaningful progress. A new best from any source, LLM or local, still resets the patience counter. Set --epoch-variance and --epoch-diversity to non-zero values to add NSGA-specific stop criteria; the right thresholds depend on your scorer and image, so they're off by default.
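The patience rule above is easy to state in code; this sketch is a distillation of the described behavior, not vectrify's actual implementation:

```python
class Patience:
    """Patience ticks only on LLM calls; any new best resets it (sketch)."""

    def __init__(self, limit):
        self.limit, self.count, self.best = limit, 0, float("inf")

    def update(self, score, is_llm_call):
        if score < self.best:          # new best from any source resets
            self.best, self.count = score, 0
        elif is_llm_call:              # only LLM tasks consume patience
            self.count += 1
        return self.count >= self.limit  # True => end the epoch

p = Patience(limit=2)
p.update(0.5, is_llm_call=False)         # new best: resets
p.update(0.6, is_llm_call=False)         # local miss: does not tick
p.update(0.6, is_llm_call=True)          # LLM miss: ticks to 1
print(p.update(0.6, is_llm_call=True))   # True (patience exhausted)
```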

Bounding the API bill

The defaults give an upper bound on LLM calls per run, computed as:

max LLM calls ≈ max_epochs × epoch_steps + epoch-0 seeds + drain overhead
              = 4 × 50 + ~10 + a few ≈ 220
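The same arithmetic as executable form (the seed and drain counts are the approximate values from the formula above, not exact constants):

```python
max_epochs, epoch_steps = 4, 50       # defaults
seed_calls, drain_overhead = 10, 10   # approximate; "~10 + a few"
upper_bound = max_epochs * epoch_steps + seed_calls + drain_overhead
print(upper_bound)  # 220
```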

That's the worst case; typical runs end earlier on --epoch-patience. If you need a strict ceiling, e.g. for cost-sensitive automation, set --max-llm-calls 200 and the engine will halt the run as soon as the counter hits that value, regardless of which epoch it's in.

Each edit call sends three images (target, current render, diff heatmap) plus the current code as input (typically a few thousand tokens), and returns small search/replace diff blocks rather than rewriting the whole file, so output is usually only a few hundred tokens. A full default run is on the order of a US dollar on flagship models. Verify against the OpenAI, Anthropic, or Google AI pricing pages.
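The exact diff-block format isn't specified here, but applying a search/replace edit conceptually reduces to something like this sketch (the strict not-found check is an assumption about how a rejected edit would be handled):

```python
def apply_edit(code, search, replace):
    """Apply one search/replace block; fail loudly if the anchor is missing."""
    if search not in code:
        raise ValueError("search block not found; edit rejected")
    return code.replace(search, replace, 1)  # first occurrence only

svg = '<rect fill="#ff0000" width="10"/>'
print(apply_edit(svg, 'fill="#ff0000"', 'fill="#cc0000"'))
```

Returning small diffs rather than a full rewrite is what keeps output token counts in the hundreds even for large documents.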

Output layout

Given --output sketch.svg, vectrify writes:

sketch.svg                       # the best final candidate
sketch/
└── runs/
    └── 2026-04-26_14-30-21/     # one directory per run, timestamped
        ├── lineage.csv          # accepted node history (score, parent, ops)
        └── nodes/
            ├── 0.0421_0001.svg  # one file per accepted node, prefixed by score
            ├── 0.0421_0001.png  # rendered preview (--save-raster)
            └── ...

Disable artifacts you don't need with --no-write-lineage or --no-save-raster, or enable --save-heatmap to also dump perceptual diff maps next to each node.
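Because accepted nodes are prefixed by score, the best candidate of any past run can be recovered with a couple of lines; this parsing sketch relies only on the filename layout shown above:

```python
from pathlib import Path

def best_node(nodes_dir):
    """Return the lowest-scoring accepted node (filenames are <score>_<id>.svg)."""
    svgs = Path(nodes_dir).glob("*.svg")
    return min(svgs, key=lambda p: float(p.name.split("_")[0]), default=None)

# e.g. best_node("sketch/runs/2026-04-26_14-30-21/nodes")
```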
