Skip to main content

GPU package manager — find prebuilt CUDA wheels, build missing ones, generate uv pyproject.toml

Project description

gpkg

PyPI Python License: MIT

GPU package manager. Stop compiling CUDA extensions.

demo

pip install gpkg
gpkg add flash-attn causal-conv1d mamba-ssm
# done. resolved, locked, installed.

The problem

These packages take 20-60 minutes to compile and regularly fail:

Package Typical compile time Common failure
flash-attn 25 min OOM during build, CUDA mismatch
flash-attn-3 20 min SM90+ only, rare wheel coverage
causal-conv1d 10 min Torch ABI mismatch
mamba-ssm 15 min Cascading causal-conv1d failure
natten 30 min CUTLASS dependency, arch-specific
sageattention 10 min Windows build nightmare
grouped-gemm 10 min MoE stack dependency

Prebuilt wheels exist across dozens of GitHub repos and pip indexes. Finding the right wheel for your exact python + torch + cuda + platform combo is a scavenger hunt nobody should repeat.

Install

pip install gpkg

# or with uv
uv tool install gpkg

# or from source
git clone https://github.com/Mapika/gpkg && cd gpkg
uv tool install .

Usage

Just add packages

# Auto-detect torch + cuda, resolve wheels, lock, install
gpkg add flash-attn causal-conv1d mamba-ssm

# If no prebuilt wheel exists, it builds from source (optimized)
gpkg add flash-attn causal-conv1d mamba-ssm  # --build-missing is automatic

Resolve without installing

# Explicit versions
gpkg --torch 2.11.0 --cuda 130 flash-attn flash-attn-3 -o pyproject.toml

# Auto-detect torch and CUDA from your environment
gpkg flash-attn causal-conv1d mamba-ssm

Lockfile for reproducible installs

# Write a lockfile with exact versions and URLs
gpkg --torch 2.11.0 --cuda 130 flash-attn causal-conv1d --lock

# Install from lockfile (no network needed)
gpkg install -o pyproject.toml

# Install + sync in one step
gpkg install --sync

Verify your environment

gpkg test flash-attn causal-conv1d mamba-ssm
  ok  torch 2.11.0+cu128  GPU: NVIDIA GeForce RTX 5070 Ti
  ok  flash-attn 2.8.3
  ok  causal-conv1d 1.6.1
  ok  mamba-ssm 2.3.1

See what's available

gpkg --available causal-conv1d natten
causal-conv1d -- available wheels (linux_x86_64)
┏━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ CUDA ┃ PyTorch versions        ┃
┡━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 12   │ 2.10, 2.9, 2.8, 2.7    │
│ 13   │ 2.10, 2.9              │
└──────┴────────────────────────┘

Diagnostics

gpkg --explain --torch 2.11 --cuda 130 flash-attn   # why a wheel was/wasn't selected
gpkg --doctor --torch 2.11 --cuda 130 flash-attn     # verify URLs are accessible

Cache management

gpkg --cache-info                    # show cache statistics
gpkg --cache-clean                   # clean all cached data
gpkg --cache-clean --older-than 1h   # clean entries older than 1 hour

When there's no wheel: compile fast

When gpkg add can't find a prebuilt wheel, it automatically builds from source with optimized settings:

  • Detects your GPU arch via nvidia-smi → builds for only that arch
  • Uses all CPU cores (capped by RAM to prevent OOM)
  • Enables ninja for parallel builds
  • Caches the built wheel so you never compile the same package twice

You can also use build-env.sh manually:

source build-env.sh
uv add causal-conv1d    # 5-10x faster than default

How it works

  1. Checks the hosted registry at wheels.mapika.dev for cached wheels (fast)
  2. Falls back to GitHub releases API and pip find-links indexes
  3. Matches wheels against your torch + cuda + python + platform
  4. Picks the best match per package (latest version, prefers non-manylinux)
  5. If no wheel exists and --build-missing is set, compiles from source
  6. Emits a valid pyproject.toml with [tool.uv.sources] pointing at direct URLs

Registry

The registry tracks 9 sources across 7 packages:

Package Sources
flash-attn mjun0812/flash-attention-prebuild-wheels, Dao-AILab/flash-attention
flash-attn-3 mjun0812/flash-attention-prebuild-wheels
causal-conv1d Dao-AILab/causal-conv1d
mamba-ssm state-spaces/mamba
natten SHI-Labs/NATTEN
grouped-gemm fanshiqing/grouped_gemm
sageattention woct0rdho/SageAttention, mobcat40/sageattention-blackwell

Adding a source

Edit src/gpkg/registry.toml and open a PR:

[[sources]]
package      = "causal-conv1d"
description  = "causal-conv1d -- your torch 2.11 builds"
type         = "github"
repo         = "yourname/causal-conv1d-wheels"
wheel_name   = "causal_conv1d-{version}+cu{cuda}torch{torch}-{pytag}-{platform}.whl"
cuda_style   = "full"
scan_tags    = 5

Configuration

Env var Purpose
GITHUB_TOKEN Raise API rate limit 60 to 5000 req/hr
UVFORGE_TOKEN Bearer token for private registries
UVFORGE_TOKEN_<HOST> Host-specific token (e.g. UVFORGE_TOKEN_WHEELS_MYCO_COM)
UVFORGE_REGISTRY Override default registry path/URL

Private registries also support ~/.netrc for credential storage.

CI Usage

- name: Install GPU packages
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  run: |
    pip install gpkg
    gpkg add flash-attn causal-conv1d mamba-ssm

For machine-readable output:

gpkg --list --json                           # all registered sources
gpkg --available flash-attn --json           # cuda/torch combos
gpkg --torch 2.10 --cuda 128 flash-attn --json  # resolved wheel URLs

Development

git clone https://github.com/Mapika/gpkg && cd gpkg
uv sync
uv run pytest -v
uv run ruff check src/

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gpkg-0.4.3.tar.gz (1.8 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

gpkg-0.4.3-py3-none-any.whl (31.2 kB view details)

Uploaded Python 3

File details

Details for the file gpkg-0.4.3.tar.gz.

File metadata

  • Download URL: gpkg-0.4.3.tar.gz
  • Upload date:
  • Size: 1.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.17

File hashes

Hashes for gpkg-0.4.3.tar.gz
Algorithm Hash digest
SHA256 f28a5733acfe98c8747bf310af1da373d7a97cdd59142eebde0145f431627fbe
MD5 e7b315161e1b3a4d859f4ddf2cd473bf
BLAKE2b-256 dfd275a4c29d0a000c59019e272ead5d2444d9b0e56036cacfe63d5f8de2494a

See more details on using hashes here.

File details

Details for the file gpkg-0.4.3-py3-none-any.whl.

File metadata

  • Download URL: gpkg-0.4.3-py3-none-any.whl
  • Upload date:
  • Size: 31.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.17

File hashes

Hashes for gpkg-0.4.3-py3-none-any.whl
Algorithm Hash digest
SHA256 f575f3d5cce8e6a198c6b478a158f51cf879e7cc208d37aa6840d108e9920a79
MD5 932e42a86504c92041d6eaba9853ee51
BLAKE2b-256 ebb2d449d2a593a9f2c96021b46386c2606777c14eeee6154dddd9b6a4d6cb41

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page