A lightweight tool for detecting and querying NVIDIA GPU architectures (SM/CC), and generating `-gencode` flags for CUDA builds

nvidia-arch removes the guesswork and keeps your CUDA builds future‑proof and reproducible.

✅ All info has been verified by the Explorer 🤖


A lightweight tool for detecting and querying NVIDIA GPU architectures (SM/CC), and generating -gencode flags for CUDA builds; ideal for integrating into Python setup.py and custom CUDA workflows.

If you only want to read my note, see README.md.

💡 Why this exists

Working with CUDA toolchains is notoriously inconsistent across systems, CUDA versions, and GPU families. Different machines report different supported architectures, nvcc behaves differently depending on the installed CTK (CUDA Toolkit), and build scripts often end up hard‑coding SM versions that quickly become outdated.

This package solves that by providing:

  • A single reliable source of truth for supported SM and compute capabilities
  • Automatic detection of architectures from the installed CUDA Toolkit
  • Clean overrides for building against specific CUDA versions
  • Correct and reproducible generation of -gencode flags
  • Consistent behavior across Linux, Windows, WSL, and CI environments

Key features:

  • Detect installed CUDA Toolkit (CTK) and its include/lib paths
  • Query supported SM/CC architectures for any CUDA version
  • Generate correct -gencode flags for nvcc
  • Handle PTX emission cleanly (+PTX suffix or highest‑SM policy)
  • Filter architectures by GPU family (consumer, workstation, Jetson)
  • Provide PyTorch‑style CC strings (7.5;8.6;8.9+PTX)
  • Work reliably across heterogeneous environments (local, Docker, CI)

💽 Installation

Install from PyPI:

pip install nvidia-arch

Install from GitHub repo:

pip install git+https://github.com/rathaROG/nvidia-arch.git

🧪 Usage

For full details of every available function, see core.py and arches.py.

Main highlights

Print a summary of supported architectures for each CUDA version

from nvidia_arch import print_summary
print_summary(min_sm=30)
CUDA  Arch (min..max)   Consumer/Workstation (cons)                Jetson (jets)
====================================================================================
11.0  3.0..8.0          3.0;3.5;5.0;5.2;6.0;6.1;7.0;7.5            3.2;5.3;6.2;7.2
11.1  3.5..8.6          3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6            5.3;6.2;7.2
11.2  3.5..8.6          3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6            5.3;6.2;7.2
11.3  3.5..8.6          3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6            5.3;6.2;7.2
11.4  3.5..8.7          3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6            5.3;6.2;7.2;8.7
11.5  3.5..8.7          3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6            5.3;6.2;7.2;8.7
11.6  3.5..8.7          3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6            5.3;6.2;7.2;8.7
11.7  3.5..8.7          3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6            5.3;6.2;7.2;8.7
11.8  3.5..9.0          3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9        5.3;6.2;7.2;8.7
12.0  5.0..9.0          5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9            5.3;6.2;7.2;8.7
12.1  5.0..9.0          5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9            5.3;6.2;7.2;8.7
12.2  5.0..9.0          5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9            5.3;6.2;7.2;8.7
12.3  5.0..9.0          5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9            5.3;6.2;7.2;8.7
12.4  5.0..9.0          5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9            5.3;6.2;7.2;8.7
12.5  5.0..9.0          5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9            5.3;6.2;7.2;8.7
12.6  5.0..9.0          5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9            5.3;6.2;7.2;8.7
12.8  5.0..12.0         5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9;12.0       5.3;6.2;7.2;8.7
12.9  5.0..12.1         5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9;12.0;12.1  5.3;6.2;7.2;8.7
13.0  7.5..12.1         7.5;8.6;8.9;12.0;12.1                      8.7;11.0
13.1  7.5..12.1         7.5;8.6;8.9;12.0;12.1                      8.7;11.0
13.2  7.5..12.1         7.5;8.6;8.9;12.0;12.1                      8.7;11.0
====================================================================================

* All NVIDIA Architectures:
  3.0;3.2;3.5;3.7;5.0;5.2;5.3;6.0;6.1;6.2;7.0;7.2;7.5;8.0;8.6;8.7;8.8;8.9;9.0;10.0;10.1;10.3;11.0;12.0;12.1

* Other Notes:
  1. Architecture(s) 8.8 is not officially supported in CUDA 11.8–12.9.
  2. Architecture(s) 10.1 is not officially supported in CUDA 13.0–13.2.
  3. Architecture(s) 10.3 is not officially supported in CUDA 12.8–12.8.
  4. Architecture(s) 11.0 is not officially supported in CUDA 12.8–12.9.

Detect CTK (CUDA Toolkit) in your environment

import json
from nvidia_arch import detect_ctk

cuda_info = detect_ctk()
print(json.dumps(cuda_info, indent=2))
{
  "version": "13.0",
  "root": "/usr/local/cuda",
  "include": {
    "root": "/usr/local/cuda/include",
    "cuda": "/usr/local/cuda/include/cccl/cuda",
    "cub": "/usr/local/cuda/include/cccl/cub",
    "thrust": "/usr/local/cuda/include/cccl/thrust"
  },
  "lib": "/usr/local/cuda/lib64"
}
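
For context, one common way to learn the toolkit version is to read the `release X.Y` token that `nvcc --version` prints. The sketch below only illustrates that idea; it is not how `detect_ctk` is implemented. The helper names `parse_nvcc_release` and `nvcc_release` and the sample banner line are assumptions made for this example.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(text: str) -> Optional[str]:
    """Extract 'major.minor' from the text printed by `nvcc --version`."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def nvcc_release() -> Optional[str]:
    """Return the release of the nvcc found on PATH, or None if there is none."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


# Illustrative banner line (hypothetical sample, not captured from a real run).
sample = "Cuda compilation tools, release 13.0, V13.0.48"
print(parse_nvcc_release(sample))  # 13.0
```

Keeping the parsing separate from the subprocess call makes the logic easy to check on machines without a CUDA Toolkit.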

Find all NVIDIA GPU(s) installed

import json
from nvidia_arch import find_gpu

gpu_info = find_gpu(extra_query_gpu='serial,temperature.gpu')
print(json.dumps(gpu_info, indent=2))
[
  {
    "name": "NVIDIA RTX A6000",
    "compute_cap": "8.6",
    "memory.total": "49140",
    "serial": "1234567891011",
    "temperature.gpu": "44"
  },
  {
    "name": "NVIDIA RTX A6000",
    "compute_cap": "8.6",
    "memory.total": "49140",
    "serial": "1234567891012",
    "temperature.gpu": "39"
  }
]
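
The keys in the output above look like `nvidia-smi --query-gpu` field names. As a rough sketch of how such a query can be made and parsed by hand, assuming `nvidia-smi` is on PATH and the driver is recent enough to report `compute_cap`: the helpers `parse_smi_csv` and `query_gpus` are hypothetical names for this example and are not part of nvidia-arch.

```python
import csv
import io
import shutil
import subprocess
from typing import Dict, List, Sequence

FIELDS = ("name", "compute_cap", "memory.total")


def parse_smi_csv(text: str, fields: Sequence[str]) -> List[Dict[str, str]]:
    """Turn `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(fields, (cell.strip() for cell in row))) for row in rows if row]


def query_gpus(fields: Sequence[str] = FIELDS) -> List[Dict[str, str]]:
    """Query nvidia-smi; returns [] when the tool is missing or the call fails."""
    smi = shutil.which("nvidia-smi")
    if smi is None:
        return []
    cmd = [smi, f"--query-gpu={','.join(fields)}", "--format=csv,noheader,nounits"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return parse_smi_csv(result.stdout, fields) if result.returncode == 0 else []


# Sample text in the shape nvidia-smi prints for two GPUs (illustrative values).
sample = "NVIDIA RTX A6000, 8.6, 49140\nNVIDIA RTX A6000, 8.6, 49140\n"
print(parse_smi_csv(sample, FIELDS))
```

Extra fields such as `serial` or `temperature.gpu` would simply be appended to the field tuple in this sketch.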

Get compute cap of the GPU(s) installed

from nvidia_arch import get_compute_cap
get_compute_cap(return_mode='cc_string', add_ptx=True)
'8.6;8.9+PTX'

Get supported SM architectures from installed CTK (CUDA Toolkit)

from nvidia_arch import get_architectures
get_architectures(cuda_ver=None, min_sm=75)
['75', '80', ...]

Get architectures for a specific CTK (CUDA Toolkit) version

from nvidia_arch import get_architectures
get_architectures(cuda_ver="13.0", min_sm=75)
['75', '80', '86', '87', '88', '89', '90', '100', '103', '110', '120', '121']

Get architectures and filter by GPU type (Consumer, Jetson, etc.)

Supported inputs for gpu_type:

  • "all": All supported GPUs
  • "cons": Only consumer/workstation GPUs
  • "jets": Only Jetson/embedded GPUs
  • "cons+jets": Only consumer/workstation + Jetson/embedded GPUs

from nvidia_arch import get_architectures
get_architectures(gpu_type="cons", cuda_ver="13.0", min_sm=75)
['75', '86', '89', '120', '121']

Get compute capabilities instead of SM codes

from nvidia_arch import get_architectures
get_architectures(gpu_type="cons", cuda_ver="13.0", min_sm=75, return_mode="cc_list")
['7.5', '8.6', '8.9', '12.0', '12.1']
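
The relationship between SM codes and compute capabilities in the two outputs above is purely positional: the last digit is the minor version and everything before it is the major version. A minimal sketch of that mapping follows, assuming plain numeric codes with a single-digit minor version (no suffixes such as `a` or `f`). The helper names `sm_to_cc` and `cc_to_sm` are made up for this example and are not functions from the package.

```python
def sm_to_cc(sm: str) -> str:
    """'75' -> '7.5', '120' -> '12.0': the last digit is the minor version."""
    return f"{int(sm[:-1])}.{sm[-1]}"


def cc_to_sm(cc: str) -> str:
    """'7.5' -> '75', '12.0' -> '120'."""
    major, minor = cc.split(".")
    return f"{int(major)}{int(minor)}"


print([sm_to_cc(sm) for sm in ["75", "86", "89", "120", "121"]])
# ['7.5', '8.6', '8.9', '12.0', '12.1']
```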

Get PyTorch‑style architectures string with PTX

from nvidia_arch import get_architectures
get_architectures(gpu_type="cons+jets", cuda_ver="13.0", min_sm=75, return_mode="cc_string", add_ptx=True)
'7.5;8.6;8.7;8.9;11.0;12.0;12.1+PTX'

Validate a PyTorch‑style architectures string

from nvidia_arch import validate_cc_string
validate_cc_string(
    "6.1+PTX;Pascal;12.0;Lovelace",
    named_arches={"Pascal": "6.0;6.1+PTX", "Lovelace": "8.9+PTX"},
    force_highest_ptx=True,
    against_cuda_ver="12.8"
)
'6.0;6.1;8.9;12.0+PTX'

from nvidia_arch import validate_cc_string
validate_cc_string(
    "6.1+PTX;Pascal;12.0;Lovelace;13.5;0.9",
    named_arches={"Pascal": "6.0;6.1+PTX", "Lovelace": "8.9+PTX"},
    force_highest_ptx=True,
    against_cuda_ver="13.2"
)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\dev\exc\python\p311\Lib\site-packages\nvidia_arch\core.py", line 483, in validate_cc_string
    raise ValueError(f"Unknown architecture(s): {', '.join(unknown_arch)}. ")
ValueError: Unknown architecture(s): 0.9, 13.5+PTX.
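
To make the first result above easier to follow, here is a small standalone sketch of the normalization it appears to perform: expand named families, drop duplicates, sort numerically, and keep a single `+PTX` on the highest entry. This is inferred from the example output only and is not the package's code. The function name `normalize_cc_string` is hypothetical, and the sketch does not check architectures against a CUDA version the way `against_cuda_ver` does.

```python
from typing import Dict, Optional, Set, Tuple


def normalize_cc_string(
    cc_string: str, named_arches: Optional[Dict[str, str]] = None
) -> str:
    """Expand names, deduplicate, sort numerically, put +PTX on the highest entry."""
    named_arches = named_arches or {}
    seen: Set[Tuple[int, int]] = set()
    for token in cc_string.split(";"):
        # A named family expands to its own ';'-separated list.
        expanded = named_arches.get(token.strip(), token)
        for item in expanded.split(";"):
            item = item.strip()
            if item.endswith("+PTX"):
                item = item[: -len("+PTX")]
            if not item:
                continue
            major, minor = item.split(".")  # raises ValueError on malformed items
            seen.add((int(major), int(minor)))
    if not seen:
        raise ValueError("No architectures given.")
    # Tuples of ints sort numerically, so 12.0 lands after 8.9.
    return ";".join(f"{major}.{minor}" for major, minor in sorted(seen)) + "+PTX"


print(
    normalize_cc_string(
        "6.1+PTX;Pascal;12.0;Lovelace",
        named_arches={"Pascal": "6.0;6.1+PTX", "Lovelace": "8.9+PTX"},
    )
)
# 6.0;6.1;8.9;12.0+PTX
```

Sorting on `(major, minor)` integer pairs rather than on strings is what keeps `12.0` after `8.9`.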

Generate nvcc -gencode flags in setup.py

from nvidia_arch import get_architectures, make_gencode_flags
arches = get_architectures(gpu_type="jets", cuda_ver="13.0", min_sm=75)
make_gencode_flags(arches, add_ptx=True)
# extra_compile_args["nvcc"] += make_gencode_flags(arches)
['-gencode=arch=compute_87,code=sm_87', '-gencode=arch=compute_110,code=[sm_110,compute_110]']

See a real example in BEVFusionx.
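
For readers unfamiliar with the flag layout shown in the output above, the following sketch reproduces it by hand: each architecture gets `arch=compute_XX,code=sm_XX`, and with PTX enabled the last entry also embeds `compute_XX`. It is an illustration under the assumption that the input list is already sorted in ascending order; the function name `gencode_flags` is made up here and is not a replacement for `make_gencode_flags`.

```python
from typing import List


def gencode_flags(sm_codes: List[str], add_ptx: bool = False) -> List[str]:
    """Build one -gencode flag per SM code; optionally embed PTX for the last one."""
    flags = []
    for index, sm in enumerate(sm_codes):
        is_last = index == len(sm_codes) - 1
        if add_ptx and is_last:
            # SASS for sm_XX plus PTX for compute_XX, so newer GPUs can JIT-compile.
            flags.append(f"-gencode=arch=compute_{sm},code=[sm_{sm},compute_{sm}]")
        else:
            flags.append(f"-gencode=arch=compute_{sm},code=sm_{sm}")
    return flags


print(gencode_flags(["87", "110"], add_ptx=True))
# ['-gencode=arch=compute_87,code=sm_87', '-gencode=arch=compute_110,code=[sm_110,compute_110]']
```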

📝 License

LICENSE
