A lightweight tool for detecting and querying NVIDIA GPU architectures (SM/CC), and generating `-gencode` flags for CUDA builds
“nvidia-arch removes the guesswork and keeps your CUDA builds future‑proof and reproducible.”
✅ All info has been verified by the Explorer 🤖
A lightweight tool for detecting and querying NVIDIA GPU architectures (SM/CC), and generating -gencode flags for CUDA builds; ideal for integrating into Python setup.py and custom CUDA workflows.
If you just want to see my note, see README.md.
💡 Why this exists
Working with CUDA toolchains is notoriously inconsistent across systems, CUDA versions, and GPU families. Different machines report different supported architectures, nvcc behaves differently depending on the installed CTK (CUDA Toolkit), and build scripts often end up hard‑coding SM versions that quickly become outdated.
This package solves that by providing:
- A single reliable source of truth for supported SM and compute capabilities
- Automatic detection of architectures from the installed CUDA Toolkit
- Clean overrides for building against specific CUDA versions
- Correct and reproducible generation of `-gencode` flags
- Consistent behavior across Linux, Windows, WSL, and CI environments
Key features:
- Detect installed CUDA Toolkit (CTK) and its include/lib paths
- Query supported SM/CC architectures for any CUDA version
- Generate correct `-gencode` flags for nvcc
- Handle PTX emission cleanly (`+PTX` suffix or highest‑SM policy)
- Filter architectures by GPU family (consumer, workstation, Jetson)
- Provide PyTorch‑style CC strings (`7.5;8.6;8.9+PTX`)
- Work reliably across heterogeneous environments (local, Docker, CI)
💽 Installation
Install from PyPI:
pip install nvidia-arch
Install from GitHub repo:
pip install git+https://github.com/rathaROG/nvidia-arch.git
🧪 Usage
For full details of every available function, see core.py and arches.py.
Main highlights
Print a summary of supported architectures for each CUDA version
from nvidia_arch import print_summary
print_summary(min_sm=30)
CUDA Arch (min..max) Consumer/Workstation (cons) Jetson (jets)
=========================================================================================
11.0 3.0..8.0 3.0;3.5;5.0;5.2;6.0;6.1;7.0;7.5 3.2;5.3;6.2;7.2
11.1 3.5..8.6 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6 5.3;6.2;7.2
11.2 3.5..8.6 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6 5.3;6.2;7.2
11.3 3.5..8.6 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6 5.3;6.2;7.2
11.4 3.5..8.7 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6 5.3;6.2;7.2;8.7
11.5 3.5..8.7 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6 5.3;6.2;7.2;8.7
11.6 3.5..8.7 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6 5.3;6.2;7.2;8.7
11.7 3.5..8.7 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6 5.3;6.2;7.2;8.7
11.8 3.5..9.0 3.5;5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9 5.3;6.2;7.2;8.7
12.0 5.0..9.0 5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9 5.3;6.2;7.2;8.7
12.1 5.0..9.0 5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9 5.3;6.2;7.2;8.7
12.2 5.0..9.0 5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9 5.3;6.2;7.2;8.7
12.3 5.0..9.0 5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9 5.3;6.2;7.2;8.7
12.4 5.0..9.0 5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9 5.3;6.2;7.2;8.7
12.5 5.0..9.0 5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9 5.3;6.2;7.2;8.7
12.6 5.0..9.0 5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9 5.3;6.2;7.2;8.7
12.8 5.0..12.0 5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9;12.0 5.3;6.2;7.2;8.7;10.1
12.9 5.0..12.1 5.0;5.2;6.0;6.1;7.0;7.5;8.6;8.9;12.0;12.1 5.3;6.2;7.2;8.7;10.1
13.0 7.5..12.1 7.5;8.6;8.9;12.0;12.1 8.7;11.0
13.1 7.5..12.1 7.5;8.6;8.9;12.0;12.1 8.7;11.0
13.2 7.5..12.1 7.5;8.6;8.9;12.0;12.1 8.7;11.0
=========================================================================================
* All NVIDIA Architectures:
3.0;3.2;3.5;3.7;5.0;5.2;5.3;6.0;6.1;6.2;7.0;7.2;7.5;8.0;8.6;8.7;8.8;8.9;9.0;10.0;10.1;10.3;11.0;12.0;12.1
* Other Notes:
1. Architecture(s) 8.8 is not officially supported in CUDA 11.8–12.9.
2. Architecture(s) 10.1 is not officially supported in CUDA 13.0–13.2.
3. Architecture(s) 10.3 is not officially supported in CUDA 12.8–12.8.
4. Architecture(s) 11.0 is not officially supported in CUDA 12.8–12.9.
Detect CTK (CUDA Toolkit) in your environment
import json
from nvidia_arch import detect_ctk
cuda_info = detect_ctk()
print(json.dumps(cuda_info, indent=2))
{
"version": "13.0",
"root": "/usr/local/cuda",
"include": {
"root": "/usr/local/cuda/include",
"cuda": "/usr/local/cuda/include/cccl/cuda",
"cub": "/usr/local/cuda/include/cccl/cub",
"thrust": "/usr/local/cuda/include/cccl/thrust"
},
"lib": "/usr/local/cuda/lib64"
}
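How `detect_ctk()` finds the toolkit is not described here. As a rough illustration of one common approach, the sketch below pulls the release number out of the text that `nvcc --version` prints. The helper name `parse_nvcc_release` and the sample text are assumptions made for this example; they are not part of nvidia-arch.

```python
import re
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Return the 'major.minor' release found in `nvcc --version` text, or None."""
    # nvcc prints a line such as: "Cuda compilation tools, release 13.0, V13.0.48"
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


# Illustrative sample text (hypothetical build number), not captured from a real run.
sample = "Cuda compilation tools, release 13.0, V13.0.48"
print(parse_nvcc_release(sample))
```

In a real script the input text would come from running `nvcc --version`, for example through `subprocess.run`.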
Find all NVIDIA GPU(s) installed
import json
from nvidia_arch import find_gpus
gpu_info = find_gpus(extra_query_gpu='serial,temperature.gpu')
print(json.dumps(gpu_info, indent=2))
[
{
"name": "NVIDIA RTX A6000",
"compute_cap": "8.6",
"memory.total": "49140",
"serial": "1234567891011",
"temperature.gpu": "44"
},
{
"name": "NVIDIA RTX A6000",
"compute_cap": "8.6",
"memory.total": "49140",
"serial": "1234567891012",
"temperature.gpu": "39"
}
]
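The keys in this output match field names accepted by `nvidia-smi --query-gpu`. Assuming the data comes from a call such as `nvidia-smi --query-gpu=name,compute_cap,memory.total --format=csv,noheader,nounits`, the sketch below shows how that CSV text could be turned into a list of dictionaries. `parse_gpu_csv` is a hypothetical helper written for this example, and the sample text is illustrative.

```python
from typing import Dict, List


def parse_gpu_csv(csv_text: str, fields: List[str]) -> List[Dict[str, str]]:
    """Map each non-empty CSV line onto the given field names."""
    gpus = []
    for line in csv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        values = [value.strip() for value in line.split(",")]
        gpus.append(dict(zip(fields, values)))
    return gpus


fields = ["name", "compute_cap", "memory.total"]
# Illustrative text in the shape nvidia-smi emits with csv,noheader,nounits.
sample = "NVIDIA RTX A6000, 8.6, 49140\nNVIDIA RTX A6000, 8.6, 49140\n"
gpus = parse_gpu_csv(sample, fields)
print(gpus)
```

Splitting on commas is enough for these fields; a field whose value itself contains a comma would need the `csv` module instead.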
Get compute cap of the GPU(s) installed
from nvidia_arch import get_compute_caps
get_compute_caps(return_mode='cc_string', add_ptx=True)
'8.6;8.9+PTX'
Get supported SM architectures from installed CTK (CUDA Toolkit)
from nvidia_arch import get_arches
get_arches(cuda_ver=None, min_sm=75)
['75', '80', ...]
Get architectures for a specific CTK (CUDA Toolkit) version
from nvidia_arch import get_arches
get_arches(cuda_ver="13.0", min_sm=75)
['75', '80', '86', '87', '88', '89', '90', '100', '103', '110', '120', '121']
Get architectures and filter by GPU type (Consumer, Jetson, etc.)
Supported inputs for gpu_type:
- "all": All supported GPUs
- "cons": Only consumer/workstation GPUs
- "jets": Only Jetson/embedded GPUs
- "dcen": Only datacenter GPUs
- "cons+jets": Only consumer/workstation + Jetson/embedded GPUs
from nvidia_arch import get_arches
get_arches(gpu_type="cons", cuda_ver="13.0", min_sm=75)
['75', '86', '89', '120', '121']
Get compute capabilities instead of SM codes
from nvidia_arch import get_arches
get_arches(gpu_type="cons", cuda_ver="13.0", min_sm=75, return_mode="cc_list")
['7.5', '8.6', '8.9', '12.0', '12.1']
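The relationship between the two notations is simple: the last digit of an SM code is the minor version and everything before it is the major version. A minimal sketch of that mapping follows; `sm_to_cc` and `cc_to_sm` are hypothetical helpers for illustration only, and they assume plain numeric codes.

```python
def sm_to_cc(sm: str) -> str:
    """Convert an SM code such as '75' or '120' to a compute capability string."""
    return f"{sm[:-1]}.{sm[-1]}"


def cc_to_sm(cc: str) -> str:
    """Convert a compute capability such as '7.5' or '12.0' to an SM code."""
    major, minor = cc.split(".")
    return f"{major}{minor}"


sm_codes = ["75", "86", "89", "120", "121"]
cc_list = [sm_to_cc(sm) for sm in sm_codes]
print(cc_list)
print([cc_to_sm(cc) for cc in cc_list])
```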
Get PyTorch‑style architectures string with PTX
from nvidia_arch import get_arches
get_arches(gpu_type="cons+jets", cuda_ver="13.0", min_sm=75, return_mode="cc_string", add_ptx=True)
'7.5;8.6;8.7;8.9;11.0;12.0;12.1+PTX'
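A string in this format is what PyTorch's extension builder reads from the `TORCH_CUDA_ARCH_LIST` environment variable. The sketch below sets that variable from a literal copied from the output above. Whether a given PyTorch release accepts every value in the list depends on that release, so treat this as an illustration rather than a guaranteed recipe.

```python
import os

# Literal copied from the example output above; in a real build script this
# would be the string returned by get_arches(..., return_mode="cc_string").
arch_string = "7.5;8.6;8.7;8.9;11.0;12.0;12.1+PTX"

# The variable has to be set before the extension build starts.
os.environ["TORCH_CUDA_ARCH_LIST"] = arch_string
print(os.environ["TORCH_CUDA_ARCH_LIST"])
```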
Normalize architectures
from nvidia_arch import normalize_arches, get_arches
normalize_arches(['75', '86', '89+PTX'], return_mode='cc_string')
normalize_arches('7.5;8.6;8.9+PTX', exclude='8.6', return_mode='cc_string')
arches = get_arches(cuda_ver=12.8, return_mode='cc_string', add_ptx=True)
normalize_arches(arches, exclude='10.1', return_mode='cc_string')
'7.5;8.6;8.9+PTX'
'7.5;8.9+PTX'
'5.0;5.2;5.3;6.0;6.1;6.2;7.0;7.2;7.5;8.0;8.6;8.7;8.9;9.0;10.0;12.0+PTX'
Validate a PyTorch‑style architectures string
from nvidia_arch import validate_arch_string
validate_arch_string(
"6.1+PTX;Pascal;12.0;Lovelace",
named_arches={"Pascal": "6.0;6.1+PTX", "Lovelace": "8.9+PTX"},
force_highest_ptx=True,
against_cuda_ver="12.8"
)
'6.0;6.1;8.9;12.0+PTX'
from nvidia_arch import validate_arch_string
validate_arch_string(
"6.1+PTX;Pascal;12.0;Lovelace;13.5;0.9",
named_arches={"Pascal": "6.0;6.1+PTX", "Lovelace": "8.9+PTX"},
force_highest_ptx=True,
against_cuda_ver="13.2"
)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\dev\exc\python\p311\Lib\site-packages\nvidia_arch\core.py", line 483, in validate_arch_string
raise ValueError(f"Unknown architecture(s): {', '.join(unknown_arch)}. ")
ValueError: Unknown architecture(s): 0.9, 13.5+PTX.
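To make the first call easier to follow, here is a simplified sketch of the steps its output suggests: expand named architectures, drop duplicates, sort numerically, and attach `+PTX` to the highest entry only. `force_highest_ptx` is a hypothetical function written for this example. It does not check against a CUDA version and is not the library's implementation.

```python
from typing import Dict


def force_highest_ptx(arch_string: str, named_arches: Dict[str, str]) -> str:
    """Expand names, deduplicate, sort, and put +PTX on the highest entry."""
    versions = set()
    for token in arch_string.split(";"):
        token = token.strip()
        if not token:
            continue
        # A named architecture is replaced by its own ';'-separated list.
        expanded = named_arches.get(token, token)
        for item in expanded.split(";"):
            cc = item.strip().replace("+PTX", "")
            major, minor = cc.split(".")  # raises ValueError for unknown names
            versions.add((int(major), int(minor)))
    ordered = sorted(versions)  # numeric order, so 12.0 sorts after 8.9
    if not ordered:
        return ""
    return ";".join(f"{major}.{minor}" for major, minor in ordered) + "+PTX"


result = force_highest_ptx(
    "6.1+PTX;Pascal;12.0;Lovelace",
    {"Pascal": "6.0;6.1+PTX", "Lovelace": "8.9+PTX"},
)
print(result)
```

Sorting on integer tuples rather than on strings matters here, because a plain string sort would place `12.0` before `6.0`.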
Generate nvcc `-gencode` flags in setup.py
from nvidia_arch import get_arches, make_gencode_flags
arches = get_arches(gpu_type="jets", cuda_ver="13.0", min_sm=75)
make_gencode_flags(arches, add_ptx=True)
# extra_compile_args["nvcc"] += make_gencode_flags(arches)
['-gencode=arch=compute_87,code=sm_87', '-gencode=arch=compute_110,code=[sm_110,compute_110]']
See a real example in BEVFusionx.
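The flag format in the output above can be reproduced with a few lines of plain Python. In the sketch below, every SM code gets a native `sm_XX` target and, when PTX is requested, the highest one also embeds `compute_XX`. `sketch_gencode_flags` is a hypothetical stand-in written to show the format; it assumes plain numeric SM codes and is not the library's `make_gencode_flags`.

```python
from typing import List


def sketch_gencode_flags(sm_codes: List[str], add_ptx: bool = True) -> List[str]:
    """Build one -gencode flag per SM code, with PTX on the highest if requested."""
    ordered = sorted(sm_codes, key=int)  # numeric order, so '110' follows '87'
    flags = []
    for index, sm in enumerate(ordered):
        is_highest = index == len(ordered) - 1
        if add_ptx and is_highest:
            # Embed both machine code and PTX for forward compatibility.
            flags.append(f"-gencode=arch=compute_{sm},code=[sm_{sm},compute_{sm}]")
        else:
            flags.append(f"-gencode=arch=compute_{sm},code=sm_{sm}")
    return flags


flags = sketch_gencode_flags(["87", "110"])
print(flags)
```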
Deprecation
The following legacy function names are deprecated and will be removed in version 10.0.0:
- find_gpu() → use find_gpus() instead
- get_compute_cap() → use get_compute_caps() instead
- get_architectures() → use get_arches() instead
- validate_cc_string() → use validate_arch_string() instead
You can continue using the old names until v10.0.0, but all new code and documentation now use the new, more Pythonic API. Importing or calling any deprecated function will issue a DeprecationWarning.
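For readers unfamiliar with this pattern, the sketch below shows one generic way an old name can forward to a new function while emitting a `DeprecationWarning` on each call. All names ending in `_demo` are hypothetical stand-ins, the sketch covers calls only (not imports), and the package's actual mechanism may differ.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name):
    """Return a wrapper that warns, then forwards the call to new_func."""
    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name}() is deprecated; use {new_func.__name__}() instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not the wrapper
        )
        return new_func(*args, **kwargs)
    wrapper.__name__ = old_name
    return wrapper


def get_arches_demo():
    """Stand-in for a renamed function."""
    return ["75", "86"]


get_architectures_demo = deprecated_alias(get_arches_demo, "get_architectures_demo")

# Record warnings so the example can show what was raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = get_architectures_demo()

print(result)
print(caught[0].category.__name__)
```

In CI, running Python with `-W error::DeprecationWarning` turns such warnings into errors, which helps find remaining uses of the old names before they are removed.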
📝 License