CLI tool for analyzing Nsight Compute kernel output

profile-kernel

A CLI tool that profiles CUDA kernels with NVIDIA Nsight Compute, extracts source-level performance information, and automatically highlights performance bottlenecks (e.g. excessive memory access or warp stalls).

Features

  • Parses Nsight Compute CSV output
  • Annotates source lines with Access and Stall warnings
  • Filters the output to show only lines with a chosen warning type
  • Exports as a clean, annotated CSV
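
The parse, annotate, filter, and export steps listed above can be pictured with a minimal sketch. This is not profile-kernel's implementation: the input column names (`Source`, `Sectors`, `Stall Samples`), both thresholds, the sample numbers, and the helper names `annotate` and `filter_rows` are hypothetical, and real Nsight Compute exports may use different headers. Treating `warning` as "any warning" is also an assumption.

```python
import csv
import io

# Hypothetical stand-in for a per-line CSV export; headers and numbers are invented.
SAMPLE_CSV = """Source,Sectors,Stall Samples
x = a[i],4096,3
y = x * 2,0,0
z = b[j],64,120
"""

ACCESS_THRESHOLD = 1024  # assumed cutoff, not taken from profile-kernel
STALL_THRESHOLD = 100    # assumed cutoff, not taken from profile-kernel


def annotate(rows):
    """Attach a 'Warning Type' field to each row using the assumed thresholds."""
    annotated = []
    for row in rows:
        warnings = []
        if int(row["Sectors"]) > ACCESS_THRESHOLD:
            warnings.append("Access")
        if int(row["Stall Samples"]) > STALL_THRESHOLD:
            warnings.append("Stall")
        annotated.append({**row, "Warning Type": "/".join(warnings)})
    return annotated


def filter_rows(rows, kind):
    """Keep rows by kind: 'access', 'stall', or 'warning' (assumed: any warning)."""
    if kind == "warning":
        return [r for r in rows if r["Warning Type"]]
    return [r for r in rows if kind.capitalize() in r["Warning Type"].split("/")]


# Parse, annotate, then keep only the stalled lines.
rows = annotate(csv.DictReader(io.StringIO(SAMPLE_CSV)))
stalled = filter_rows(rows, "stall")

# Export the filtered rows as an annotated CSV (to memory here, a file in practice).
out = io.StringIO()
writer = csv.DictWriter(
    out, fieldnames=["Warning Type", "Source", "Sectors", "Stall Samples"]
)
writer.writeheader()
writer.writerows(stalled)
```

Keeping annotation and filtering as separate passes means the same annotated rows can be filtered several ways without re-parsing the input.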

Installation

pip install profile-kernel

Usage

profile-kernel \
  --ncu_path /path/to/ncu \
  --exe_path /path/to/python \
  --filepath my_kernel.py \
  --output_filename output \
  [--filter access|stall|warning]  # optional
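
When the profiler is driven from a script, the same invocation can be assembled in Python. The sketch below uses only the flags shown above; the helper name `build_command` is hypothetical, and the paths are the placeholders from the usage example.

```python
import shlex
import subprocess


def build_command(ncu_path, exe_path, filepath, output_filename, filter_kind=None):
    """Assemble the profile-kernel argument list from the documented flags."""
    cmd = [
        "profile-kernel",
        "--ncu_path", ncu_path,
        "--exe_path", exe_path,
        "--filepath", filepath,
        "--output_filename", output_filename,
    ]
    if filter_kind is not None:
        # Only the three values listed in the usage text are accepted here.
        if filter_kind not in ("access", "stall", "warning"):
            raise ValueError(f"unknown filter: {filter_kind!r}")
        cmd += ["--filter", filter_kind]
    return cmd


cmd = build_command("/path/to/ncu", "/path/to/python", "my_kernel.py", "output", "stall")
print(shlex.join(cmd))  # shell-quoted form, handy for logging

# Uncomment on a machine with profile-kernel and Nsight Compute installed:
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.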

Example Output

The output CSV will contain columns like:

#, Warning Type, Warning Info, Address, SASS, ...
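
A quick way to summarize such a file is to count rows per warning type. The sketch below runs on made-up rows that reuse the column names listed above; the row values are invented for illustration, the helper name `count_warnings` is hypothetical, and the space after each comma is assumed from the header as printed (`skipinitialspace=True` tolerates it either way).

```python
import csv
import io
from collections import Counter

# Made-up rows using the listed column names; real output has more columns.
SAMPLE_OUTPUT = """#, Warning Type, Warning Info, Address, SASS
1, Access, placeholder info, 0x0010, LDG.E R2
2, Stall, placeholder info, 0x0020, FADD R4
3, Access, placeholder info, 0x0030, STG.E R6
"""


def count_warnings(handle):
    """Count rows per 'Warning Type', skipping rows with no warning."""
    reader = csv.DictReader(handle, skipinitialspace=True)
    return Counter(row["Warning Type"] for row in reader if row["Warning Type"])


counts = count_warnings(io.StringIO(SAMPLE_OUTPUT))
for warning_type, n in counts.most_common():
    print(f"{warning_type}: {n}")
```

For a real run, pass an open file instead of the in-memory sample, for example `open("output.csv", newline="")`, assuming the tool appends `.csv` to the `--output_filename` value.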

License

MIT License © 2025 Arjun Menon

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Download files

Source Distribution

profile_kernel-0.1.2.tar.gz (3.7 kB view details)

Uploaded Source

Built Distribution

profile_kernel-0.1.2-py3-none-any.whl (4.2 kB view details)

Uploaded Python 3

File details

Details for the file profile_kernel-0.1.2.tar.gz.

File metadata

  • Download URL: profile_kernel-0.1.2.tar.gz
  • Size: 3.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.12

File hashes

Hashes for profile_kernel-0.1.2.tar.gz
Algorithm Hash digest
SHA256 e28b0874f38bbfce2622109ce935bbe1800492d779222875f6fa63bdcbacb43c
MD5 3ed03ee66717e7a2be41306820762e33
BLAKE2b-256 ecd7afe0181ca5b6fb08383d2d36b00a24f56812ec7b0535ef913a94b1ce6e3c

File details

Details for the file profile_kernel-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: profile_kernel-0.1.2-py3-none-any.whl
  • Size: 4.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.10.12

File hashes

Hashes for profile_kernel-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 cda676fec46f174b2c114b2cf2fed867a5781fe57e288d3efb39cad5d72426a6
MD5 3e7a44b5c9876e2ae57233c89a5962b2
BLAKE2b-256 65199d167b72e3e00e5d7d0ecf9ce1c604426ef813c71851a670835416dc2584
