NVIDIA Profiler tools

Tools for working with nvprof SQLite files, specifically for profiling scripts that train deep learning models. The files can be big and thus slow to scp and to open in NVVP. This tool aims to extract the small bits of important information and make profiling in NVVP faster.

You can remove a large number of unimportant events and take a small time slice, shrinking the SQLite database to a few MB.

sliced nvprof in NVVP


Installing

Install the nvprof package if you just want to use it:

$ pip install nvprof

…or for development:

$ pip install -e .

Features

$ nvprof_tools --help
usage: nvprof_tools [-h] {info,truncate,slice} ...

NVIDIA Profiler tools

positional arguments:
  {info,truncate,slice}

optional arguments:
  -h, --help            show this help message and exit

$ nvprof_tools slice --help
usage: nvprof_tools slice [-h] [-s START] [-e END] db_file

positional arguments:
  db_file

optional arguments:
  -h, --help            show this help message and exit
  -s START, --start START
                        start time (sec)
  -e END, --end END     end time (sec)

Summary about the file

It can show:

  • total time (can be used to decide which time slice to take in nvvp)

  • number of events per table, sorted in descending order

  • compute utilization percentage

  • number of GPUs

$ nvprof_tools info foo.sqlite
Number of GPUs: 1
Compute utilization: 10.07 %
Total time: 6.659 sec
Total number of events: 516874
Events by table:
CUPTI_ACTIVITY_KIND_RUNTIME : 348080
CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL : 63792
CUPTI_ACTIVITY_KIND_DRIVER : 48279
CUPTI_ACTIVITY_KIND_SYNCHRONIZATION : 19741
CUPTI_ACTIVITY_KIND_CUDA_EVENT : 17860
CUPTI_ACTIVITY_KIND_MEMCPY : 15974
CUPTI_ACTIVITY_KIND_MEMSET : 2816
CUPTI_ACTIVITY_KIND_OVERHEAD : 309
CUPTI_ACTIVITY_KIND_STREAM : 12
CUPTI_ACTIVITY_KIND_DEVICE_ATTRIBUTE : 8
CUPTI_ACTIVITY_KIND_NAME : 1
CUPTI_ACTIVITY_KIND_CONTEXT : 1
CUPTI_ACTIVITY_KIND_DEVICE : 1
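
The per-table counts above can be reproduced with plain `sqlite3`. The following is a minimal sketch, not the tool's actual implementation: the function name `count_events_by_table` is hypothetical, and it assumes only that event tables share the `CUPTI_ACTIVITY_KIND_` prefix seen in the output.

```python
import sqlite3


def count_events_by_table(conn, prefix="CUPTI_ACTIVITY_KIND_"):
    """Return (table_name, row_count) pairs, largest table first."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    ]
    counts = []
    for name in names:
        # Assumption: only tables with this prefix hold profiler events.
        if not name.startswith(prefix):
            continue
        # Table names cannot be bound as parameters, so quote the identifier.
        (n,) = conn.execute('SELECT COUNT(*) FROM "{}"'.format(name)).fetchone()
        counts.append((name, n))
    return sorted(counts, key=lambda item: item[1], reverse=True)
```

Calling it with `sqlite3.connect("foo.sqlite")` and printing each pair as `name : count` gives a listing in the same shape as the one above.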

With multiple GPUs, compute utilization is calculated for each device:

Number of GPUs: 4
Compute utilization (mean): 43.04 %
  GPU 0: 42.86 %
  GPU 1: 42.34 %
  GPU 2: 43.42 %
  GPU 3: 43.55 %
Total time: 35.041 sec
Total number of events: 5670557
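
One plausible way to arrive at such percentages is to measure, per device, how long at least one kernel was running and divide by the total time. The sketch below works under that assumption. The tool's exact definition is not stated here, and `busy_time` and `utilization_percent` are hypothetical names. Overlapping kernel intervals are merged so that concurrent kernels are not counted twice.

```python
def busy_time(intervals):
    """Total length of the union of (start, end) intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged run.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps the current run: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def utilization_percent(kernels_by_device, total_time):
    """Map device id -> percent of total_time with a kernel running."""
    return {
        dev: 100.0 * busy_time(ivals) / total_time
        for dev, ivals in kernels_by_device.items()
    }


def mean_utilization(per_device):
    """Arithmetic mean of the per-device percentages."""
    return sum(per_device.values()) / len(per_device)
```

The mean reported in the output above is consistent with a plain arithmetic mean of the four per-GPU values: (42.86 + 42.34 + 43.42 + 43.55) / 4 rounds to 43.04.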

Remove unnecessary events

Typically about 80% of the events are CUDA runtime/driver calls, which are not essential for profiling deep learning scripts. Let's remove them.

NOTE: This overwrites the input file.

$ nvprof_tools truncate foo.sqlite

For example, we shrank a database from 29 MB to 8 MB.
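
To illustrate the idea, here is a rough sketch of an in-place truncation with `sqlite3`. It is an assumption about how such a step could work, not the tool's source: `drop_rows` is a hypothetical name, and the two table names are taken from the `info` output above. The file only gets smaller after `VACUUM`, because deleting rows alone leaves the freed pages inside the database.

```python
import sqlite3

# Assumed table names, taken from the `info` output above.
UNIMPORTANT_TABLES = (
    "CUPTI_ACTIVITY_KIND_RUNTIME",
    "CUPTI_ACTIVITY_KIND_DRIVER",
)


def drop_rows(db_path, tables=UNIMPORTANT_TABLES):
    """Delete all rows from the given tables, then compact the file in place."""
    conn = sqlite3.connect(db_path)
    try:
        existing = {
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        }
        for name in tables:
            # Skip tables that this particular profile does not contain.
            if name in existing:
                conn.execute('DELETE FROM "{}"'.format(name))
        conn.commit()
        # VACUUM rebuilds the file so the freed pages are actually released.
        conn.execute("VACUUM")
    finally:
        conn.close()
```

Like the command above, this modifies the file it is given, so work on a copy if the full profile is still needed.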

Slice only a small time range

# keep only events between 5 and 6 seconds
$ nvprof_tools slice foo.sqlite -s 5.0 -e 6.0
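
Conceptually, slicing means deleting every event outside a time window. The sketch below shows one way to do that for a single table, under several assumptions that are not confirmed by this README: event tables have `start` and `end` columns holding nanosecond timestamps, the `-s`/`-e` values are measured from the start of the profile (`origin_ns`), and events that only partly overlap the window are dropped. `slice_table` is a hypothetical helper, not part of the package.

```python
import sqlite3

NS_PER_SEC = 10 ** 9


def slice_table(conn, table, start_sec, end_sec, origin_ns):
    """Keep only rows of `table` lying fully inside [start_sec, end_sec].

    origin_ns is the timestamp (in ns) treated as time zero; the column
    names "start" and "end" are assumptions about the schema.
    """
    lo = origin_ns + int(start_sec * NS_PER_SEC)
    hi = origin_ns + int(end_sec * NS_PER_SEC)
    # Table names cannot be bound as parameters, so quote the identifier.
    conn.execute(
        'DELETE FROM "{}" WHERE "start" < ? OR "end" > ?'.format(table),
        (lo, hi),
    )
    conn.commit()
```

Keeping partly overlapping events instead would only require changing the condition to `"end" < ? OR "start" > ?`.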

More information

