Single-host daemon that surfaces 'idle-held' NVIDIA GPU memory — the embarrassing category conventional dashboards miss.

Maintainers

ananda9503

gpu-usage-audit

A single-host diagnostic daemon that records NVIDIA GPU utilization to SQLite and produces a retrospective report that separates active use from allocated-but-idle ("idle-held") and truly idle (no process holding a meaningful amount of memory).

Conventional dashboards collapse the latter two. Surfacing idle-held as its own number is the entire point. Someone left a Jupyter notebook open with an 8 GB tensor on the GPU and went to lunch — nvidia-smi will show 1% utilization, but the card is unusable by anyone else. This tool measures that.

Status: v0.4.1. The gua entry point exposes the new auto-runtime command surface as safe placeholders. The existing commands are unchanged: daemon still runs on a real NVIDIA host, demo runs anywhere (no GPU required), and report reads a database written by either one. The Go v0.1.0 implementation remains downloadable at tag v0.1.0 / branch go-archive.

Install

The recommended install path is PyPI via uv. The package has no core runtime dependencies.

Requires uv. In normal online environments, uv creates the isolated tool environment and manages the needed Python runtime. If Python downloads are disabled by local policy, install Python 3.12+ first.

uv tool install gpu-usage-audit

gua doctor
gua start --dry-run
gpu-usage-audit demo

In v0.4.1 the gua commands are intentionally read-only placeholders: they print what is not implemented yet and make no system, service, cluster, or database changes. Use gpu-usage-audit daemon/report/demo for the existing compatibility workflow.

Available gua subcommands in v0.4.1: doctor, start, status, report, stop, and uninstall.

GitHub Release assets are also available for manual download:

BASE="https://github.com/AI-Ocean/gpu-usage-audit/releases/download/v0.4.1"
WHEEL="gpu_usage_audit-0.4.1-py3-none-any.whl"

curl -fsSLO "$BASE/$WHEEL"
curl -fsSLO "$BASE/SHA256SUMS"
sha256sum -c SHA256SUMS --ignore-missing

uvx --from "./$WHEEL" gua doctor
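
For hosts without sha256sum, the same check can be sketched in Python. This is a minimal illustration that assumes SHA256SUMS uses the standard sha256sum line format ("<hex digest>  <file name>"); the helper names sha256_of, parse_sums, and verify are hypothetical and are not part of gpu-usage-audit.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text: str) -> dict[str, str]:
    """Parse sha256sum-style lines into a {file name: hex digest} mapping."""
    sums: dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            # Binary-mode entries prefix the file name with '*'.
            sums[parts[1].strip().lstrip("*")] = parts[0].lower()
    return sums


def verify(wheel: Path, sums_file: Path) -> bool:
    """True only if the wheel is listed in the sums file and its digest matches."""
    expected = parse_sums(sums_file.read_text()).get(wheel.name)
    return expected is not None and expected == sha256_of(wheel)
```

Calling verify(Path(WHEEL), Path("SHA256SUMS")) with the file names from the commands above returns False both for a mismatched digest and for a wheel that is not listed at all, which is stricter than --ignore-missing.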

What you get

$ gpu-usage-audit report --db /var/lib/gua/gua.db --since 1h
gpu-usage-audit — lab-a100 (bare, driver 560.35.05)  Window: 1:00:00

§1 Headline
  █████████▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░
  active       █   15.7%
  idle-held    ▒   45.1%       ← this is the number conventional tools miss
  truly-idle   ░   39.2%
  (51 samples)

§2 Waste
  ~0.43 GPU-hours idle, ~2.53 GPUs equivalently unused

§3 Per-GPU
  GPU-0     active  47.1%  idle-held  35.3%  truly-idle  17.6%
  GPU-1     active   0.0%  idle-held 100.0%  truly-idle   0.0%
  GPU-2     active   0.0%  idle-held   0.0%  truly-idle 100.0%

§4 Top identities
  identity              gpu-hours   idle-held
  alice                      0.42       42.9%
  bob                        0.28      100.0%

§5 Time-of-day heatmap (UTC)
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3
  Mon               .

The three-segment bar in §1 collapses every card × every tick over the window into the active / idle-held / truly-idle split. idle-held is the embarrassing category: a process is holding GPU memory while SM utilization is below 10%.
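
As a rough sketch of how such a bar could be drawn from sample counts: the counts below (8 active, 23 idle-held, 20 truly-idle out of 51) are inferred from the percentages in the sample report, and render_bar is a hypothetical helper, not the tool's actual renderer.

```python
def render_bar(active: int, idle_held: int, truly_idle: int, width: int = 60) -> str:
    """Render one bar whose segments are proportional to the three counts."""
    total = active + idle_held + truly_idle
    if total == 0:
        return " " * width  # no samples in the window
    active_w = min(width, round(width * active / total))
    held_w = min(width - active_w, round(width * idle_held / total))
    idle_w = width - active_w - held_w  # the last segment absorbs rounding error
    return "█" * active_w + "▒" * held_w + "░" * idle_w


# Counts consistent with the sample report: 3 GPUs x 17 ticks = 51 samples.
counts = {"active": 8, "idle_held": 23, "truly_idle": 20}
total = sum(counts.values())

print(render_bar(**counts))
for name, n in counts.items():
    print(f"{name:<11} {100 * n / total:5.1f}%")
```

With these counts the segments come out 9, 27, and 24 characters wide, and the percentages print as 15.7, 45.1, and 39.2.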

Demo (no GPU required)

The demo subcommand records 30 ticks of fake telemetry and prints the report — all in one process, no second shell needed.

gpu-usage-audit demo

The bundled FakeTier produces a deterministic 5-tick workload — active learning → idle-held memory → cleanup — so the output is the same every run. Adjust the shape with --ticks N and --interval D.

Real NVIDIA GPU host

On an NVIDIA host, add the NVML Python bindings (nvidia-ml-py) to the tool environment and run daemon:

# Add the NVML Python package to the tool environment.
uv tool install --force --with nvidia-ml-py gpu-usage-audit

gpu-usage-audit daemon --db /tmp/gua.db --interval 30s

Run the report from another shell:

gpu-usage-audit report --db /tmp/gua.db --since 1h --interval 30s

The daemon requires the NVIDIA driver and libnvidia-ml.so.1. On a host without the driver it exits with the error "NVML Shared Library Not Found"; use demo there instead.

Usage

gpu-usage-audit has three commands sharing one SQLite file:

Command	What it does
`daemon`	Long-running background process. Samples real NVML telemetry on every tick and appends to the database. Stop with Ctrl+C (SIGINT) or `systemctl stop`. NVIDIA host required.
`report`	One-shot read against the accumulated database. Safe to run while the daemon is still writing — SQLite WAL mode handles the concurrency.
`demo`	Self-contained showcase. Records N fake ticks and immediately prints the report. No GPU, no second shell, no operational meaning — just to see the output shape.
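
The concurrency claim for report can be illustrated with Python's standard sqlite3 module. This is a sketch only: the samples table and its columns are hypothetical and do not reflect the tool's real schema.

```python
import sqlite3
import tempfile
from pathlib import Path

db_path = Path(tempfile.mkdtemp()) / "gua.db"

# Writer: stands in for the daemon. The table layout is hypothetical.
writer = sqlite3.connect(db_path)
mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
writer.execute("CREATE TABLE samples (tick INTEGER, gpu INTEGER, util_pct REAL)")
writer.commit()

# Reader: stands in for `report`, opened while the writer is still connected.
reader = sqlite3.connect(db_path)

writer.execute("INSERT INTO samples VALUES (1, 0, 3.0)")
writer.commit()

# Begin another write and leave its transaction open.
writer.execute("INSERT INTO samples VALUES (2, 0, 97.0)")

# The reader is not blocked and sees only the committed row.
visible = reader.execute("SELECT COUNT(*) FROM samples").fetchone()[0]
print(mode, visible)  # wal 1

writer.commit()
reader.close()
writer.close()
```

In WAL mode a reader works from a snapshot of the last committed state, so an in-flight daemon write neither blocks the report nor shows up half-finished in it.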

`daemon`

gpu-usage-audit daemon --db PATH [--interval D]

--db PATH — SQLite file to write to. Created if missing. WAL mode enabled automatically.
--interval D (default 30s) — how often to sample. Accepts 30s, 1m, 200ms, etc.

Each tick prints a one-line summary to stdout; on shutdown the cumulative row count is printed.

`report`

gpu-usage-audit report --db PATH [--since D] [--interval D] [--width N]

--db PATH — same SQLite file the daemon writes to.
--since D (default 1h) — the report window. No upper bound — --since 365d is accepted. The effective window is min(--since, age of oldest sample), so passing a huge --since is the same as "all data". Units: ms, s, m, h, d (no w; use 7d).
--interval D (default 30s) — must match what the daemon used. This is how §2 (Waste) and §4 (Top identities) convert tick counts to GPU-hours. Mismatched intervals → wrong GPU-hours.
--width N (default 60) — width of the §1 three-bar in characters.
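
The duration units and the window rule above can be sketched as follows. This is an assumption-laden illustration: parse_duration and effective_window are hypothetical names, and whether the real parser accepts fractional or compound values such as 1h30m is not stated in this document.

```python
import re
from datetime import timedelta

# Units listed for --since: ms, s, m, h, d (no w).
_UNITS = {"ms": "milliseconds", "s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
_PATTERN = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h|d)")


def parse_duration(text: str) -> timedelta:
    """Parse a single '<number><unit>' duration such as 30s, 1m, or 200ms."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: float(value)})


def effective_window(since: timedelta, oldest_sample_age: timedelta) -> timedelta:
    """The report window is min(--since, age of oldest sample)."""
    return min(since, oldest_sample_age)


# A huge --since collapses to "all data" when the database is only 3 hours old.
print(effective_window(parse_duration("365d"), timedelta(hours=3)))  # 3:00:00
```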

`demo`

gpu-usage-audit demo [--db PATH] [--ticks N] [--interval D]

--db PATH (optional) — if omitted, a fresh temporary database is created and its path is printed to stderr.
--ticks N (default 30) — how many fake ticks to record before printing the report.
--interval D (default 1s) — tick spacing.

Operational notes

Same --interval on both sides. If you ran the daemon with --interval 30s, run report --interval 30s too.
Let it run for a while. §1/§3 are meaningful after one tick; §4 (Top identities) needs hours; §5 (Heatmap) needs days.
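WAL leaves sidecar files (gua.db-wal, gua.db-shm). They are removed automatically when the last connection closes cleanly. If they remain after a crash, leave them in place: SQLite uses them to recover on the next open.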
DB size: ~50 MB per host per 30 days at 12 GPUs (extrapolated from Go v0.1.0; not yet re-measured for the Python rewrite).
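
Why the interval has to match can be seen from the arithmetic. The sketch below assumes GPU-hours are computed as sample count × interval, which is how the --interval option is described above; gpu_hours is a hypothetical helper.

```python
def gpu_hours(samples: int, interval_seconds: float) -> float:
    """Convert a count of per-GPU samples into GPU-hours."""
    return samples * interval_seconds / 3600.0


recorded = 120  # samples for one GPU, collected by a daemon running at 30s

print(gpu_hours(recorded, 30))  # 1.0  (matching interval)
print(gpu_hours(recorded, 60))  # 2.0  (report run with 1m: doubled)
```

The database stores samples, not durations, so the report has no way to detect the mismatch on its own.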

Running as a systemd service

For a long-running deployment, drop a unit file in /etc/systemd/system/gpu-usage-audit.service:

[Unit]
Description=gpu-usage-audit daemon
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/gpu-usage-audit daemon --db /var/lib/gua/gua.db --interval 30s
Restart=on-failure
User=gua

[Install]
WantedBy=multi-user.target

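Then run systemctl daemon-reload followed by systemctl enable --now gpu-usage-audit. The gua user must exist and be able to write to /var/lib/gua.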

How the classification works

Each tick of the daemon records:

per-card: util_pct (SM utilization)
per-process: mem_used_mb per (card, pid)

The report aggregates per card × per tick:

util >= 10                  → active        (compute is happening)
util <  10 AND mem >  100   → idle-held     (memory is held, SM is cold)
util <  10 AND mem <= 100   → truly-idle    (the card is genuinely free)

The 100 MB threshold absorbs the PyTorch/TF runtime baseline so importing torch doesn't count as "holding the GPU".
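
The three rules translate directly into a small function. This is a sketch rather than the tool's source: the names are hypothetical, and it assumes mem_used_mb is the per-card total obtained by summing the per-process values for that tick.

```python
from enum import Enum

UTIL_ACTIVE_PCT = 10.0  # SM utilization at or above this counts as active
MEM_HELD_MB = 100.0     # memory above this counts as held


class State(Enum):
    ACTIVE = "active"
    IDLE_HELD = "idle-held"
    TRULY_IDLE = "truly-idle"


def classify(util_pct: float, mem_used_mb: float) -> State:
    """Classify one card at one tick using the thresholds above."""
    if util_pct >= UTIL_ACTIVE_PCT:
        return State.ACTIVE
    if mem_used_mb > MEM_HELD_MB:
        return State.IDLE_HELD
    return State.TRULY_IDLE


# The notebook-at-lunch case: 1% utilization with an 8 GB tensor resident.
print(classify(1.0, 8192.0).value)  # idle-held
```

Both comparisons follow the table exactly: utilization of exactly 10 is active, and memory of exactly 100 MB is still truly-idle.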

Development

Requires uv (uv pins the Python version automatically; requires-python = ">=3.12").

git clone https://github.com/AI-Ocean/gpu-usage-audit
cd gpu-usage-audit
uv sync                          # create .venv, install dev deps
uv run pytest                    # run the test suite
uv run ruff check                # lint
uv run mypy                      # type-check (strict)
uv run gpu-usage-audit demo      # see the report shape locally

CI runs ruff + format check + mypy + pytest, then builds and smoke-tests the wheel on every push and PR. Tag pushes (v*) rerun the same checks, build sdist + wheel, smoke-test the wheel, and create a GitHub Release with auto-generated notes. Release tags also publish the wheel and sdist to PyPI through Trusted Publishing.

Non-goals

This is a single-host retrospective tool. Live dashboards, multi-host aggregation, quotas, and pod-name resolution are out of scope — those belong above the host layer. If this tool surfaces enough idle-held to make scheduling worth solving, see ocean-all.

License

Apache License 2.0 — see LICENSE.

Project details

Release history

1.0.2 (May 15, 2026)
1.0.1 (May 15, 2026)
1.0.0 (May 15, 2026)
0.4.1 (May 14, 2026), this version

Download files

Source Distribution

gpu_usage_audit-0.4.1.tar.gz (89.8 kB), uploaded May 14, 2026, Source

Built Distribution

gpu_usage_audit-0.4.1-py3-none-any.whl (39.3 kB), uploaded May 14, 2026, Python 3

File details

Details for the file gpu_usage_audit-0.4.1.tar.gz.

File metadata

Download URL: gpu_usage_audit-0.4.1.tar.gz
Upload date: May 14, 2026
Size: 89.8 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gpu_usage_audit-0.4.1.tar.gz
Algorithm	Hash digest
SHA256	`d390e3d6260969775e49f81168e9710deb46efac597ba891245279b54b5653ad`
MD5	`dc95697e201b921a1ed9f67458062a86`
BLAKE2b-256	`d58ff328520771ef667a4f97dedc2c17c59a74b84279703ea9bb84c4e2bc58bf`

Provenance

The following attestation bundles were made for gpu_usage_audit-0.4.1.tar.gz:

Publisher: release.yml on AI-Ocean/gpu-usage-audit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: gpu_usage_audit-0.4.1.tar.gz
- Subject digest: d390e3d6260969775e49f81168e9710deb46efac597ba891245279b54b5653ad
- Sigstore transparency entry: 1532351962
- Sigstore integration time: May 14, 2026
Source repository:
- Permalink: AI-Ocean/gpu-usage-audit@140500de4280f38bb3d098c1de56c704c6ca3d1e
- Branch / Tag: refs/tags/v0.4.1
- Owner: https://github.com/AI-Ocean
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@140500de4280f38bb3d098c1de56c704c6ca3d1e
- Trigger Event: push

File details

Details for the file gpu_usage_audit-0.4.1-py3-none-any.whl.

File metadata

Download URL: gpu_usage_audit-0.4.1-py3-none-any.whl
Upload date: May 14, 2026
Size: 39.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gpu_usage_audit-0.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ac7eaa48ed5f90c0be3e14dcb0ac61751e242dcdf906d53eabc9bf09556b8b86`
MD5	`7b68ce538bd761dae64909132490ae27`
BLAKE2b-256	`b93a659380d351aaa693641129fe95f4d517ebba59fb78339ff751aa44b088c9`

Provenance

The following attestation bundles were made for gpu_usage_audit-0.4.1-py3-none-any.whl:

Publisher: release.yml on AI-Ocean/gpu-usage-audit

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: gpu_usage_audit-0.4.1-py3-none-any.whl
- Subject digest: ac7eaa48ed5f90c0be3e14dcb0ac61751e242dcdf906d53eabc9bf09556b8b86
- Sigstore transparency entry: 1532352047
- Sigstore integration time: May 14, 2026
Source repository:
- Permalink: AI-Ocean/gpu-usage-audit@140500de4280f38bb3d098c1de56c704c6ca3d1e
- Branch / Tag: refs/tags/v0.4.1
- Owner: https://github.com/AI-Ocean
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@140500de4280f38bb3d098c1de56c704c6ca3d1e
- Trigger Event: push
