jetfit

LLM model advisor for NVIDIA Jetson and DGX Spark unified-memory devices.

Detects your Jetson hardware, scores LLM models across quality, speed, and memory fit, and tells you exactly which quantization level will run well on your device.

Ships with an interactive TUI (default) and a CLI mode. Supports hardware simulation, calibration, compare view, and plan mode.

Install

pip install jetfit

or with uv:

uv tool install jetfit   # install globally
uvx jetfit               # run without installing

Usage

TUI (default)

jetfit

Launches the interactive terminal UI. The top bar shows your detected platform, available RAM, accelerator type, and minimum JetPack version. Models are listed in a scrollable table sorted by params, with composite score, estimated tok/s, best quantization, memory %, and fit grade per row.

Normal mode

Key	Action
`j` / `k`	Navigate models
`g`	Jump to top / bottom (toggle)
`Enter`	Open detail view
`p`	Open plan mode
`m`	Mark / unmark model for compare
`c`	Open compare view (marked vs selected)
`x`	Clear all marks
`v`	Enter visual select mode
`/`	Focus search bar
`r`	Cycle provider (family) filter
`b`	Cycle size filter
`f`	Cycle fit filter
`s`	Cycle sort column
`-`	Flip sort direction
`F`	Open advanced filter popup
`S`	Open hardware simulation
`A`	Open advanced config (tune efficiency)
`t`	Cycle theme
`h`	Open help
`q`	Quit

Visual mode (`v`)

Select a contiguous range of models for bulk comparison.

Key	Action
`j` / `k`	Extend selection
`m`	Mark selected model
`c`	Open compare view for selection
`v` / `Esc`	Exit visual mode

Detail view (`Enter`)

Shows the full quant ladder for the selected model: size, KV cache, total memory, memory %, estimated tok/s, and fit grade for every quantization level. Navigate rows with `j` / `k`; the left panel updates to show specs for the highlighted quant.

Plan mode (`p`)

Estimates hardware requirements for a model config. Edit Context, Quant, and Target TPS fields. Shows minimum and recommended RAM, feasibility per run path, and upgrade deltas.

Key	Action
`Tab` / `j` / `k`	Move between fields
Type	Edit current field
`Backspace`	Remove characters
`Esc` / `q`	Exit plan mode

Compare view (`c`)

Side-by-side comparison of marked models. Rows are attributes (Score, tok/s, Fit, Mem%, Params, Quant, Context); columns are models. Best values are highlighted.

Hardware simulation (`S`)

Override the active hardware profile to preview recommendations for any supported Jetson or DGX Spark device without leaving the TUI. The system bar shows (sim) when active.

Advanced config (`A`)

Tune the efficiency factor used for tok/s estimation. Changes apply immediately and all scores are recalculated.

Advanced filter (`F`)

Set numeric bounds on parameter count and memory utilization %.

CLI

# Detect hardware
jetfit system

# Detect hardware (JSON)
jetfit system --json

# Recommend models for current hardware
jetfit recommend

# Filter by model name
jetfit recommend --model llama

# Fix a specific quant level
jetfit recommend --quant Q4_K_M

# Show all quant levels per model
jetfit recommend --all-quants

# Override available memory
jetfit recommend --available-gb 12.0

# Target a specific hardware profile
jetfit recommend --profile jetson_agx_orin_64gb

# Minimum tok/s threshold
jetfit recommend --min-tps 5.0

# JSON output
jetfit recommend --json
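
For scripting, the `--json` variants above can be consumed from Python. The helper below is a minimal sketch and not part of jetfit: `jetfit_json` and its `run` parameter are hypothetical names, and because the JSON schema is not documented here, the parsed result is returned as-is rather than unpacked into named fields.

```python
import json
import subprocess
from typing import Any, Callable, Sequence


def jetfit_json(args: Sequence[str], run: Callable[..., Any] = subprocess.run) -> Any:
    """Run `jetfit <args> --json` and return the parsed JSON document.

    `run` is injectable so the helper can be exercised without jetfit installed.
    """
    cmd = ["jetfit", *args, "--json"]
    # check=True raises CalledProcessError if jetfit exits with a non-zero status.
    proc = run(cmd, capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

Calling `jetfit_json(["system"])` runs `jetfit system --json`, and `jetfit_json(["recommend", "--min-tps", "5.0"])` runs the filtered recommendation.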

Supported Hardware

Device	RAM	Bandwidth	Accelerator	JetPack
Jetson Nano	4 GB	25.6 GB/s	CUDA	4.x
Jetson TX2 NX	4 GB	51.2 GB/s	CUDA	4.x
Jetson TX2 4GB	4 GB	51.2 GB/s	CUDA	4.x
Jetson TX2	8 GB	59.7 GB/s	CUDA	4.x
Jetson TX2i	8 GB	51.2 GB/s	CUDA	4.x
Jetson Xavier NX 8GB	8 GB	59.7 GB/s	DLA+CUDA	5.x
Jetson Xavier NX 16GB	16 GB	59.7 GB/s	DLA+CUDA	5.x
Jetson AGX Xavier 16GB	16 GB	136.5 GB/s	DLA+CUDA	5.x
Jetson AGX Xavier 32GB	32 GB	136.5 GB/s	DLA+CUDA	5.x
Jetson AGX Xavier 64GB	64 GB	136.5 GB/s	DLA+CUDA	5.x
Jetson AGX Xavier Industrial	32 GB	136.5 GB/s	DLA+CUDA	5.x
Jetson Orin Nano 4GB	4 GB	51.2 GB/s	CUDA	6.x
Jetson Orin Nano 8GB	8 GB	102.4 GB/s	CUDA	6.x
Jetson Orin NX 8GB	8 GB	102.4 GB/s	DLA+CUDA	6.x
Jetson Orin NX 16GB	16 GB	102.4 GB/s	DLA+CUDA	6.x
Jetson AGX Orin 32GB	32 GB	204.8 GB/s	DLA+CUDA	6.x
Jetson AGX Orin 64GB	64 GB	204.8 GB/s	DLA+CUDA	6.x
Jetson AGX Orin Industrial	64 GB	204.8 GB/s	DLA+CUDA	6.x
Jetson AGX Thor T4000	64 GB	273 GB/s	FP4+CUDA	7.x
Jetson AGX Thor T5000	128 GB	273 GB/s	FP4+CUDA	7.x
DGX Spark (GB10)	128 GB	273 GB/s	FP4+CUDA	—

On macOS or non-Jetson Linux dev machines, jetfit runs in simulation mode; pick any profile with `S` to preview recommendations.
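
The simulation fallback can be pictured as a check on the device-tree model file. This is a sketch under assumptions, not jetfit's actual detection code: both function names are hypothetical, and the rule "simulate unless the model string mentions Jetson" is a simplification that ignores DGX Spark and the other sources jetfit consults.

```python
from pathlib import Path
from typing import Optional


def read_device_tree_model(path: str = "/proc/device-tree/model") -> Optional[str]:
    """Return the device-tree model string, or None if it cannot be read."""
    try:
        raw = Path(path).read_bytes()
    except OSError:
        return None  # no readable device tree, e.g. macOS or most x86 Linux hosts
    # Device-tree strings are NUL-terminated, so strip the trailing byte.
    text = raw.rstrip(b"\x00").decode("utf-8", errors="replace").strip()
    return text or None


def should_simulate(model: Optional[str]) -> bool:
    """Hypothetical rule: simulate unless the model string mentions Jetson."""
    return model is None or "jetson" not in model.lower()
```

On a development laptop `read_device_tree_model()` returns `None`, so `should_simulate` is true and a profile would have to be picked by hand.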

How it works

Hardware detection — Reads device-tree model and compatible strings (/proc/device-tree/), tegra release (/etc/nv_tegra_release), and available RAM via tegrastats, jtop, or /proc/meminfo (priority order). On non-Jetson machines, falls back to simulation mode with a selectable profile.
Model database — 67 models embedded directly in fit.py. Each entry has a parameter count and real context length sourced from HuggingFace. Memory requirements are computed across a 6-level quantization ladder (Q8_0 through Q2_K) using per-quant bytes-per-parameter values that account for k-quant codebook overhead.
KV cache accounting — Memory estimates include an fp16 KV cache of 0.000008 × params_b × 4096 GB (params_b is the parameter count in billions, 4096 the context length in tokens) plus 0.5 GB of runtime overhead, so "fits" means the model will actually load at a typical 4K inference context.
FP4 halving — On devices with FP4 support (Thor, DGX Spark), effective model size is halved before all memory and speed calculations.
Fit levels — Based on (weights + KV cache + overhead) / available_memory:

Level	Utilization
Perfect	≤ 70%
Good	71–90%
Marginal	91–100%
TooTight	> 100%
Speed estimation — Token generation is memory-bandwidth-bound. Estimated tok/s:

(bandwidth_GB_s / effective_size_GB) × efficiency × quant_speed_multiplier

Default efficiency is 0.50–0.55 per profile, tunable via A. Quant multipliers range from 1.00× (Q8_0) to 1.80× (Q2_K).
Composite score — Each model gets a 0–100 score combining normalized speed (45%), fit level (35%), and quantization quality (20%). Used for sorting and the score column.
Calibration — A per-profile efficiency factor can be saved to ~/.config/jetfit/calibration.json. When present, the system bar shows a ✓ cal badge and all speed estimates use the measured value instead of the profile default.
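
The memory accounting and fit levels described above can be worked through in a few lines. This is a sketch, not the code in fit.py: the function names are hypothetical, the 0.6 bytes-per-parameter figure in the example is purely illustrative (the per-quant values are not listed here), and applying the FP4 halving to the weights only is one reading of "effective model size".

```python
KV_GB_PER_BPARAM_TOKEN = 0.000008  # fp16 KV cache factor from the formula above
CONTEXT_TOKENS = 4096              # typical 4K inference context
RUNTIME_OVERHEAD_GB = 0.5


def total_memory_gb(params_b: float, bytes_per_param: float, fp4: bool = False) -> float:
    """Weights + KV cache + runtime overhead, in GB (params_b is in billions)."""
    weights_gb = params_b * bytes_per_param  # 1e9 params x bytes/param = GB
    if fp4:
        weights_gb /= 2  # assumption: FP4 halving applies to the weights only
    kv_gb = KV_GB_PER_BPARAM_TOKEN * params_b * CONTEXT_TOKENS
    return weights_gb + kv_gb + RUNTIME_OVERHEAD_GB


def fit_level(total_gb: float, available_gb: float) -> str:
    """Map memory utilization onto the four fit levels."""
    pct = 100.0 * total_gb / available_gb
    if pct <= 70:
        return "Perfect"
    if pct <= 90:
        return "Good"
    if pct <= 100:
        return "Marginal"
    return "TooTight"


# An 8B model at an illustrative 0.6 bytes/param on a device with 8 GB available:
total = total_memory_gb(8, 0.6)   # 4.8 + 0.262144 + 0.5
print(round(total, 2))            # 5.56
print(fit_level(total, 8.0))      # Perfect (about 69.5% utilization)
```

With only 6 GB available the same model lands at roughly 93% utilization and would be graded Marginal.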

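
The speed formula and the composite weighting from "How it works" can be sketched the same way. The function names are hypothetical, and how jetfit normalizes speed, fit level, and quantization quality before weighting is not stated, so `composite_score` simply assumes each component has already been scaled to the range 0 to 1.

```python
SPEED_WEIGHT, FIT_WEIGHT, QUALITY_WEIGHT = 0.45, 0.35, 0.20


def estimate_tok_s(bandwidth_gb_s: float, effective_size_gb: float,
                   efficiency: float = 0.50, quant_multiplier: float = 1.00) -> float:
    """(bandwidth / effective size) x efficiency x quant speed multiplier."""
    if effective_size_gb <= 0:
        raise ValueError("effective_size_gb must be positive")
    return bandwidth_gb_s / effective_size_gb * efficiency * quant_multiplier


def composite_score(speed_norm: float, fit_norm: float, quality_norm: float) -> float:
    """Weighted 0-100 score; inputs are assumed to be pre-normalized to [0, 1]."""
    for value in (speed_norm, fit_norm, quality_norm):
        if not 0.0 <= value <= 1.0:
            raise ValueError("components must lie in [0, 1]")
    return 100.0 * (SPEED_WEIGHT * speed_norm
                    + FIT_WEIGHT * fit_norm
                    + QUALITY_WEIGHT * quality_norm)


# 204.8 GB/s (AGX Orin), an 8.0 GB effective model size, Q8_0 multiplier 1.00:
print(estimate_tok_s(204.8, 8.0))                 # 12.8
print(round(composite_score(0.5, 1.0, 0.6), 1))   # 69.5
```

Because the estimate is linear in the efficiency factor, raising it from 0.50 to 0.55 raises every tok/s estimate by 10%.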

Project structure

jetfit/
  __init__.py      -- version
  cli.py           -- Click CLI entry point, TUI launch
  hardware.py      -- Jetson/DGX hardware detection
  profiles.py      -- Hardware profile database (22 devices)
  fit.py           -- Scoring engine, quantization ladder, model catalog
  tui.py           -- Textual TUI (app state, rendering, keyboard events)
tests/
  test_hardware.py -- Hardware detection and TUI markup regression tests
  test_fit.py      -- Scoring engine unit tests
  test_calibration.py
  test_ros2.py
pyproject.toml
LICENSE

Dependencies

Package	Purpose
`click`	CLI argument parsing
`rich`	CLI table and colored output
`textual`	Terminal UI framework

License

MIT

Release history

Version	Release date
0.1.4	May 22, 2026
0.1.3	May 22, 2026
0.1.2	May 22, 2026
0.1.1 (this version)	May 22, 2026
0.1.0	May 22, 2026

Download files

Source Distribution

jetfit-0.1.1.tar.gz (36.7 kB)

Uploaded May 22, 2026 Source

Built Distribution

jetfit-0.1.1-py3-none-any.whl (31.7 kB)

Uploaded May 22, 2026 Python 3

File details

Details for the file jetfit-0.1.1.tar.gz.

File metadata

Download URL: jetfit-0.1.1.tar.gz
Upload date: May 22, 2026
Size: 36.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for jetfit-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`6277c595b883c4e187c06eafc24d182b5005c61c8db61800c8dcde5419acf89c`
MD5	`5e18e23e1f34da7d863a1435b3de39c9`
BLAKE2b-256	`61c68fa68478e7b0805fd2a8823b2b7308408da41ab7e8a5c39c2741eff8b68b`

File details

Details for the file jetfit-0.1.1-py3-none-any.whl.

File metadata

Download URL: jetfit-0.1.1-py3-none-any.whl
Upload date: May 22, 2026
Size: 31.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for jetfit-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9a5c12ec61fb2b3e8bdeae7932ae1de85d26ce53ac3f312879c0bf7870fc1544`
MD5	`fc7546ca85b724d8437c17d8a56783e9`
BLAKE2b-256	`fc5bdb7ff893a21396c655ce1532e0fc6e95d08dfb48661534074a1230e449db`
