Model-agnostic edge deployment analysis framework — PLE memory analysis, TurboQuant/RotorQuant KV compression, mmap profiling, and LoRA fine-tuning for memory-constrained devices.

Maintainers

smarthi

Project description

dhurandhar — धुरंधर

dhura (धुर, burden) + dhara (धर, one who bears)

"Bearer of burdens" — a model-agnostic framework for deploying large multimodal models on memory-constrained edge devices where they have no right to survive.

What it does

Given a model architecture and a target device, dhurandhar answers the questions that matter before you ship:

Module	Question answered
PLE Analysis	What is the true peak memory footprint at context length N?
Device Feasibility	Will this model run resident, mmap'd, or not at all on this device?
TurboQuant Sweep	What is the quality/memory tradeoff at 2/3/4/6/8-bit KV compression?
RotorQuant Comparison	TurboQuant vs RotorQuant — quality vs arithmetic cost?
Mmap Profiler	What is the real mmap throughput and peak RSS on this hardware?

All five analyses are exposed as a CLI, a Python API, and a 5-tab Gradio dashboard.

Why this exists

Gemma 4 E2B's "< 1.5 GB RAM" deployment story depends on memory-mapping the Per-Layer Embedding (PLE) table from flash. On the LiteRT-LM E2B checkpoint, PLE is 1.12 GB — larger than the 0.79 GB text decoder. Whether mmap'd PLE sustains acceptable decode throughput on your target silicon is the single highest-risk item in any edge deployment plan.

dhurandhar lets you:

- Predict memory feasibility per device profile before hardware arrives
- Measure TurboQuant KV cache compression quality against Gemma 4's hybrid-attention architecture (shared KV + GQA + sliding window)
- Fine-tune LoRA adapters on the frozen-PLE base model via QLoRA

Installation

# Core (analysis + CLI)
uv add dhurandhar

# With interactive dashboard
uv add "dhurandhar[dashboard]"

# With GPU support (flash-attn, Linux only)
uv add "dhurandhar[gpu]"

Quickstart

PLE memory footprint + device feasibility

dhurandhar-analyze-ple --context-tokens 32768 --quant-bits 4

Component                  Size      Notes
-------------------------  --------  -------------------------
Text decoder weights       809 MB    Q4
PLE embedding table        1,147 MB  Q4
KV cache @ 32,768 tokens   138 MB    shared + GQA + TurboQuant
Vision encoder             150 MB    bf16
Audio encoder (STRIPPED)   0 MB      bf16
...
Total (PLE resident): 2,404 MB
Total (PLE mmap'd):  1,321 MB
PLE/Decoder ratio:   1.42x

[low_tier_mobile_emmc] Low-tier Mobile (eMMC 5.1)
  RAM budget:    1024 MB
  Mode:        infeasible
  Notes:       Insufficient RAM even with mmap. Short by 297 MB.

[laptop_nvme] Laptop (NVMe PCIe 4.0)
  RAM budget:    8192 MB
  Mode:        resident
  Notes:       PLE fits resident with 5788 MB headroom.
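
The verdicts in the sample report follow from simple budget arithmetic. The sketch below reproduces them from the printed totals; it is an illustration, not the code in ple_analysis.py. The `Device` class and `classify` function are hypothetical names, the wording of the "mmap" note is invented for the example, and the real tool may apply safety margins that this version ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for a device profile; only the RAM budget is used."""
    name: str
    ram_budget_mb: int


def classify(total_resident_mb: int, total_mmap_mb: int, device: Device) -> tuple[str, str]:
    """Return (mode, note) for one device, mirroring the report's three verdicts."""
    if total_resident_mb <= device.ram_budget_mb:
        headroom = device.ram_budget_mb - total_resident_mb
        return "resident", f"PLE fits resident with {headroom} MB headroom."
    if total_mmap_mb <= device.ram_budget_mb:
        headroom = device.ram_budget_mb - total_mmap_mb
        # Note text for this branch is an assumption; the sample report shows no mmap case.
        return "mmap", f"Fits only with PLE mmap'd; {headroom} MB headroom."
    shortfall = total_mmap_mb - device.ram_budget_mb
    return "infeasible", f"Insufficient RAM even with mmap. Short by {shortfall} MB."


# Totals copied from the sample report above.
TOTAL_RESIDENT_MB = 2404
TOTAL_MMAP_MB = 1321

for dev in (Device("Low-tier Mobile (eMMC 5.1)", 1024),
            Device("Laptop (NVMe PCIe 4.0)", 8192)):
    mode, note = classify(TOTAL_RESIDENT_MB, TOTAL_MMAP_MB, dev)
    print(f"{dev.name}: {mode} ({note})")

# PLE/decoder ratio from the component table: 1,147 MB / 809 MB.
print(f"PLE/Decoder ratio: {1147 / 809:.2f}x")
```

With the totals above, the 1024 MB device comes out 297 MB short and the 8192 MB device has 5788 MB of headroom, matching the report.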

TurboQuant KV cache compression

dhurandhar-benchmark-kv --seq-len 32768 --residual-bits 4

Quality (synthetic KV reconstruction):
  Cosine similarity:   0.9972
  Compression ratio:   4.57x vs bf16
  Fresh-KV layers:     24
  Shared-KV layers:    6 (skipped)
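
To give a feel for what "synthetic KV reconstruction" measures, here is a minimal pure-Python sketch: a random sign flip, an orthonormal Hadamard rotation, a uniform quantizer, and a cosine-similarity check on the round trip. It is not the codec in turboquant.py; the residual stage is not modelled, the quantizer is a plain symmetric grid chosen for the example, and all function names are hypothetical.

```python
import math
import random


def fwht(x: list[float]) -> list[float]:
    """Orthonormal fast Walsh-Hadamard transform; len(x) must be a power of two."""
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    out = list(x)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in out]


def quantize_roundtrip(x: list[float], bits: int, signs: list[float]) -> list[float]:
    """Sign-flip, rotate, quantize uniformly to `bits` bits (bits >= 2), then invert."""
    rotated = fwht([s * v for s, v in zip(signs, x)])
    peak = max(abs(v) for v in rotated) or 1.0
    levels = 2 ** (bits - 1) - 1  # symmetric signed grid
    codes = [round(v / peak * levels) for v in rotated]
    dequant = [c / levels * peak for c in codes]
    # The orthonormal Hadamard matrix is its own inverse, and so is the sign flip.
    return [s * v for s, v in zip(signs, fwht(dequant))]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)
head_dim = 256  # assumed power-of-two head dimension
signs = [rng.choice((-1.0, 1.0)) for _ in range(head_dim)]
vec = [rng.gauss(0.0, 1.0) for _ in range(head_dim)]  # synthetic "KV" vector

for bits in (2, 3, 4, 6, 8):
    sim = cosine(vec, quantize_roundtrip(vec, bits, signs))
    print(f"{bits}-bit: cosine similarity {sim:.4f}")
```

The numbers this prints will not match the tool's output, since the quantizer and the data differ; the point is only the shape of the measurement.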

Real mmap decode throughput

# Quick run — small test file, ~15s
dhurandhar-profile-mmap --scale 0.1 --num-tokens 1000 --target-tps 15

# Full-fidelity — ~1 GB test file, realistic cold-mmap numbers
dhurandhar-profile-mmap --scale 1.0 --num-tokens 5000 --measure-memory
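
For readers who want to see what such a probe involves, the following is a rough standard-library sketch of a random-row mmap read benchmark with a peak-RSS reading. It is not the implementation in mmap_profiler.py. The function name, the 4 KiB row size, and the "one row read per token" simplification are all assumptions made for the example.

```python
import mmap
import os
import random
import resource
import tempfile
import time


def profile_mmap_reads(path: str, row_bytes: int, num_tokens: int, seed: int = 0) -> float:
    """Read `num_tokens` random rows through mmap and return rows read per second."""
    num_rows = os.path.getsize(path) // row_bytes
    if num_rows == 0:
        raise ValueError("file is smaller than one row")
    rng = random.Random(seed)
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        start = time.perf_counter()
        for _ in range(num_tokens):
            offset = rng.randrange(num_rows) * row_bytes
            _ = mm[offset:offset + row_bytes]  # slicing copies the bytes, touching the pages
        elapsed = time.perf_counter() - start
    return num_tokens / elapsed if elapsed > 0 else float("inf")


# Small synthetic file standing in for an embedding table (assumed 4 KiB rows).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(4 * 1024 * 1024))
    table_path = tmp.name

try:
    rows_per_s = profile_mmap_reads(table_path, row_bytes=4096, num_tokens=1000)
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print(f"throughput: {rows_per_s:,.0f} rows/s, peak RSS (platform units): {peak_rss}")
finally:
    os.unlink(table_path)
```

A freshly written file usually sits in the page cache, so this measures warm reads. Cold-mmap numbers need a file larger than available cache or an explicit cache drop, which is presumably why the full-fidelity run above uses a ~1 GB test file.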

Codec comparison: TurboQuant vs RotorQuant

dhurandhar-compare-codecs --head-dim 256 --residual-bits 2,3,4,6,8

LoRA fine-tuning

# Dry run — confirm adapter attachment without training
dhurandhar-train-lora --config configs/gemma4_lora.yaml --dry-run

# Real training (requires GPU + HF_TOKEN)
HF_TOKEN=hf_... dhurandhar-train-lora --config configs/gemma4_lora.yaml
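
When checking a dry run, it helps to know roughly how many trainable parameters to expect. A LoRA adapter on a linear layer of shape (d_out x d_in) adds two matrices, A (rank x d_in) and B (d_out x rank). The sketch below counts them; the layer shapes and rank are hypothetical placeholders, not values read from configs/gemma4_lora.yaml or from the Gemma 4 architecture.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted linear layer: A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


def lora_summary(shapes: list[tuple[int, int]], rank: int) -> tuple[int, int, float]:
    """Return (adapter_params, frozen_params, adapter/frozen ratio) over the adapted layers."""
    frozen = sum(d_in * d_out for d_in, d_out in shapes)
    adapter = sum(lora_param_count(d_in, d_out, rank) for d_in, d_out in shapes)
    return adapter, frozen, adapter / frozen


# Hypothetical example: four square 2048 x 2048 projections adapted at rank 8.
example_shapes = [(2048, 2048)] * 4
adapter, frozen, ratio = lora_summary(example_shapes, rank=8)
print(f"adapter params: {adapter:,}  frozen params: {frozen:,}  ratio: {ratio:.4%}")
```

Substituting the real projection shapes and the rank from your config gives a figure to compare against whatever the dry run reports.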

Interactive dashboard (5 tabs)

uv sync --extra dashboard
dhurandhar-dashboard
dhurandhar-dashboard --server-name 0.0.0.0 --port 7860  # LAN access

Five tabs:

📊 PLE Memory Analysis — component breakdown + stacked bar chart vs 1.5 GB target
📱 Device Feasibility — resident 🟢 / mmap 🟡 / infeasible 🔴 verdicts + custom device
🗜️ TurboQuant KV — quality sweep across residual bits + memory savings estimate
⚡ Mmap Profiler — real mmap throughput + peak RSS vs deployment budget
🔄 TurboQuant vs RotorQuant — quality sweep + stage-1 arithmetic cost comparison

Custom model architecture

Override the architectural constants in config.py, or check them against a live checkpoint:

from transformers import AutoConfig
cfg = AutoConfig.from_pretrained("google/gemma-4-E2B")
print(cfg.num_hidden_layers, cfg.hidden_size, cfg.num_key_value_heads)

Custom device profile

Pass your own device spec directly or as a YAML file:

from dhurandhar.config import DeploymentProfile, DEVICE_PROFILES
DEVICE_PROFILES["my_device"] = DeploymentProfile(
    name="My Target Device",
    ram_budget_mb=2048,
    flash_read_gbps=3.5,
    supports_npu=True,
)

Project structure

src/dhurandhar/
├── config.py          # Gemma 4 E2B constants + device profiles
├── ple_analysis.py    # PLE memory math + device feasibility (analytical)
├── mmap_profiler.py   # Real mmap throughput + peak RSS probe (empirical)
├── turboquant.py      # TurboQuant codec (Hadamard + sign + residual)
├── rotorquant.py      # RotorQuant codec (blockwise 3D Clifford rotors)
├── finetune.py        # QLoRA training pipeline + audio-encoder strip
├── dashboard.py       # Gradio 5-tab dashboard
└── cli.py             # Click-based CLI entry points

Testing

uv run pytest                    # all tests, ~15s
uv run pytest tests/test_turboquant.py -v
uv run pytest tests/test_rotorquant.py -v
uv run pytest tests/test_ple_analysis.py -v
uv run pytest tests/test_mmap_profiler.py -v
uv run pytest tests/test_strip_audio.py -v

License

Apache 2.0

Release history

Version	Released
0.1.2	Apr 28, 2026
0.1.1	Apr 28, 2026 (this version)
0.1.0	Apr 27, 2026

Download files

Source Distribution

dhurandhar-0.1.1.tar.gz (300.2 kB)

Uploaded Apr 28, 2026 Source

Built Distribution

dhurandhar-0.1.1-py3-none-any.whl (57.0 kB)

Uploaded Apr 28, 2026 Python 3

File details

Details for the file dhurandhar-0.1.1.tar.gz.

File metadata

Download URL: dhurandhar-0.1.1.tar.gz
Upload date: Apr 28, 2026
Size: 300.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dhurandhar-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`c74934c2ff5a5338954f36045f36ad633481b64be1da2a84b8a5bd2f7405563a`
MD5	`a7e9fd15f1ee1aa95e35a3202a7f67c7`
BLAKE2b-256	`00d98b93a9e0af71df3b8e8c28aec026229df10d3b484ef80821e40c438011c0`

Provenance

The following attestation bundles were made for dhurandhar-0.1.1.tar.gz:

Publisher: publish.yml on smarthi/dhurandhar

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dhurandhar-0.1.1.tar.gz
- Subject digest: c74934c2ff5a5338954f36045f36ad633481b64be1da2a84b8a5bd2f7405563a
- Sigstore transparency entry: 1395947685
- Sigstore integration time: Apr 28, 2026
Source repository:
- Permalink: smarthi/dhurandhar@bd9ec93199d97ca6422360f78e9872ff02d44035
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/smarthi
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@bd9ec93199d97ca6422360f78e9872ff02d44035
- Trigger Event: release

File details

Details for the file dhurandhar-0.1.1-py3-none-any.whl.

File metadata

Download URL: dhurandhar-0.1.1-py3-none-any.whl
Upload date: Apr 28, 2026
Size: 57.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dhurandhar-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d7894a193034c1116df2f054dfcceaf17d3e06d8999de4167b29af918dfae4d1`
MD5	`160c8ebe1dec7de07453da768fc546c5`
BLAKE2b-256	`4a1de6e2a2b86d2be14f2555096d752f7f309d87631a69937e75b5cab3022c22`

Provenance

The following attestation bundles were made for dhurandhar-0.1.1-py3-none-any.whl:

Publisher: publish.yml on smarthi/dhurandhar

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dhurandhar-0.1.1-py3-none-any.whl
- Subject digest: d7894a193034c1116df2f054dfcceaf17d3e06d8999de4167b29af918dfae4d1
- Sigstore transparency entry: 1395947690
- Sigstore integration time: Apr 28, 2026
Source repository:
- Permalink: smarthi/dhurandhar@bd9ec93199d97ca6422360f78e9872ff02d44035
- Branch / Tag: refs/tags/v0.1.1
- Owner: https://github.com/smarthi
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@bd9ec93199d97ca6422360f78e9872ff02d44035
- Trigger Event: release
