Publication-quality LLM architecture diagrams, fact sheets, and diffs from HuggingFace config.json — no weights needed

Maintainers

Hesham34

Project description

llmviz

Publication-quality LLM architecture figures from config.json alone — no weights, no GPU, no transformers install.

Point it at any model — a Hugging Face id, a local GGUF file, or a model installed in Ollama — and get a hand-drawn-quality architecture figure in the visual language of Sebastian Raschka's LLM Architecture Gallery: the decoder tower with dotted-leader callouts, the MoE router inset, the SwiGLU module, and parameter counts computed from the config, not scraped from a model card.

[Figure: DeepSeek-V3 architecture]

Why

Hand-drawn galleries are wonderful and update episodically. llmviz generates the same figure in milliseconds, for any model, the day its config lands on the Hub — including architectures that didn't exist when this tool was written. The parser is generic-first (field-name synonyms, capability detection, graceful degradation), so hybrid Mamba mixers, linear attention, MLA, sandwich norms, and whatever ships next all render correctly or degrade honestly.

The numbers are the point: total and active parameters, KV-cache bytes per token, and VRAM footprints are reconstructed from per-layer math. The test suite pins them against published figures — Llama-3-8B at 8.03B, DeepSeek-V3 at 671B/37.5B, Qwen3-235B-A22B at 235B/22B — within 0.5–3%. When Kimi-Linear-48B-A3B (a model the code was never tuned for) parses to 48.9B total / 3.3B active, you know the math is doing the work.
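As a rough illustration of what "per-layer math" means, the sketch below counts the parameters of a dense Llama-style decoder from a handful of config.json fields. It is not llmviz's own code: the function name is hypothetical, and it assumes bias-free projections, a gated (SwiGLU) MLP, and two RMSNorm weights per layer plus one final norm. The values are those of the Llama-3-8B config.

```python
def dense_decoder_params(cfg: dict) -> int:
    """Count parameters of a Llama-style dense decoder (hypothetical helper)."""
    h = cfg["hidden_size"]
    n_heads = cfg["num_attention_heads"]
    n_kv = cfg["num_key_value_heads"]
    head_dim = h // n_heads
    inter = cfg["intermediate_size"]

    attn = h * n_heads * head_dim       # q_proj
    attn += 2 * h * n_kv * head_dim     # k_proj + v_proj (shrunk by GQA)
    attn += n_heads * head_dim * h      # o_proj
    mlp = 3 * h * inter                 # gate, up, and down projections
    norms = 2 * h                       # two RMSNorm weight vectors per layer
    per_layer = attn + mlp + norms

    embed = cfg["vocab_size"] * h
    # A tied LM head reuses the embedding matrix and adds nothing.
    lm_head = 0 if cfg.get("tie_word_embeddings", False) else embed
    final_norm = h
    return cfg["num_hidden_layers"] * per_layer + embed + lm_head + final_norm


llama3_8b = {
    "vocab_size": 128256,
    "hidden_size": 4096,
    "intermediate_size": 14336,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "tie_word_embeddings": False,
}

total = dense_decoder_params(llama3_8b)
print(f"{total / 1e9:.2f}B")  # 8.03B
```

MoE, MLA, and hybrid mixers need extra terms per layer, but the principle is the same: every number comes from a config field.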

Install

pip install llmviz                # SVG figures
pip install "llmviz[png]"         # + PNG export (cairosvg)
pip install "llmviz[explain]"     # + LLM-written notes (LiteLLM: Ollama, llama.cpp, any provider)
pip install "llmviz[mcp]"         # + MCP server for agents

Sixty seconds

llmviz render deepseek-ai/DeepSeek-V3            # the figure above → DeepSeek-V3.svg
llmviz render ollama:deepseek-r1                 # a model installed in YOUR Ollama
llmviz inspect ./model-q4.gguf                   # any local or remote .gguf (header-only read)
llmviz fit Qwen/Qwen3-235B-A22B -c 131072        # can I run it? fp16/q8/q4 + your GPU verdict
llmviz diff deepseek-ai/DeepSeek-V3 NousResearch/Meta-Llama-3-8B

Commands

Command	What it produces
`render <model>`	The architecture figure (SVG/PNG). `--animate` adds a staggered build-up.
`diff <a> <b>`	Two towers side by side + a comparison table with every difference flagged
`lineage <m1> <m2> …`	Family evolution strip with per-generation "what changed" deltas
`card <model>`	1200×630 social card — headline stats + the tower, share-ready
`poster models.yaml`	Print-ready grid of towers on one sheet (`--cols`, `--title`)
`gallery models.yaml`	Self-contained static HTML gallery with search/sort. `--space user/name` deploys it to a free HF Space
`watch`	Gallery of the Hub's trending models right now — pair with `--space` on a cron for a self-updating public gallery
`inspect <model>`	The normalized fact sheet as a terminal table
`fit <model>`	Quantization-aware memory needs (weights + KV cache) and which GPUs fit — detects your local GPU via `nvidia-smi`
`explain <model>`	Five LLM-written notes on what's architecturally notable — local-first via Ollama
`mcp`	MCP server (stdio): `inspect_architecture`, `memory_to_run`, `render_architecture_figure`, `diff_architectures` as agent tools
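The arithmetic behind a `fit`-style estimate can be sketched in a few lines. This is an approximation under stated assumptions, not the tool's implementation: the function names are hypothetical, quantized formats are treated as exactly 8 or 4 bits per weight (real quantized files also store scale data), the KV cache is held in fp16, activations and runtime overhead are ignored, and the 24 GiB budget is just an example.

```python
# Assumed storage cost per weight for each format (an idealization).
BYTES_PER_WEIGHT = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}


def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """One K and one V vector per KV head, per layer, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem


def memory_needed_gib(total_params, context_len, n_layers, n_kv_heads, head_dim):
    """Weights plus a full-context KV cache, in GiB, for each format."""
    kv_bytes = kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim) * context_len
    return {
        name: (total_params * per_weight + kv_bytes) / 2**30
        for name, per_weight in BYTES_PER_WEIGHT.items()
    }


# Llama-3-8B shape: 32 layers, 8 KV heads, head dimension 128.
per_token = kv_cache_bytes_per_token(32, 8, 128)
print(per_token)  # 131072

needs = memory_needed_gib(8_030_261_248, 8192, 32, 8, 128)
gpu_budget_gib = 24  # hypothetical GPU
for name, gib in needs.items():
    verdict = "fits" if gib <= gpu_budget_gib else "does not fit"
    print(f"{name}: {gib:.1f} GiB ({verdict})")
```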

Every command accepts a Hugging Face id (org/name), a local config.json path, a .gguf file or URL, or ollama:<name>. Gated repos (meta-llama, google) need --token or hf auth login.
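One plausible way to tell those source forms apart is sketched below. The function, its return labels, and the order of the checks are all hypothetical; the real resolver in llmviz may work differently.

```python
def classify_source(source: str) -> str:
    """Guess which kind of model source a string names (hypothetical helper)."""
    if source.startswith("ollama:"):
        return "ollama"
    # Ignore any query string before checking the file extension.
    if source.split("?")[0].lower().endswith(".gguf"):
        is_url = source.startswith(("http://", "https://"))
        return "gguf-url" if is_url else "gguf-file"
    if source.lower().endswith(".json"):
        return "local-config"
    # Anything else is treated as a Hugging Face id such as org/name.
    return "hf-id"


for example in (
    "deepseek-ai/DeepSeek-V3",
    "./config.json",
    "./model-q4.gguf",
    "ollama:deepseek-r1",
):
    print(example, "->", classify_source(example))
```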

[Figure: DeepSeek-V3 vs Llama-3-8B]

What it reads

Signal	Source fields
MHA / GQA / MQA	`num_key_value_heads` vs `num_attention_heads`, `multi_query`
MLA (DeepSeek-style latent KV)	`kv_lora_rank`, `q_lora_rank`, decoupled-RoPE head dims
MoE	`num_experts` / `n_routed_experts` / `num_local_experts`, five spellings of top-k, shared experts, leading dense layers
Hybrid token mixers	`layer_types`, `linear_attn_config`, `full_attn_idxs`, `attn_type_list`, `mamba_*` — summarized as e.g. "20 linear-attention (KDA) : 7 full attention layers"
Norm placement	pre (default), post (OLMo-2), sandwich (Gemma) — drawn structurally
The rest	sliding windows and local:global ratios, QK-norm, RoPE θ / ALiBi / learned, tied embeddings, activation
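The first three rows of the table can be approximated with a few dictionary lookups. The sketch below is an illustration of synonym lookup and capability detection, not the parser itself: both helper names are hypothetical, and the precedence (MLA first, then the `multi_query` flag, then head counts) is an assumption.

```python
# Spellings of "number of routed experts" listed in the table above.
EXPERT_COUNT_KEYS = ("num_experts", "n_routed_experts", "num_local_experts")


def first_present(cfg: dict, keys, default=None):
    """Return the value of the first key that is set, else the default."""
    for key in keys:
        if cfg.get(key) is not None:
            return cfg[key]
    return default


def attention_kind(cfg: dict) -> str:
    """Classify attention as MLA, MQA, GQA, or MHA from config fields."""
    if cfg.get("kv_lora_rank"):
        return "MLA"  # latent KV compression, DeepSeek-style
    n_heads = cfg["num_attention_heads"]
    # A missing num_key_value_heads conventionally means one KV head per query head.
    n_kv = cfg.get("num_key_value_heads") or n_heads
    if cfg.get("multi_query") or (n_kv == 1 and n_heads > 1):
        return "MQA"
    return "MHA" if n_kv == n_heads else "GQA"


print(attention_kind({"num_attention_heads": 32, "num_key_value_heads": 8}))  # GQA
print(first_present({"n_routed_experts": 256}, EXPERT_COUNT_KEYS))            # 256
```

Returning a default instead of raising when no spelling matches is what lets an unknown architecture degrade to a plain dense figure rather than fail.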

GGUF sources are read header-only (a few MB, never the weights), including vocab size recovered from the tokenizer array length — so llmviz render ollama:qwen2.5-coder:14b diagrams a 9 GB model in under a second. For remote GGUF URLs only the metadata bytes are fetched via ranged HTTP.
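To show why a header-only read is cheap, the sketch below parses the fixed-size start of a GGUF file and builds an HTTP request for just the first bytes of a remote one. It assumes GGUF version 2 or later (little-endian, 64-bit counts); the function names are hypothetical, the sample bytes are synthetic, and decoding the metadata key/value pairs that follow the header is left out.

```python
import struct
import urllib.request

GGUF_MAGIC = b"GGUF"


def parse_gguf_header(data: bytes) -> dict:
    """Read magic, version, tensor count, and metadata KV count."""
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    # uint32 version, uint64 tensor count, uint64 metadata KV count.
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


def ranged_request(url: str, n_bytes: int) -> urllib.request.Request:
    """Build a request for only the first n_bytes of a remote file."""
    return urllib.request.Request(url, headers={"Range": f"bytes=0-{n_bytes - 1}"})


# Synthetic 20-byte header standing in for the start of a real file.
sample = GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24)
header = parse_gguf_header(sample)
print(header)  # {'version': 3, 'tensor_count': 291, 'metadata_kv_count': 24}
```

A server that honors the `Range` header answers with status 206 and only the requested bytes, so the multi-gigabyte tensor data is never transferred.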

Counting convention: "active" means every parameter touched in a forward pass, including embeddings and the LM head — some vendors report actives excluding the unembedding, so their number may read slightly lower.
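For an MoE model, the convention works out roughly as follows. The helper and its toy numbers are illustrative only; `always_on` stands for everything every token passes through (embeddings, LM head, attention, norms, routers, shared experts, and any dense layers).

```python
def moe_param_counts(always_on, per_expert, n_routed_experts, top_k, n_moe_layers):
    """Return (total, active) parameters for a simple MoE stack."""
    # Total counts every routed expert in every MoE layer.
    total = always_on + n_moe_layers * n_routed_experts * per_expert
    # Active counts only the top_k experts a token is routed to.
    active = always_on + n_moe_layers * top_k * per_expert
    return total, active


# Toy numbers, not taken from any real model.
total, active = moe_param_counts(
    always_on=2_000_000,
    per_expert=1_000_000,
    n_routed_experts=8,
    top_k=2,
    n_moe_layers=4,
)
print(total, active)  # 34000000 10000000
```

Dropping the LM head from `always_on` is the kind of change that makes a vendor's active figure read slightly lower than the one reported here.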

`explain` providers

explain is local-first through LiteLLM:

llmviz explain zai-org/GLM-4.5-Air                                    # local Ollama (auto-picks an installed model)
llmviz explain <m> --llm openai/local --api-base http://localhost:8080/v1   # llama.cpp server
llmviz explain <m> --llm groq/llama-3.3-70b-versatile                 # any hosted LiteLLM provider
export LLMVIZ_LLM="ollama/deepseek-r1:latest"                         # set your default

Reasoning models (DeepSeek-R1, Qwen3) are handled — thinking is stripped, the answer is kept.
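A minimal sketch of that stripping step, assuming the model wraps its reasoning in `<think>` tags as DeepSeek-R1 and Qwen3 do. The function name and sample text are hypothetical, and llmviz may handle this differently.

```python
import re

# Non-greedy match so several think blocks are removed independently;
# DOTALL lets the match span newlines.
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_thinking(text: str) -> str:
    """Drop <think>...</think> sections and keep the final answer."""
    return THINK_BLOCK.sub("", text).strip()


raw = "<think>\nCount the experts first...\n</think>\n1. Uses MLA instead of GQA."
print(strip_thinking(raw))  # 1. Uses MLA instead of GQA.
```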

MCP

Give any agent architecture facts computed from configs instead of recalled from training data:

{"mcpServers": {"llmviz": {"command": "llmviz", "args": ["mcp"]}}}

Python API

from llmviz.fetch import load_spec
from llmviz.render.block import render_model

spec = load_spec("Qwen/Qwen3-235B-A22B")      # or "ollama:deepseek-r1", "./model.gguf"
# Headline numbers and the detected attention variant
print(spec.total_params, spec.active_params, spec.attention.kind, spec.hybrid_note)
svg = render_model(spec)

ArchSpec is a Pydantic model — spec.model_dump_json() gives you the normalized architecture for your own tooling.
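Downstream tooling can then work on plain JSON without importing llmviz. The sketch below uses a hand-written stand-in for the dumped document: the field names mirror the attributes used above, the nesting of `attention.kind` is an assumption, the values are made up, and the real schema very likely contains more fields.

```python
import json

# Hand-written stand-in for spec.model_dump_json(); values are illustrative.
dumped = (
    '{"total_params": 1000, "active_params": 250,'
    ' "attention": {"kind": "GQA"}, "hybrid_note": null}'
)


def summarize(dumped_json: str) -> str:
    """One-line summary from a dumped spec (hypothetical helper)."""
    data = json.loads(dumped_json)
    ratio = data["active_params"] / data["total_params"]
    return f'{data["attention"]["kind"]}, {ratio:.0%} of parameters active per token'


print(summarize(dumped))  # GQA, 25% of parameters active per token
```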

Development

git clone https://github.com/h9-tec/llmviz && cd llmviz
python -m venv .venv && .venv/bin/pip install -e ".[dev,png]"
.venv/bin/pytest                              # offline; fixtures are real Hub configs
.venv/bin/ruff check src tests

Tests treat published parameter counts as ground truth: if the per-layer math doesn't reproduce a model's documented size, the parser is wrong. CI runs on Python 3.11 and 3.12; a nightly workflow rebuilds the trending gallery and deploys it to a Hugging Face Space; tagging v* publishes to PyPI via trusted publishing.
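A test in that spirit might look like the sketch below. It is not copied from the repository: the helper and test names are hypothetical, the 3% tolerance is the upper end of the range quoted earlier in this README, and the computed value is the parameter count of the Llama-3-8B config under the usual dense-decoder formula.

```python
import math

# Published figure for Llama-3-8B as quoted in this README.
PUBLISHED = {"Llama-3-8B": 8.03e9}


def matches_published(computed: float, published: float, rel_tol: float = 0.03) -> bool:
    """True when the computed count is within rel_tol of the published one."""
    return math.isclose(computed, published, rel_tol=rel_tol)


def test_llama3_8b_total():
    computed = 8_030_261_248  # from per-layer math on the real config
    assert matches_published(computed, PUBLISHED["Llama-3-8B"])


test_llama3_8b_total()
```

Because the tolerance is relative, the same check scales from an 8B dense model to a 671B MoE without per-model thresholds.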

Acknowledgements

The visual language is a faithful implementation of Sebastian Raschka's LLM Architecture Gallery figures (sebastianraschka.com/llm-architecture-gallery) — colors were sampled from his published figures with admiration. If you want the hand-crafted originals with his commentary, go read Ahead of AI.

License

Apache-2.0

Release history

0.1.1 (Jul 4, 2026)
0.1.0 (Jul 4, 2026, this version)

Download files

Source Distribution

llmviz-0.1.0.tar.gz (51.6 kB)

Uploaded Jul 4, 2026 Source

Built Distribution

llmviz-0.1.0-py3-none-any.whl (43.6 kB)

Uploaded Jul 4, 2026 Python 3

File details

Details for the file llmviz-0.1.0.tar.gz.

File metadata

Download URL: llmviz-0.1.0.tar.gz
Upload date: Jul 4, 2026
Size: 51.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llmviz-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`37352f7e103b0fea5b67098bd410fc43dcf7f0d3a828e7f1a600205d70bb7202`
MD5	`73121c35a1130c3ce6a495c0c92feaae`
BLAKE2b-256	`e584441fb41a981a31cef2a33fdc7900ebde98b0eb72798b8841eb90abbbdaef`

Provenance

The following attestation bundles were made for llmviz-0.1.0.tar.gz:

Publisher: release.yml on h9-tec/llmviz

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llmviz-0.1.0.tar.gz
- Subject digest: 37352f7e103b0fea5b67098bd410fc43dcf7f0d3a828e7f1a600205d70bb7202
- Sigstore transparency entry: 2071062484
- Sigstore integration time: Jul 4, 2026
Source repository:
- Permalink: h9-tec/llmviz@63748d318c4dedbc01e33a94c60eb4f4feb3bc47
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/h9-tec
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@63748d318c4dedbc01e33a94c60eb4f4feb3bc47
- Trigger Event: push

File details

Details for the file llmviz-0.1.0-py3-none-any.whl.

File metadata

Download URL: llmviz-0.1.0-py3-none-any.whl
Upload date: Jul 4, 2026
Size: 43.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for llmviz-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1a0df5b6cfd90263ce7dadc49fb7c2e6074fd05568beabd8a5fe10d31d0c9a01`
MD5	`eff77e153ff84f184740e94877512670`
BLAKE2b-256	`529d3f1b86eda0f42b2dbdc286c16f72bb2ebc4b29e414ff13a12dcd65b72da4`

Provenance

The following attestation bundles were made for llmviz-0.1.0-py3-none-any.whl:

Publisher: release.yml on h9-tec/llmviz

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: llmviz-0.1.0-py3-none-any.whl
- Subject digest: 1a0df5b6cfd90263ce7dadc49fb7c2e6074fd05568beabd8a5fe10d31d0c9a01
- Sigstore transparency entry: 2071062569
- Sigstore integration time: Jul 4, 2026
Source repository:
- Permalink: h9-tec/llmviz@63748d318c4dedbc01e33a94c60eb4f4feb3bc47
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/h9-tec
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@63748d318c4dedbc01e33a94c60eb4f4feb3bc47
- Trigger Event: push
