Command-line tool for analyzing layer-wise parameters of Hugging Face models including shapes, data types, and memory footprints

Model Inspect

A command-line tool to analyze and inspect Hugging Face model architectures directly from the repository without downloading the entire model.

Overview

Model Inspect helps you understand the structure of large language models by fetching and analyzing only the model's metadata from the Hugging Face model repository. It extracts information about model layers, their shapes, data types, and sizes without requiring a full model download.

Features

  • Inspects model architecture from Hugging Face repositories
  • Works with sharded models (multiple safetensors files)
  • Shows tensor names, shapes, data types, and memory sizes
  • Calculates total parameter count and model size
  • Supports concurrent processing for faster analysis of sharded models
  • Compatible with Hugging Face mirrors

Installation

pip install model-inspect

Or install from source:

git clone https://github.com/simpx/model-inspect.git
cd model-inspect
pip install -e .

Usage

Basic usage:

model-inspect username/model-name

Example:

model-inspect meta-llama/Llama-2-7b-hf

Command-line Options

usage: model-inspect [-h] [--revision REVISION] [-v] [-j JOBS] [--mirror MIRROR] [--timeout TIMEOUT] [--retries RETRIES] [--backoff BACKOFF] model

Hugging Face Model Layer Analyzer

positional arguments:
  model                 Hugging Face model name (e.g. 'username/model')

optional arguments:
  -h, --help            show this help message and exit
  --revision REVISION   Model revision (default: main)
  -v, --verbose         Enable verbose output (-v for process, -vv for content)
  -j JOBS, --jobs JOBS  Number of concurrent jobs (default: 1)
  --mirror MIRROR       Custom Hugging Face mirror URL (e.g. 'https://hf-mirror.com')
  --timeout TIMEOUT     Request timeout in seconds (default: 30)
  --retries RETRIES     Number of retry attempts (default: 3)
  --backoff BACKOFF     Backoff factor for retries (default: 1)
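
The help text does not spell out how `--retries` and `--backoff` combine. As a rough illustration only, the sketch below assumes the common exponential scheme where the wait before each retry is the backoff factor times a power of two. The helper names `backoff_delays` and `call_with_retries` are hypothetical and are not part of model-inspect.

```python
import time


def backoff_delays(retries=3, backoff=1.0):
    """Waits (in seconds) before each retry, assuming exponential doubling."""
    return [backoff * (2 ** attempt) for attempt in range(retries)]


def call_with_retries(func, retries=3, backoff=1.0, sleep=time.sleep):
    """Call func(); on failure wait and try again, up to `retries` extra times."""
    delays = backoff_delays(retries, backoff)
    for attempt in range(retries + 1):
        try:
            return func()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(delays[attempt])
```

Under this assumption the defaults (`--retries 3 --backoff 1`) would wait 1, 2, and 4 seconds between attempts. The actual schedule used by the tool may differ.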

Examples

Analyze a model with verbose output:

model-inspect meta-llama/Llama-2-7b-hf -v

Use a Hugging Face mirror:

model-inspect meta-llama/Llama-2-7b-hf --mirror https://hf-mirror.com
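
A mirror works because files on the Hugging Face Hub are served under a predictable path, `/{model}/resolve/{revision}/{filename}`, so only the host needs to change. The sketch below shows that URL construction. The function name `resolve_url` and its parameters are illustrative assumptions, not model-inspect's internal API.

```python
def resolve_url(model, filename, revision="main", endpoint="https://huggingface.co"):
    """Build the download URL for one file in a model repository."""
    # Strip a trailing slash so a mirror given as "https://host/" still works.
    return f"{endpoint.rstrip('/')}/{model}/resolve/{revision}/{filename}"


# The same file addressed on the default host and on a mirror.
default_url = resolve_url("meta-llama/Llama-2-7b-hf", "model.safetensors.index.json")
mirror_url = resolve_url(
    "meta-llama/Llama-2-7b-hf",
    "model.safetensors.index.json",
    endpoint="https://hf-mirror.com",
)
```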

Use multiple threads for faster processing of sharded models:

model-inspect meta-llama/Llama-2-70b-hf -j 8
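
Concurrency helps with sharded models because each shard's header can be fetched independently. The following is a minimal sketch of that idea, assuming the standard `model.safetensors.index.json` layout, whose `weight_map` maps tensor names to shard filenames. `shard_names`, `inspect_shards`, and the `fetch_header` callback are hypothetical names, and whether model-inspect uses a thread pool internally is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_names(index):
    """Unique shard filenames referenced by a safetensors index, sorted."""
    return sorted(set(index["weight_map"].values()))


def inspect_shards(index, fetch_header, jobs=1):
    """Fetch every shard's header using up to `jobs` worker threads."""
    names = shard_names(index)
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # pool.map returns results in the same order as `names`.
        headers = list(pool.map(fetch_header, names))
    return dict(zip(names, headers))
```

Header fetches are network-bound, so threads are a reasonable fit here even with Python's global interpreter lock.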

Example Output

Here's an example of running model-inspect on DeepSeek-V3:

model-inspect deepseek-ai/DeepSeek-V3

Output:

+---------------------------------------------------------------+----------------+-----------+---------------+
| Layer Name                                                    |     Shape      | Data Type |  Size (bytes) |
+---------------------------------------------------------------+----------------+-----------+---------------+
| model.embed_tokens.weight                                     | (129280, 7168) |    BF16   | 1,853,358,080 |
| model.layers.0.self_attn.q_a_proj.weight                      |  (1536, 7168)  |  F8_E4M3  |    11,010,048 |
| model.layers.0.self_attn.q_a_proj.weight_scale_inv            |    (12, 56)    |    F32    |         2,688 |
| model.layers.0.self_attn.q_a_layernorm.weight                 |    (1536,)     |    BF16   |         3,072 |
| model.layers.0.self_attn.q_b_proj.weight                      | (24576, 1536)  |  F8_E4M3  |    37,748,736 |
| model.layers.0.self_attn.q_b_proj.weight_scale_inv            |   (192, 12)    |    F32    |         9,216 |
| model.layers.0.self_attn.kv_a_proj_with_mqa.weight            |  (576, 7168)   |  F8_E4M3  |     4,128,768 |
| model.layers.0.self_attn.kv_a_proj_with_mqa.weight_scale_inv  |    (5, 56)     |    F32    |         1,120 |
| model.layers.0.self_attn.kv_a_layernorm.weight                |     (512,)     |    BF16   |         1,024 |
| model.layers.0.self_attn.kv_b_proj.weight                     |  (32768, 512)  |  F8_E4M3  |    16,777,216 |
| model.layers.0.self_attn.kv_b_proj.weight_scale_inv           |    (256, 4)    |    F32    |         4,096 |
| model.layers.0.self_attn.o_proj.weight                        | (7168, 16384)  |  F8_E4M3  |   117,440,512 |
| model.layers.0.self_attn.o_proj.weight_scale_inv              |   (56, 128)    |    F32    |        28,672 |
| model.layers.0.mlp.gate_proj.weight                           | (18432, 7168)  |  F8_E4M3  |   132,120,576 |
| model.layers.0.mlp.gate_proj.weight_scale_inv                 |   (144, 56)    |    F32    |        32,256 |
| model.layers.0.mlp.up_proj.weight                             | (18432, 7168)  |  F8_E4M3  |   132,120,576 |
| model.layers.0.mlp.up_proj.weight_scale_inv                   |   (144, 56)    |    F32    |        32,256 |
| model.layers.0.mlp.down_proj.weight                           | (7168, 18432)  |  F8_E4M3  |   132,120,576 |
| model.layers.0.mlp.down_proj.weight_scale_inv                 |   (56, 144)    |    F32    |        32,256 |
| model.layers.0.input_layernorm.weight                         |    (7168,)     |    BF16   |        14,336 |
| model.layers.0.post_attention_layernorm.weight                |    (7168,)     |    BF16   |        14,336 |
| model.layers.1.self_attn.q_a_proj.weight                      |  (1536, 7168)  |  F8_E4M3  |    11,010,048 |
| model.layers.1.self_attn.q_a_proj.weight_scale_inv            |    (12, 56)    |    F32    |         2,688 |
| ... |
| model.layers.61.input_layernorm.weight                        |    (7168,)     |    BF16   |        14,336 |
| model.layers.61.post_attention_layernorm.weight               |    (7168,)     |    BF16   |        14,336 |
| model.layers.61.embed_tokens.weight                           | (129280, 7168) |    BF16   | 1,853,358,080 |
| model.layers.61.enorm.weight                                  |    (7168,)     |    BF16   |        14,336 |
| model.layers.61.hnorm.weight                                  |    (7168,)     |    BF16   |        14,336 |
| model.layers.61.eh_proj.weight                                | (7168, 14336)  |    BF16   |   205,520,896 |
| model.layers.61.shared_head.norm.weight                       |    (7168,)     |    BF16   |        14,336 |
| model.layers.61.shared_head.head.weight                       | (129280, 7168) |    BF16   | 1,853,358,080 |
+---------------------------------------------------------------+----------------+-----------+---------------+

Total Layers: 91991

Total Parameters Size: 688,574,839,360 bytes (656676.14 MB)
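
The sizes in the table can be reproduced by hand: each is the product of the shape's dimensions times the byte width of the data type. The sketch below checks a few rows from the output above. The `DTYPE_BYTES` table covers only the types shown here, and the function names are illustrative, not the tool's own.

```python
# Bytes per element for the data types that appear in the table above.
DTYPE_BYTES = {"F32": 4, "BF16": 2, "F8_E4M3": 1}


def tensor_bytes(shape, dtype):
    """Size of a tensor in bytes: element count times bytes per element."""
    count = 1
    for dim in shape:
        count *= dim
    return count * DTYPE_BYTES[dtype]


def format_total(total_bytes):
    """Format a byte total like the summary line (divides by 1024 * 1024)."""
    return f"{total_bytes:,} bytes ({total_bytes / 1024 ** 2:.2f} MB)"


embed = tensor_bytes((129280, 7168), "BF16")    # model.embed_tokens.weight
q_a_proj = tensor_bytes((1536, 7168), "F8_E4M3")
q_a_scale = tensor_bytes((12, 56), "F32")
```

Note that the "MB" figure in the summary matches a division by 1024 × 1024, so it is a count of mebibytes.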

How It Works

Model Inspect works by:

  1. Fetching the model's safetensors index file (if available, for sharded models)
  2. Reading only the headers of the safetensors files to extract tensor metadata
  3. Computing tensor sizes based on shapes and data types
  4. Presenting a summary table of all tensors and their properties

This approach never downloads the model weights themselves, so it is much faster and uses far less bandwidth and disk space than fetching the entire model.
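
Step 2 is possible because a safetensors file begins with an 8-byte little-endian length followed by a JSON header of that many bytes, so two small HTTP Range requests are enough. The sketch below illustrates the technique using only the standard library (model-inspect itself lists `requests` as a dependency). The names `read_range` and `fetch_header` are hypothetical.

```python
import json
import struct
import urllib.request


def read_range(url, start, end, timeout=30):
    """Download bytes start..end (inclusive) of a remote file via a Range request."""
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def fetch_header(url, read=read_range):
    """Return {tensor_name: {"dtype", "shape", "data_offsets"}} for one file."""
    # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
    (header_len,) = struct.unpack("<Q", read(url, 0, 7))
    raw = read(url, 8, 8 + header_len - 1)
    header = json.loads(raw.decode("utf-8"))
    # "__metadata__" holds free-form file metadata, not a tensor entry.
    header.pop("__metadata__", None)
    return header
```

Passing the reader in as a parameter keeps the parsing logic testable without network access, for example by slicing an in-memory byte string.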

Requirements

  • Python 3.7+
  • requests
  • prettytable
  • tqdm

License

MIT License
