Command-line tool for analyzing layer-wise parameters of Hugging Face models including shapes, data types, and memory footprints

Model Inspect

A command-line tool to analyze and inspect Hugging Face model architectures directly from the repository without downloading the entire model.

Overview

Model Inspect helps you understand the structure of large language models by fetching and analyzing only the model's metadata from the Hugging Face model repository. It extracts information about model layers, their shapes, data types, and sizes without requiring a full model download.

Features

  • Inspects model architecture from Hugging Face repositories
  • Works with sharded models (multiple safetensors files)
  • Shows tensor names, shapes, data types, and memory sizes
  • Calculates total parameter count and model size
  • Supports concurrent processing for faster analysis of sharded models
  • Compatible with Hugging Face mirrors

Installation

pip install model-inspect

Or install from source:

git clone https://github.com/simpx/model-inspect.git
cd model-inspect
pip install -e .

Usage

Basic usage:

model-inspect username/model-name

Example:

model-inspect meta-llama/Llama-2-7b-hf

Command-line Options

usage: model-inspect [-h] [--revision REVISION] [-v] [-j JOBS] [--mirror MIRROR] [--timeout TIMEOUT] [--retries RETRIES] [--backoff BACKOFF] model

Hugging Face Model Layer Analyzer

positional arguments:
  model                 Hugging Face model name (e.g. 'username/model')

optional arguments:
  -h, --help            show this help message and exit
  --revision REVISION   Model revision (default: main)
  -v, --verbose         Enable verbose output (-v for process, -vv for content)
  -j JOBS, --jobs JOBS  Number of concurrent jobs (default: 1)
  --mirror MIRROR       Custom Hugging Face mirror URL (e.g. 'https://hf-mirror.com')
  --timeout TIMEOUT     Request timeout in seconds (default: 30)
  --retries RETRIES     Number of retry attempts (default: 3)
  --backoff BACKOFF     Backoff factor for retries (default: 1)
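
The help text does not say how `--retries` and `--backoff` interact. A common convention, assumed here and not confirmed for model-inspect, is exponential backoff: the wait before retry n is `backoff * 2**(n-1)` seconds. The sketch below illustrates that convention only; `retry_delays` and `call_with_retries` are hypothetical names, not part of the tool.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_delays(retries: int, backoff: float) -> List[float]:
    """Seconds to wait before each retry, ASSUMING exponential backoff."""
    return [backoff * (2 ** attempt) for attempt in range(retries)]


def call_with_retries(func: Callable[[], T], retries: int = 3,
                      backoff: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func once, then retry up to `retries` times on OSError."""
    delays = retry_delays(retries, backoff)
    for attempt in range(retries + 1):
        try:
            return func()
        except OSError:
            if attempt == retries:
                raise  # retries exhausted: surface the last error
            sleep(delays[attempt])
    raise AssertionError("unreachable")


# With the documented defaults (3 retries, backoff factor 1):
print(retry_delays(3, 1.0))
```

Under this assumption, a larger `--backoff` value spaces retries further apart, which can help with a rate-limited mirror.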

Examples

Analyze a model with verbose output:

model-inspect meta-llama/Llama-2-7b-hf -v

Use a Hugging Face mirror:

model-inspect meta-llama/Llama-2-7b-hf --mirror https://hf-mirror.com
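
To show what a mirror changes, here is a minimal sketch of how a file URL can be built from the model name, revision, and endpoint. It uses the Hugging Face Hub's `resolve` path layout and assumes the mirror serves the same layout; `resolve_url` and `DEFAULT_ENDPOINT` are hypothetical names, and model-inspect may construct its URLs differently.

```python
DEFAULT_ENDPOINT = "https://huggingface.co"


def resolve_url(model: str, filename: str, revision: str = "main",
                endpoint: str = DEFAULT_ENDPOINT) -> str:
    """Build {endpoint}/{model}/resolve/{revision}/{filename}."""
    return f"{endpoint.rstrip('/')}/{model}/resolve/{revision}/{filename}"


# Same file, default endpoint versus the mirror from the example above.
print(resolve_url("meta-llama/Llama-2-7b-hf", "model.safetensors.index.json"))
print(resolve_url("meta-llama/Llama-2-7b-hf", "model.safetensors.index.json",
                  endpoint="https://hf-mirror.com"))
```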

Use multiple concurrent jobs for faster processing of sharded models:

model-inspect meta-llama/Llama-2-70b-hf -j 8
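
Concurrency helps with sharded models because each shard's header can be read independently. The sketch below shows one way to do this with a thread pool, assuming the shard list comes from the `weight_map` of a `model.safetensors.index.json` file. `shard_files`, `inspect_shards`, and the stub reader are hypothetical; this is not the tool's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def shard_files(index: Dict) -> List[str]:
    """Unique shard file names referenced by the index's weight_map."""
    return sorted(set(index["weight_map"].values()))


def inspect_shards(shards: List[str], read_header: Callable[[str], Dict],
                   jobs: int = 1) -> Dict[str, Dict]:
    """Read each shard's header using up to `jobs` worker threads."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        headers = list(pool.map(read_header, shards))  # map keeps input order
    return dict(zip(shards, headers))


# Offline demo with a made-up index and a stub in place of a network read.
index = {"weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
    "model.norm.weight": "model-00002-of-00002.safetensors",
    "lm_head.weight": "model-00002-of-00002.safetensors",
}}
shards = shard_files(index)
result = inspect_shards(shards, lambda name: {"shard": name}, jobs=8)
print(shards)
```

Threads suit this workload because the time is spent waiting on network responses rather than on computation.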

Example Output

Here's an example of running model-inspect on DeepSeek-V3:

model-inspect deepseek-ai/DeepSeek-V3

Output:

+---------------------------------------------------------------+----------------+-----------+---------------+
| Layer Name                                                    |     Shape      | Data Type |  Size (bytes) |
+---------------------------------------------------------------+----------------+-----------+---------------+
| model.embed_tokens.weight                                     | (129280, 7168) |    BF16   | 1,853,358,080 |
| model.layers.0.self_attn.q_a_proj.weight                      |  (1536, 7168)  |  F8_E4M3  |    11,010,048 |
| model.layers.0.self_attn.q_a_proj.weight_scale_inv            |    (12, 56)    |    F32    |         2,688 |
| model.layers.0.self_attn.q_a_layernorm.weight                 |    (1536,)     |    BF16   |         3,072 |
| model.layers.0.self_attn.q_b_proj.weight                      | (24576, 1536)  |  F8_E4M3  |    37,748,736 |
| model.layers.0.self_attn.q_b_proj.weight_scale_inv            |   (192, 12)    |    F32    |         9,216 |
| model.layers.0.self_attn.kv_a_proj_with_mqa.weight            |  (576, 7168)   |  F8_E4M3  |     4,128,768 |
| model.layers.0.self_attn.kv_a_proj_with_mqa.weight_scale_inv  |    (5, 56)     |    F32    |         1,120 |
| model.layers.0.self_attn.kv_a_layernorm.weight                |     (512,)     |    BF16   |         1,024 |
| model.layers.0.self_attn.kv_b_proj.weight                     |  (32768, 512)  |  F8_E4M3  |    16,777,216 |
| model.layers.0.self_attn.kv_b_proj.weight_scale_inv           |    (256, 4)    |    F32    |         4,096 |
| model.layers.0.self_attn.o_proj.weight                        | (7168, 16384)  |  F8_E4M3  |   117,440,512 |
| model.layers.0.self_attn.o_proj.weight_scale_inv              |   (56, 128)    |    F32    |        28,672 |
| model.layers.0.mlp.gate_proj.weight                           | (18432, 7168)  |  F8_E4M3  |   132,120,576 |
| model.layers.0.mlp.gate_proj.weight_scale_inv                 |   (144, 56)    |    F32    |        32,256 |
| model.layers.0.mlp.up_proj.weight                             | (18432, 7168)  |  F8_E4M3  |   132,120,576 |
| model.layers.0.mlp.up_proj.weight_scale_inv                   |   (144, 56)    |    F32    |        32,256 |
| model.layers.0.mlp.down_proj.weight                           | (7168, 18432)  |  F8_E4M3  |   132,120,576 |
| model.layers.0.mlp.down_proj.weight_scale_inv                 |   (56, 144)    |    F32    |        32,256 |
| model.layers.0.input_layernorm.weight                         |    (7168,)     |    BF16   |        14,336 |
| model.layers.0.post_attention_layernorm.weight                |    (7168,)     |    BF16   |        14,336 |
| model.layers.1.self_attn.q_a_proj.weight                      |  (1536, 7168)  |  F8_E4M3  |    11,010,048 |
| model.layers.1.self_attn.q_a_proj.weight_scale_inv            |    (12, 56)    |    F32    |         2,688 |
| ... |
| model.layers.61.input_layernorm.weight                        |    (7168,)     |    BF16   |        14,336 |
| model.layers.61.post_attention_layernorm.weight               |    (7168,)     |    BF16   |        14,336 |
| model.layers.61.embed_tokens.weight                           | (129280, 7168) |    BF16   | 1,853,358,080 |
| model.layers.61.enorm.weight                                  |    (7168,)     |    BF16   |        14,336 |
| model.layers.61.hnorm.weight                                  |    (7168,)     |    BF16   |        14,336 |
| model.layers.61.eh_proj.weight                                | (7168, 14336)  |    BF16   |   205,520,896 |
| model.layers.61.shared_head.norm.weight                       |    (7168,)     |    BF16   |        14,336 |
| model.layers.61.shared_head.head.weight                       | (129280, 7168) |    BF16   | 1,853,358,080 |
+---------------------------------------------------------------+----------------+-----------+---------------+

Total Layers: 91991

Total Parameters Size: 688,574,839,360 bytes (656676.14 MB)
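
The Size column can be checked by hand: each value is the number of elements in the tensor multiplied by the bytes per element for its data type. The sketch below reproduces a few rows from the table above. `tensor_size` and `BYTES_PER_ELEMENT` are illustrative names, and the mapping covers only the three data types shown in this output.

```python
# Bytes per element for the data types that appear in the table above.
BYTES_PER_ELEMENT = {"BF16": 2, "F8_E4M3": 1, "F32": 4}


def tensor_size(shape, dtype):
    """Size in bytes = number of elements * bytes per element."""
    elements = 1
    for dim in shape:
        elements *= dim
    return elements * BYTES_PER_ELEMENT[dtype]


rows = [
    ("model.embed_tokens.weight", (129280, 7168), "BF16"),
    ("model.layers.0.self_attn.q_a_proj.weight", (1536, 7168), "F8_E4M3"),
    ("model.layers.0.self_attn.q_a_proj.weight_scale_inv", (12, 56), "F32"),
]
for name, shape, dtype in rows:
    print(f"{name}: {tensor_size(shape, dtype):,}")

# The reported total matches when "MB" is read as 2**20 bytes.
total_bytes = 688_574_839_360
print(f"{total_bytes / 2**20:.2f} MB")
```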

How It Works

Model Inspect works by:

  1. Fetching the model's safetensors index file (if available, for sharded models)
  2. Reading only the headers of the safetensors files to extract tensor metadata
  3. Computing tensor sizes based on shapes and data types
  4. Presenting a summary table of all tensors and their properties
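
Step 2 relies on the safetensors file layout: the first 8 bytes hold the header length as a little-endian unsigned 64-bit integer, followed by that many bytes of JSON describing each tensor's `dtype`, `shape`, and `data_offsets`. The sketch below parses such a header from bytes. The function names and the tiny in-memory file are made up for illustration and are not taken from model-inspect.

```python
import json
import struct


def parse_safetensors_header(data: bytes) -> dict:
    """Parse the JSON header at the start of a safetensors file.

    Layout: 8-byte little-endian unsigned length N, then N bytes of JSON.
    """
    (header_len,) = struct.unpack("<Q", data[:8])
    header = json.loads(data[8:8 + header_len].decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata, not a tensor
    return header


def build_example_file() -> bytes:
    """Build a tiny in-memory safetensors file with one hypothetical tensor."""
    header = {"w": {"dtype": "F32", "shape": [2, 3], "data_offsets": [0, 24]}}
    raw = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(raw)) + raw + bytes(24)  # 24 zero bytes of "weights"


tensors = parse_safetensors_header(build_example_file())
for name, info in tensors.items():
    begin, end = info["data_offsets"]
    print(name, tuple(info["shape"]), info["dtype"], end - begin)
```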

This approach avoids downloading the actual model weights, making it much faster and more resource-efficient than downloading the entire model.
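
One way to read only the header of a remote file is an HTTP Range request. The sketch below assumes the server honors `Range` and uses the standard library's `urllib` instead of `requests`, so it is not necessarily how model-inspect does it. `http_range_reader`, `fetch_header`, and `memory_reader` are hypothetical names. The demo runs offline against an in-memory file and counts the bytes actually read.

```python
import json
import struct
import urllib.request
from typing import Callable, List

ReadRange = Callable[[int, int], bytes]  # (first_byte, last_byte), both inclusive


def http_range_reader(url: str, timeout: float = 30.0) -> ReadRange:
    """Return a reader that fetches byte ranges with HTTP Range requests."""
    def read(first: int, last: int) -> bytes:
        request = urllib.request.Request(
            url, headers={"Range": f"bytes={first}-{last}"})
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    return read


def fetch_header(read_range: ReadRange) -> dict:
    """Fetch only the 8-byte length prefix and the JSON header."""
    (header_len,) = struct.unpack("<Q", read_range(0, 7))
    return json.loads(read_range(8, 8 + header_len - 1))


def memory_reader(blob: bytes, log: List[int]) -> ReadRange:
    """Serve ranges from memory and record how many bytes each read returned."""
    def read(first: int, last: int) -> bytes:
        chunk = blob[first:last + 1]
        log.append(len(chunk))
        return chunk
    return read


# Offline demo: a fake file with a 1 MiB payload that is never read.
raw = json.dumps({"w": {"dtype": "BF16", "shape": [512, 1024],
                        "data_offsets": [0, 1048576]}}).encode("utf-8")
blob = struct.pack("<Q", len(raw)) + raw + bytes(1048576)
reads: List[int] = []
header = fetch_header(memory_reader(blob, reads))
print(header["w"]["shape"], sum(reads), len(blob))
```

For a real file, `fetch_header(http_range_reader(url))` would issue two small requests. A server that ignores the `Range` header would return the whole file instead, so a production version should check for a 206 response.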

Requirements

  • Python 3.7+
  • requests
  • prettytable
  • tqdm

License

MIT License
