llm-mem is a lightweight Python CLI tool designed to calculate the GPU VRAM requirements for models on Hugging Face
llm-mem
llm-mem is a lightweight Python CLI tool designed to calculate the GPU VRAM requirements for models on Hugging Face. It estimates the memory usage based on the model parameters, selected data type, and desired context length. This tool is ideal for developers and researchers looking to optimize model deployment and resource allocation.
Features
- GPU VRAM Estimation: Calculate the VRAM needed for both the model and its key-value cache based on the provided context size.
- Support for Multiple Data Types: Choose from various data types (int4, int8, float8, float16, float32).
- Hugging Face Integration: Automatically fetch model configuration and metadata from Hugging Face repositories.
- Command-Line Interface: Simple and intuitive CLI for quick usage without any additional UI dependencies.
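The estimation described above can be sketched in a few lines. This is a minimal illustration, not the package's actual implementation: the function names, the bytes-per-element table, and the assumption that the key-value cache uses the same data type as the weights are all hypothetical. It also ignores activations, framework overhead, and architecture-specific details such as sliding-window attention.

```python
# Hypothetical sketch of a VRAM estimate; not taken from the llm-mem source.

# Bytes per stored element for each data type the CLI lists.
# int4 packs two values into one byte, hence 0.5.
BYTES_PER_ELEMENT = {
    "int4": 0.5,
    "int8": 1.0,
    "float8": 1.0,
    "float16": 2.0,
    "float32": 4.0,
}

GIB = 1024 ** 3  # bytes in one gibibyte


def model_vram_bytes(num_params: int, data_type: str) -> float:
    """Memory for the weights: one element per parameter."""
    return num_params * BYTES_PER_ELEMENT[data_type]


def kv_cache_vram_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    context_size: int,
    data_type: str,
) -> float:
    """Memory for the key-value cache of a single sequence.

    The factor 2 accounts for one key tensor and one value tensor per layer.
    Assumes the cache is stored in the same data type as the weights.
    """
    elements = 2 * num_layers * num_kv_heads * head_dim * context_size
    return elements * BYTES_PER_ELEMENT[data_type]


if __name__ == "__main__":
    # Made-up architecture values for illustration only.
    weights = model_vram_bytes(7_000_000_000, "float16")
    cache = kv_cache_vram_bytes(32, 8, 128, 8192, "float16")
    print(f"Model VRAM:   {weights / GIB:.2f} GiB")
    print(f"Context VRAM: {cache / GIB:.2f} GiB")
    print(f"Total VRAM:   {(weights + cache) / GIB:.2f} GiB")
```

In practice, values such as the layer count and the number of key-value heads would come from the model's configuration on Hugging Face; the field names vary between architectures, so they are passed in explicitly here.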
Usage
You can use the llm-mem command from your terminal. The CLI accepts the model ID, data type, and context size as arguments.
Command-Line Arguments
- -m or --model: (Required) The Hugging Face model ID.
- -d or --data-type: The data type for the model (choices: int4, int8, float8, float16, float32). Default is float16.
- -c or --context-size: The context size for the model. Default is 8192.
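For reference, the arguments listed above map naturally onto Python's standard `argparse` module. The following is a sketch of what such a parser could look like, built only from the flags, choices, and defaults documented here; the `build_parser` name and the help strings are assumptions, and the real `cli.py` may be organized differently.

```python
import argparse

# Data types documented for the -d/--data-type option.
DATA_TYPES = ["int4", "int8", "float8", "float16", "float32"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented CLI arguments."""
    parser = argparse.ArgumentParser(
        prog="llm-mem",
        description="Estimate GPU VRAM requirements for a Hugging Face model.",
    )
    parser.add_argument(
        "-m", "--model", required=True, help="Hugging Face model ID"
    )
    parser.add_argument(
        "-d",
        "--data-type",
        choices=DATA_TYPES,
        default="float16",
        help="Data type for the model (default: float16)",
    )
    parser.add_argument(
        "-c",
        "--context-size",
        type=int,
        default=8192,
        help="Context size in tokens (default: 8192)",
    )
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list instead of sys.argv for demonstration.
    args = build_parser().parse_args(["-m", "google/gemma-3-27b-it"])
    print(args.model, args.data_type, args.context_size)
```

Because `--data-type` uses `choices`, an unsupported value is rejected by the parser with a usage error before any estimation runs.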
Example
uvx llm-mem -m google/gemma-3-27b-it -d float16 -c 128000
The above command will display an estimation of the model VRAM, context VRAM, and the total VRAM required.
Installation
You can install the package from PyPI using uv, pip, or your preferred Python package installer, or install it from a local clone of the repository.
# Using uv's pip interface (installs from PyPI)
uv venv
source .venv/bin/activate
uv pip install llm-mem
# Or clone the repository and install locally
git clone https://github.com/mathiasesn/llm-mem.git
cd llm-mem
pip install .
Directory Structure
├── README.md
├── llm_memory
│ ├── __init__.py
│ ├── cli.py
│ └── llm_memory_calculator.py
└── pyproject.toml
Development
If you wish to contribute or modify the package:
Clone the repository
git clone https://github.com/mathiasesn/llm-mem.git
cd llm-mem
Install the package in editable mode
uv venv
uv pip install -e .
Run tests
uv run pytest
License
This project is licensed under the terms specified in the LICENSE file.