
Efficient caching of Hugging Face models using PyTorch serialisation

Project description

HF Torch Cache

License: MIT Python 3.10+

Efficient caching layer for Hugging Face models using PyTorch serialisation. Accelerate model initialisation while reducing disk redundancy by converting native Hugging Face checkpoints to optimised PyTorch format.

Features

  • 🚀 Faster Initialisation: Skip Hugging Face's config reloading on subsequent loads
  • 💾 Disk Efficiency: Eliminate duplicate storage of model artifacts
  • 🔍 Auto Model Detection: Dynamically selects appropriate model class from config
  • 🧹 Cache Management: Optional cleanup of original Hugging Face cache artifacts
  • 🔒 Safety Controls: Configurable weights-only loading for untrusted sources

Installation

pip install hftorchcache

Usage

Simply pass a model name (the Hugging Face repo ID) and the model will be loaded and saved as a torch .pt file in ~/.cache/hftc/.

from hftc import HFTorchCache

MODEL_NAME = "unsloth/DeepSeek-R1-Distill-Qwen-7B-bnb-4bit"

# Initialise cache manager
cache = HFTorchCache()

# Load model with automatic class detection
model, tokenizer = cache.load(MODEL_NAME)

print(model.device)  # e.g. "cuda:0" if a GPU is available

If the model has already been cached, it loads straight from the serialised .pt file.

There are also options to:

  • Delete the original Hugging Face model cache directory (to avoid storing duplicates)
  • Load only from the local Hugging Face cache, with no Hub fallback
  • Specify a value to pass as the device (by default, loading uses the GPU if one is available)
  • Load weights only (though this defeats the purpose of this approach, which is to deserialise the entire object quickly, as with pickle)
  • Specify a particular model class by name or by the type itself (by default the class is detected from the config)
import torch

cache = HFTorchCache(
    cache_dir="/custom/cache/path",  # Default: ~/.cache/hftc
    cleanup_original=True            # Auto-delete original HF cache
)

# Load with explicit device placement and safety controls
model, tokenizer = cache.load(
    "unsloth/DeepSeek-R1-Distill-Qwen-7B-bnb-4bit",
    model_cls="AutoModelForCausalLM",    # Explicit class specification
    tokenizer_cls="AutoTokenizer",
    map_location=torch.device("cuda:0"),
    weights_only=False,                  # Set True for untrusted sources
    local_only=True                      # Prevent HF Hub fallback
    # **model_kwargs                     # Would be passed to `from_pretrained`
)

Note that you may need additional packages (e.g. bitsandbytes) to load cached models. accelerate is a dependency of this package, and low_cpu_mem_usage=True is always passed to from_pretrained.
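The "auto" class detection described above can be pictured as a lookup from a config's architectures field to a registered class. The sketch below is illustrative only — resolve_model_cls and the registry are hypothetical names, not the package's actual API, which resolves transformers classes:

```python
# Hypothetical sketch of "auto" model class detection: pick the first
# architecture name from the config that a class registry recognises.
def resolve_model_cls(architectures, registry):
    """Return the registered class for the first known architecture name."""
    for name in architectures:
        if name in registry:
            return registry[name]
    raise ValueError(f"no registered class among {architectures!r}")
```

For example, a config whose architectures list contains "Qwen2ForCausalLM" would resolve to whatever class is registered under that name.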

Cleanup utilities

You can also use the internal _cleanup_hf_cache method to delete the entire cache directory of a model you're done with, without trying to load it (as long as Hugging Face can find a local snapshot).

cache._cleanup_hf_cache("Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8")

API Reference

HFTorchCache

Parameter         Type  Default        Description
cache_dir         str   ~/.cache/hftc  Custom cache directory
cleanup_original  bool  True           Remove original HF cache after conversion

load()

Parameter      Type        Default   Description
model_name     str         Required  HF model identifier
model_cls      str/type    "auto"    Model class specification
tokenizer_cls  str/type    "auto"    Tokenizer class specification
map_location   str/device  None      Torch device placement
weights_only   bool        False     Weights-only (safe) loading; enable for untrusted sources
local_only     bool        False     Disable HF Hub fallback

Implementation Notes

  1. First-Run Behavior: Initial load converts HF checkpoint to optimized PyTorch format
  2. Subsequent Loads: Directly loads serialised PyTorch artifacts (3-5x faster)
  3. Device Management: Specify map_location to control device placement
  4. Security: Use weights_only=True when loading untrusted models

License

MIT License - See LICENSE for details


Note: This project is not affiliated with Hugging Face. Use with caution in production environments. Always verify model sources when using weights_only=False.

Download files

Download the file for your platform.

Source Distribution

hftorchcache-0.0.2.tar.gz (5.9 kB)

Uploaded Source

Built Distribution


hftorchcache-0.0.2-py3-none-any.whl (6.9 kB)

Uploaded Python 3

File details

Details for the file hftorchcache-0.0.2.tar.gz.

File metadata

  • Download URL: hftorchcache-0.0.2.tar.gz
  • Upload date:
  • Size: 5.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.22.3 CPython/3.10.16 Linux/6.8.0-51-generic

File hashes

Hashes for hftorchcache-0.0.2.tar.gz
Algorithm Hash digest
SHA256 63630060fd1856c2bc8e98982ed1f5cc7c0e09af1f8392adcee496d9d286c032
MD5 0dc56dbd985d9c4b911c0f1bd6c3a613
BLAKE2b-256 98dd6c6e7d3a59c10f3f1c546750bc939447528de4b4c4599af63d97f995dc39


File details

Details for the file hftorchcache-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: hftorchcache-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 6.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.22.3 CPython/3.10.16 Linux/6.8.0-51-generic

File hashes

Hashes for hftorchcache-0.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 539f9a20aad138014b45f39a1cb0ec48e98a5e713fabae2bf4a0b6b405aa3f7f
MD5 73c3f36e128f3b9ce3ce2c612c5fc3db
BLAKE2b-256 4ba47eeea1a06885936294cb9d2f4d7ff8a4bd5529d81dece0282e5df0afa3d3

