HF Torch Cache
Efficient caching layer for Hugging Face models using PyTorch serialisation. Accelerate model initialisation while reducing disk redundancy by converting native Hugging Face checkpoints to optimised PyTorch format.
Features
- 🚀 Faster Initialisation: Skip Hugging Face's config reloading on subsequent loads
- 💾 Disk Efficiency: Eliminate duplicate storage of model artifacts
- 🔍 Auto Model Detection: Dynamically selects appropriate model class from config
- 🧹 Cache Management: Optional cleanup of original Hugging Face cache artifacts
- 🔒 Safety Controls: Configurable weights-only loading for untrusted sources
Installation
pip install hftorchcache
Usage
Basic Example
from hftc import HFTorchCache
# Initialise cache manager
cache = HFTorchCache()
# Load model with automatic class detection
model, tokenizer = cache.load(
    "unsloth/DeepSeek-R1-Distill-Qwen-7B-bnb-4bit",
    map_location="cuda",
)
Advanced Usage
import torch

cache = HFTorchCache(
    cache_dir="/custom/cache/path",  # Default: ~/.cache/hftc
    cleanup_original=True,           # Auto-delete original HF cache
)

# Load with explicit device placement and safety controls
model, tokenizer = cache.load(
    "unsloth/DeepSeek-R1-Distill-Qwen-7B-bnb-4bit",
    model_cls="AutoModelForCausalLM",  # Explicit class specification
    tokenizer_cls="AutoTokenizer",
    map_location=torch.device("cuda:0"),
    weights_only=False,  # Set to True for untrusted sources
    local_only=True      # Prevent HF Hub fallback
    # **model_kwargs     # Would be passed to `from_pretrained`
)
Note that you may need additional packages (e.g. bitsandbytes) to load cached models.
Cleanup utilities
You can also use the internal `_cleanup_hf_cache` method to delete a model's entire cache directory without loading it, as long as Hugging Face can locate a snapshot:
cache._cleanup_hf_cache("Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8")
API Reference
HFTorchCache
| Parameter | Type | Default | Description |
|---|---|---|---|
| `cache_dir` | `str` | `~/.cache/hftc` | Custom cache directory |
| `cleanup_original` | `bool` | `True` | Remove original HF cache after conversion |
load()
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model_name` | `str` | Required | HF model identifier |
| `model_cls` | `str`/`type` | `"auto"` | Model class specification |
| `tokenizer_cls` | `str`/`type` | `"auto"` | Tokenizer class specification |
| `map_location` | `str`/`device` | `None` | Torch device placement |
| `weights_only` | `bool` | `False` | Safe loading for untrusted sources |
| `local_only` | `bool` | `False` | Disable HF Hub fallback |
Implementation Notes
- First-Run Behavior: Initial load converts the HF checkpoint to optimised PyTorch format
- Subsequent Loads: Directly load serialised PyTorch artifacts (3-5x faster)
- Device Management: Specify `map_location` to control device placement
- Security: Use `weights_only=True` when loading untrusted models
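The `weights_only` option presumably maps onto the same-named argument of `torch.load`, which restricts deserialisation to tensors and primitive containers instead of arbitrary pickled Python objects. A standalone illustration of that PyTorch mechanism:

```python
import tempfile

import torch

# A plain state_dict (tensors only) is exactly what weights-only
# loading is designed to accept.
state = {"layer.weight": torch.zeros(2, 3)}

with tempfile.NamedTemporaryFile(suffix=".pt") as f:
    torch.save(state, f.name)
    # weights_only=True refuses arbitrary pickled objects, which could
    # otherwise execute code during deserialisation.
    loaded = torch.load(f.name, weights_only=True)
```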
License
MIT License - See LICENSE for details
Note: This project is not affiliated with Hugging Face. Use with caution in production environments. Always verify model sources when using weights_only=False.
File details
Details for the file hftorchcache-0.0.1.tar.gz.
File metadata
- Download URL: hftorchcache-0.0.1.tar.gz
- Upload date:
- Size: 8.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: pdm/2.22.3 CPython/3.10.16 Linux/6.8.0-51-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `2d840e22485d7e62d9eeaed23d92721b08eeeea50cdaa3611c0a78ed9910c12a` |
| MD5 | `2adc46f30497e1681d7bffae15901b67` |
| BLAKE2b-256 | `7db5b0e8f1f7cb80033b749abedc7317910f1b17d0a59b24d019e183afde68f5` |
File details
Details for the file hftorchcache-0.0.1-py3-none-any.whl.
File metadata
- Download URL: hftorchcache-0.0.1-py3-none-any.whl
- Upload date:
- Size: 8.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: pdm/2.22.3 CPython/3.10.16 Linux/6.8.0-51-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `05fdeae2064925e7c2d46a1ece414678e56c25e8acaea585d4dafbd33ef4b627` |
| MD5 | `576a5b12e8a3ba999643a22c40b87f52` |
| BLAKE2b-256 | `77dd70c93536f2b840cf5a6c44fadcee1b8374e416adfc5a6b04b9e0486e3580` |