
HF Torch Cache

License: MIT · Python 3.10+

Efficient caching layer for Hugging Face models using PyTorch serialisation. Accelerate model initialisation while reducing disk redundancy by converting native Hugging Face checkpoints to optimised PyTorch format.

Features

  • 🚀 Faster Initialisation: Skip Hugging Face's config reloading on subsequent loads
  • 💾 Disk Efficiency: Eliminate duplicate storage of model artifacts
  • 🔍 Auto Model Detection: Dynamically selects appropriate model class from config
  • 🧹 Cache Management: Optional cleanup of original Hugging Face cache artifacts
  • 🔒 Safety Controls: Configurable weights-only loading for untrusted sources

Installation

pip install hftorchcache

Usage

Basic Example

from hftc import HFTorchCache

# Initialise cache manager
cache = HFTorchCache()

# Load model with automatic class detection
model, tokenizer = cache.load(
    "unsloth/DeepSeek-R1-Distill-Qwen-7B-bnb-4bit",
    map_location="cuda"
)

Advanced Usage

import torch

cache = HFTorchCache(
    cache_dir="/custom/cache/path",  # Default: ~/.cache/hftc
    cleanup_original=True            # Auto-delete original HF cache
)

# Load with explicit device placement and safety controls
model, tokenizer = cache.load(
    "unsloth/DeepSeek-R1-Distill-Qwen-7B-bnb-4bit",
    model_cls="AutoModelForCausalLM",    # Explicit class specification
    tokenizer_cls="AutoTokenizer",
    map_location=torch.device("cuda:0"),
    weights_only=False,                  # Set True for untrusted sources
    local_only=True,                     # Prevent HF Hub fallback
    # **model_kwargs                     # Would be passed to `from_pretrained`
)

Note that you may need additional packages (e.g. bitsandbytes) to load cached models.

Cleanup utilities

You can also call the internal _cleanup_hf_cache method to delete a model's entire Hugging Face cache directory without loading it first (as long as Hugging Face can locate a snapshot for it).

cache._cleanup_hf_cache("Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8")
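To see how much disk space a cleanup would reclaim, you can measure the Hugging Face hub cache yourself. This is a generic sketch, not part of this library's API; ~/.cache/huggingface/hub is the hub's default cache location unless HF_HOME or HF_HUB_CACHE overrides it:

```python
from pathlib import Path

def dir_size_bytes(root: Path) -> int:
    """Total size of all regular files under root (0 if root doesn't exist)."""
    if not root.exists():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())

# Default Hugging Face hub cache location (assumption: no HF_HOME override)
hub_cache = Path.home() / ".cache" / "huggingface" / "hub"
print(f"HF hub cache: {dir_size_bytes(hub_cache) / 1e9:.2f} GB")
```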

API Reference

HFTorchCache

Parameter          Type    Default          Description
cache_dir          str     ~/.cache/hftc    Custom cache directory
cleanup_original   bool    True             Remove original HF cache after conversion

load()

Parameter       Type           Default    Description
model_name      str            required   HF model identifier
model_cls       str | type     "auto"     Model class specification
tokenizer_cls   str | type     "auto"     Tokenizer class specification
map_location    str | device   None       Torch device placement
weights_only    bool           False      If True, restrict deserialisation to weights only (for untrusted sources)
local_only      bool           False      Disable HF Hub fallback
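The weights_only flag matters because plain pickle-based deserialisation can execute arbitrary code embedded in a checkpoint; torch.load's weights_only=True exists to block exactly that. A minimal pure-Python illustration of the risk (a benign stand-in, not this library's internals):

```python
import pickle

class Payload:
    def __reduce__(self):
        # On unpickling, this tuple tells pickle to call print(...) —
        # a harmless stand-in for arbitrary code hidden in a malicious file.
        return (print, ("arbitrary code ran during deserialisation",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # the embedded call executes here
```

This is why the README recommends weights_only=True for any model you did not serialise yourself.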

Implementation Notes

  1. First-Run Behaviour: The initial load converts the HF checkpoint to an optimised PyTorch format
  2. Subsequent Loads: Directly load the serialised PyTorch artifacts (3-5x faster)
  3. Device Management: Specify map_location to control device placement
  4. Security: Use weights_only=True when loading untrusted models
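The convert-on-first-use pattern described above can be sketched in a few lines. This is an illustration of the caching shape, not the library's actual implementation: the real library serialises tensors with PyTorch, while this stand-in uses pickle so it runs anywhere; cached_load and build_fn are hypothetical names:

```python
import os
import pickle
import tempfile

def cached_load(model_name, cache_dir, build_fn):
    """Return the object for model_name, converting it on first use.

    First run: call build_fn (standing in for the slow Hugging Face
    from_pretrained path) and serialise the result to cache_dir.
    Subsequent runs: deserialise the cached artifact directly.
    """
    path = os.path.join(cache_dir, model_name.replace("/", "--") + ".pkl")
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f), True   # cache hit: fast path
    obj = build_fn()                       # cache miss: slow build
    with open(path, "wb") as f:
        pickle.dump(obj, f)
    return obj, False

cache_dir = tempfile.mkdtemp()
model, hit = cached_load("org/model", cache_dir, lambda: {"weights": [1, 2, 3]})
model2, hit2 = cached_load("org/model", cache_dir, lambda: {"weights": [1, 2, 3]})
```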

License

MIT License - See LICENSE for details


Note: This project is not affiliated with Hugging Face. Use with caution in production environments. Always verify model sources when using weights_only=False.

Download files

Download the file for your platform.

Source Distribution

hftorchcache-0.0.1.tar.gz (8.1 kB)


Built Distribution


hftorchcache-0.0.1-py3-none-any.whl (8.9 kB)


File details

Details for the file hftorchcache-0.0.1.tar.gz.

File metadata

  • Download URL: hftorchcache-0.0.1.tar.gz
  • Size: 8.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.22.3 CPython/3.10.16 Linux/6.8.0-51-generic

File hashes

Hashes for hftorchcache-0.0.1.tar.gz
Algorithm Hash digest
SHA256 2d840e22485d7e62d9eeaed23d92721b08eeeea50cdaa3611c0a78ed9910c12a
MD5 2adc46f30497e1681d7bffae15901b67
BLAKE2b-256 7db5b0e8f1f7cb80033b749abedc7317910f1b17d0a59b24d019e183afde68f5


File details

Details for the file hftorchcache-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: hftorchcache-0.0.1-py3-none-any.whl
  • Size: 8.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: pdm/2.22.3 CPython/3.10.16 Linux/6.8.0-51-generic

File hashes

Hashes for hftorchcache-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 05fdeae2064925e7c2d46a1ece414678e56c25e8acaea585d4dafbd33ef4b627
MD5 576a5b12e8a3ba999643a22c40b87f52
BLAKE2b-256 77dd70c93536f2b840cf5a6c44fadcee1b8374e416adfc5a6b04b9e0486e3580

