cuFile-powered streaming patchers for large model loading
cufile-patcher
cuFile-first patching toolkit for large model loading using:
- uv (package manager)
- Ruff (linting)
- pytest (testing)
- GitHub Actions (CI + PyPI publish)
- plugin-based backend architecture
- dedicated stream patchers for:
  - PyTorch
  - TensorFlow
  - safetensors
Project structure
```
.
├── .github/workflows/
│   ├── ci.yml
│   ├── coverage-pages.yml
│   └── publish.yml
├── .vscode/
│   ├── extensions.json
│   └── settings.json
├── src/cufile_patcher/
│   ├── __init__.py
│   ├── auto_patch.py
│   ├── bindings.py
│   ├── cufile.py
│   ├── cufile_types.py
│   ├── core.py
│   ├── registry.py
│   ├── safetensor_patcher.py
│   ├── service.py
│   ├── tensorflow_patcher.py
│   ├── torch_patcher.py
│   └── plugins/
│       ├── __init__.py
│       ├── base.py
│       └── system.py
├── tests/
│   ├── test_auto_patch.py
│   ├── test_backend_core.py
│   ├── test_core.py
│   ├── test_safetensor_patcher.py
│   ├── test_tensorflow_patcher.py
│   └── test_torch_patcher.py
├── AGENTS.md
├── README.md
└── pyproject.toml
```
Quick start
Install dependencies:

```shell
uv sync --all-groups
```

Install package variants:

```shell
pip install cufile-patcher
pip install "cufile-patcher[all]"
pip install "cufile-patcher[tf]"
pip install "cufile-patcher[tensorflow]"
pip install "cufile-patcher[torch]"
pip install "cufile-patcher[pytorch]"
```

Run lint:

```shell
uv run ruff check .
```

Run tests:

```shell
uv run pytest
```
Package function
```python
from cufile_patcher import hello_world

print(hello_world())
```

Expected output:

```
Hello, world!
```
cuFile API (ported)
The package includes a modernized port of the cuFile wrapper features:

- `CuFileDriver`: singleton driver lifecycle
- `CuFile`: class with mode mapping, open/close, context manager, read/write
- low-level binding helpers: `cuFileDriverOpen`, `cuFileDriverClose`, `cuFileHandleRegister`, `cuFileHandleDeregister`, `cuFileRead`, `cuFileWrite`
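The singleton-driver and context-manager lifecycle can be sketched in plain Python. Everything below is a hypothetical stand-in to show the shape only: `MockDriver` and `MockFile` are not part of the package, and the real `CuFileDriver`/`CuFile` classes call into the cuFile bindings.

```python
# Hypothetical stand-ins sketching the lifecycle shape: a singleton driver
# opened once per process, and a file handle registered on enter and
# deregistered on exit. Not the package's actual classes.

class MockDriver:
    """Stand-in for CuFileDriver: one shared instance per process."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.open_count = 0
        return cls._instance

    def open(self):
        self.open_count += 1


class MockFile:
    """Stand-in for CuFile: usable as a context manager."""

    def __init__(self, path, mode="r"):
        self.path, self.mode = path, mode
        self.registered = False

    def __enter__(self):
        MockDriver().open()      # ensure the shared driver is open
        self.registered = True   # register the file handle
        return self

    def __exit__(self, *exc):
        self.registered = False  # deregister on exit
        return False


assert MockDriver() is MockDriver()  # singleton: same object every time
with MockFile("/path/to/model.bin") as f:
    assert f.registered
assert not f.registered
```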
Plugin architecture
The backend is plugin-based, with OOP boundaries:

- `CuFileBackend` interface in `plugins/base.py`
- `SystemCuFileBackend` implementation in `plugins/system.py`
- `BackendRegistry` and default backend switching in `registry.py`
You can register a custom backend for mocks, testing, or alternate transports.
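A minimal sketch of that registry pattern, e.g. for a mock backend in tests. The names here (`Backend`, `InMemoryBackend`, `Registry`) are illustrative only; the package's actual classes are `CuFileBackend` and `BackendRegistry`.

```python
# Illustrative registry-pattern sketch, not the package's actual API.
from abc import ABC, abstractmethod


class Backend(ABC):
    """Stand-in for the CuFileBackend interface."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...


class InMemoryBackend(Backend):
    """A mock backend useful for tests: serves bytes from a dict."""

    def __init__(self, files):
        self.files = files

    def read(self, path: str) -> bytes:
        return self.files[path]


class Registry:
    """Stand-in for BackendRegistry: named backends plus a default."""

    def __init__(self):
        self._backends = {}
        self._default = None

    def register(self, name: str, backend: Backend, default: bool = False):
        self._backends[name] = backend
        if default or self._default is None:
            self._default = name

    def get(self, name=None) -> Backend:
        return self._backends[name or self._default]


registry = Registry()
registry.register("mock", InMemoryBackend({"model.bin": b"\x00" * 4}))
print(registry.get().read("model.bin"))  # b'\x00\x00\x00\x00'
```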
Framework patchers
The project provides dedicated patchers for streaming large model files:
- `patch_torch_load(...)` for `torch.load`
- `patch_tensorflow_load_model(...)` for `tf.keras.models.load_model`
- `patch_safetensor_load_file(...)` for `safetensors.torch.load_file`

All patchers support:

- configurable file-size threshold (`min_file_size_mb`)
- configurable stream chunking (`chunk_size_mb`)
- optional cuFile stream reader (`use_cufile=True`)
- fallback to the original framework loader if streaming fails
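The control flow those options describe can be sketched in pure Python. This is an assumption-laden illustration, not the library's implementation: files under the threshold go through the original loader, larger files are read in fixed-size chunks, and a streaming failure falls back to the original loader.

```python
# Pure-Python sketch of the threshold / chunking / fallback decision.
import io
import os
import tempfile


def streaming_load(path, original_loader, min_file_size_mb=64,
                   chunk_size_mb=16, fallback_to_original=True):
    threshold = min_file_size_mb * 1024 * 1024
    if os.path.getsize(path) < threshold:
        return original_loader(path)           # small file: direct path
    try:
        chunk = max(1, chunk_size_mb) * 1024 * 1024
        buf = io.BytesIO()
        with open(path, "rb") as f:
            while data := f.read(chunk):
                buf.write(data)                # stream in fixed-size chunks
        return buf.getvalue()
    except OSError:
        if fallback_to_original:
            return original_loader(path)       # streaming failed: fall back
        raise


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 2048)

direct = streaming_load(tmp.name, original_loader=lambda p: "direct")
streamed = streaming_load(tmp.name, original_loader=lambda p: "direct",
                          min_file_size_mb=0)
print(direct, len(streamed))  # direct 2048
```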
Context manager auto patching
Use a single context manager to install and remove framework patchers automatically:
```python
from cufile_patcher import auto_patch

with auto_patch():
    # existing framework load calls can remain unchanged
    ...
```
Recommended usage
```python
from cufile_patcher import auto_patch

# Recommended default for most projects.
with auto_patch(min_file_size_mb=100, chunk_size_mb=64):
    ...
```
This keeps migrations small because your existing `torch.load`, `tf.keras.models.load_model`, and `safetensors.torch.load_file` calls can stay as-is.
Selection and strict mode
```python
from cufile_patcher import auto_patch

# Patch only the selected frameworks.
with auto_patch(torch=True, tensorflow=False, safetensors=True):
    ...

# Raise if a selected or auto-detected framework is unavailable.
with auto_patch(strict=True):
    ...
```
Parameters
| Parameter | Default | Meaning |
|---|---|---|
| `torch` | `None` | `None` auto-detects, `True` requires torch, `False` disables torch patching |
| `tensorflow` | `None` | `None` auto-detects, `True` requires tensorflow, `False` disables tensorflow patching |
| `safetensors` | `None` | `None` auto-detects, `True` requires safetensors, `False` disables safetensors patching |
| `strict` | `False` | Raise if a required framework is missing |
| `min_file_size_mb` | `64` | Minimum file size (MB) to switch from the direct load to the streaming path |
| `chunk_size_mb` | `16` | Streaming chunk size (MB) |
| `use_cufile` | `False` | Use the cuFile reader instead of the pure-Python reader |
| `fallback_to_original` | `True` | If streaming fails, fall back to the original framework loader |
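The `None`/`True`/`False` semantics in the table can be expressed as a small helper. This `resolve()` is a sketch of the rule, not the package's actual code, using `importlib.util.find_spec` for the auto-detection step.

```python
# Sketch of the None / True / False selection rule from the table above.
import importlib.util


def resolve(flag, module_name):
    """Return whether to patch `module_name` under the table's rules."""
    available = importlib.util.find_spec(module_name) is not None
    if flag is False:
        return False                      # explicitly disabled
    if flag is True:
        if not available:                 # explicit True requires the import
            raise ImportError(f"{module_name} is required but not installed")
        return True
    return available                      # None: auto-detect


print(resolve(None, "json"), resolve(None, "not_a_real_module"))  # True False
```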
Migration guidance
If your current code manually installs and uninstalls patchers, move the lifecycle into one block:

```python
from cufile_patcher import auto_patch

def load_models():
    with auto_patch():
        # old loading calls remain unchanged
        ...
```
Notes and caveats
- `with cufile-patcher:` is not valid Python syntax. Use `with auto_patch(...):` instead.
- `None` means auto-detect and patch the available frameworks.
- `strict=True` enforces availability checks for the selected or auto-detected set.
- An explicit `True` for a framework raises if that framework is not installed.
- Patches are process-global monkey patches while the context is active.
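"Process-global monkey patch" means the original attribute is swapped out on install and restored on uninstall. A minimal sketch of that lifecycle, with a hypothetical `Patcher` class standing in for the package's patcher objects:

```python
# Generic monkey-patch lifecycle: save the original attribute on install,
# restore it on uninstall. Illustrative only; not the package's actual class.
import types


class Patcher:
    def __init__(self, module, attr, replacement):
        self.module, self.attr, self.replacement = module, attr, replacement
        self._original = None

    def install(self):
        self._original = getattr(self.module, self.attr)
        setattr(self.module, self.attr, self.replacement)
        return self

    def uninstall(self):
        setattr(self.module, self.attr, self._original)


mod = types.SimpleNamespace(load=lambda p: f"direct:{p}")
p = Patcher(mod, "load", lambda p: f"streamed:{p}").install()
assert mod.load("a.pt") == "streamed:a.pt"  # patch active
p.uninstall()
assert mod.load("a.pt") == "direct:a.pt"    # original restored
```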
PyTorch example
```python
import torch

from cufile_patcher import patch_torch_load

patcher = patch_torch_load(torch, min_file_size_mb=100, chunk_size_mb=64, use_cufile=True)
try:
    model_state = torch.load("/path/to/model.pt", map_location="cpu")
finally:
    patcher.uninstall()
```
TensorFlow example
```python
import tensorflow as tf

from cufile_patcher import patch_tensorflow_load_model

patcher = patch_tensorflow_load_model(
    tf,
    min_file_size_mb=100,
    chunk_size_mb=64,
    use_cufile=True,
)
try:
    model = tf.keras.models.load_model("/path/to/model.keras")
finally:
    patcher.uninstall()
```
safetensors example
```python
import safetensors.torch as st

from cufile_patcher import patch_safetensor_load_file

patcher = patch_safetensor_load_file(
    st,
    min_file_size_mb=100,
    chunk_size_mb=64,
    use_cufile=True,
)
try:
    tensors = st.load_file("/path/to/model.safetensors")
finally:
    patcher.uninstall()
```
Publishing
The publish workflow at `.github/workflows/publish.yml` is configured to use PyPI trusted publishing.

To use it:

- Configure this GitHub repository as a trusted publisher in your PyPI project.
- Create and push a tag like `v0.1.0`.
- GitHub Actions will build and publish the package.
Resources
| Resource | URL |
|---|---|
| Documentation | https://maifeeulasad.github.io/cufile-patcher/ |
| Coverage | https://maifeeulasad.github.io/cufile-patcher/htmlcov/ |