cuFile-powered streaming patchers for large model loading

cufile-patcher

A cuFile-first patching toolkit for large model loading, built with:

  • uv (package manager)
  • Ruff (linting)
  • pytest (testing)
  • GitHub Actions (CI + PyPI publish)
  • plugin-based backend architecture
  • dedicated stream patchers
    • PyTorch
    • TensorFlow
    • safetensors

Project structure

.
├── .github/workflows/
│   ├── ci.yml
│   ├── coverage-pages.yml
│   └── publish.yml
├── .vscode/
│   ├── extensions.json
│   └── settings.json
├── src/cufile_patcher/
│   ├── __init__.py
│   ├── auto_patch.py
│   ├── bindings.py
│   ├── cufile.py
│   ├── cufile_types.py
│   ├── core.py
│   ├── registry.py
│   ├── safetensor_patcher.py
│   ├── service.py
│   ├── tensorflow_patcher.py
│   ├── torch_patcher.py
│   └── plugins/
│       ├── __init__.py
│       ├── base.py
│       └── system.py
├── tests/test_auto_patch.py
├── tests/test_backend_core.py
├── tests/test_core.py
├── tests/test_safetensor_patcher.py
├── tests/test_tensorflow_patcher.py
├── tests/test_torch_patcher.py
├── AGENTS.md
├── README.md
└── pyproject.toml

Quick start

Install dependencies:

uv sync --all-groups

Install package variants:

pip install cufile-patcher
pip install "cufile-patcher[all]"
pip install "cufile-patcher[tf]"
pip install "cufile-patcher[tensorflow]"
pip install "cufile-patcher[torch]"
pip install "cufile-patcher[pytorch]"

Run lint:

uv run ruff check .

Run tests:

uv run pytest

Package function

from cufile_patcher import hello_world

print(hello_world())

Expected output:

Hello, world!

cuFile API (ported)

The package includes a modernized port of cuFile wrapper features:

  • CuFileDriver singleton driver lifecycle
  • CuFile class with mode mapping, open/close, context manager, read/write
  • low-level binding helpers:
    • cuFileDriverOpen, cuFileDriverClose
    • cuFileHandleRegister, cuFileHandleDeregister
    • cuFileRead, cuFileWrite

Plugin architecture

The backend is plugin-based, with clean OOP boundaries:

  • CuFileBackend interface in plugins/base.py
  • SystemCuFileBackend implementation in plugins/system.py
  • BackendRegistry and default backend switching in registry.py

You can register a custom backend for mocks, testing, or alternate transports.
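
As a rough sketch of what such a registry pattern looks like: the names below (Backend, InMemoryBackend, Registry) are hypothetical stand-ins, not the package's actual API, which lives in plugins/base.py and registry.py.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Backend(ABC):
    """Minimal stand-in for the backend interface."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...


class InMemoryBackend(Backend):
    """A mock backend: serves bytes from a dict, no GPU or cuFile needed."""

    def __init__(self, files: dict[str, bytes]):
        self.files = files

    def read(self, path: str) -> bytes:
        return self.files[path]


class Registry:
    """Maps names to backend instances and tracks a switchable default."""

    def __init__(self):
        self._backends: dict[str, Backend] = {}
        self._default: str | None = None

    def register(self, name: str, backend: Backend, default: bool = False):
        self._backends[name] = backend
        if default or self._default is None:
            self._default = name

    def get(self, name: str | None = None) -> Backend:
        return self._backends[name if name is not None else self._default]
```

A test suite can register an in-memory backend as the default and exercise the patchers without any GPU hardware present.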

Framework patchers

The project provides dedicated patchers for streaming large model files:

  • patch_torch_load(...) for torch.load
  • patch_tensorflow_load_model(...) for tf.keras.models.load_model
  • patch_safetensor_load_file(...) for safetensors.torch.load_file

All three patchers support:

  • configurable file-size threshold (min_file_size_mb)
  • configurable stream chunking (chunk_size_mb)
  • optional cuFile stream reader (use_cufile=True)
  • fallback to original framework loader if streaming fails
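
The threshold, chunking, and fallback options can be illustrated with a pure-Python sketch; this mirrors the shape of the control flow, not the package's actual implementation:

```python
import os


def stream_read(path, min_file_size_mb=64, chunk_size_mb=16, fallback=None):
    """Read a file, streaming in chunks once it crosses the size threshold."""
    size = os.path.getsize(path)
    if size < min_file_size_mb * 1024 * 1024:
        # Small file: take the direct path (the original loader's behaviour).
        with open(path, "rb") as f:
            return f.read()
    try:
        chunk_size = chunk_size_mb * 1024 * 1024
        chunks = []
        with open(path, "rb") as f:
            # Stream the file chunk by chunk instead of one large read.
            while chunk := f.read(chunk_size):
                chunks.append(chunk)
        return b"".join(chunks)
    except OSError:
        # fallback_to_original: delegate to the original loader on failure.
        if fallback is not None:
            return fallback(path)
        raise
```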

Context manager auto patching

Use a single context manager to install and remove framework patchers automatically:

from cufile_patcher import auto_patch

with auto_patch():
	# existing framework load calls can remain unchanged
	...

Recommended usage

from cufile_patcher import auto_patch

# Recommended default for most projects.
with auto_patch(min_file_size_mb=100, chunk_size_mb=64):
	...

This keeps migrations small because your existing torch.load, tf.keras.models.load_model, and safetensors.torch.load_file calls can stay as-is.

Selection and strict mode

from cufile_patcher import auto_patch

with auto_patch(torch=True, tensorflow=False, safetensors=True):
	...

with auto_patch(strict=True):
	...

Parameters

| Parameter | Default | Meaning |
| --- | --- | --- |
| torch | None | None auto-detects, True requires torch, False disables torch patching |
| tensorflow | None | None auto-detects, True requires tensorflow, False disables tensorflow patching |
| safetensors | None | None auto-detects, True requires safetensors, False disables safetensors patching |
| strict | False | Raise if a required framework is missing |
| min_file_size_mb | 64 | Minimum file size to switch from direct load to the streaming path |
| chunk_size_mb | 16 | Streaming chunk size |
| use_cufile | False | Use the cuFile reader instead of the pure-Python reader |
| fallback_to_original | True | If streaming fails, fall back to the original framework loader |
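
The None/True/False semantics for the framework parameters can be sketched as follows; resolve_framework is a hypothetical helper, not part of the package's API:

```python
import importlib.util


def resolve_framework(requested, module_name):
    """requested: None = auto-detect, True = required, False = disabled."""
    available = importlib.util.find_spec(module_name) is not None
    if requested is False:
        # Explicitly disabled: never patch, even if installed.
        return False
    if requested is True and not available:
        # Explicitly required: raise if the framework is missing.
        raise ImportError(f"{module_name} was requested but is not installed")
    # None (auto-detect) patches only what is actually importable.
    return available
```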

Migration guidance

If your current code manually installs and uninstalls patchers, move the lifecycle to one block:

from cufile_patcher import auto_patch

def load_models():
	with auto_patch():
		# old loading calls remain unchanged
		...

Notes and caveats

  • `with cufile-patcher:` is not valid Python syntax; use `with auto_patch(...):` instead.
  • None means auto-detect and patch available frameworks.
  • strict=True enforces availability checks for the selected/auto-detected set.
  • Explicit True for a framework raises if that framework is not installed.
  • Patches are process-global monkey patches while the context is active.
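
The install-and-restore lifecycle behind these process-global patches can be sketched with a generic attribute-swapping context manager; this is illustrative, not the package's internals:

```python
from contextlib import contextmanager


@contextmanager
def patch_attr(module, name, replacement):
    """Swap module.<name> for replacement, restoring it when the block exits."""
    original = getattr(module, name)
    setattr(module, name, replacement)
    try:
        yield
    finally:
        # Restore even if the body raises, so the patch never leaks
        # outside the context.
        setattr(module, name, original)
```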

PyTorch example

import torch
from cufile_patcher import patch_torch_load

patcher = patch_torch_load(torch, min_file_size_mb=100, chunk_size_mb=64, use_cufile=True)
try:
	model_state = torch.load("/path/to/model.pt", map_location="cpu")
finally:
	patcher.uninstall()

TensorFlow example

import tensorflow as tf
from cufile_patcher import patch_tensorflow_load_model

patcher = patch_tensorflow_load_model(
	tf,
	min_file_size_mb=100,
	chunk_size_mb=64,
	use_cufile=True,
)
try:
	model = tf.keras.models.load_model("/path/to/model.keras")
finally:
	patcher.uninstall()

safetensors example

import safetensors.torch as st
from cufile_patcher import patch_safetensor_load_file

patcher = patch_safetensor_load_file(
	st,
	min_file_size_mb=100,
	chunk_size_mb=64,
	use_cufile=True,
)
try:
	tensors = st.load_file("/path/to/model.safetensors")
finally:
	patcher.uninstall()

Publishing

The publish workflow at .github/workflows/publish.yml is configured to use PyPI trusted publishing.

To use it:

  1. Configure this GitHub repository as a trusted publisher in your PyPI project.
  2. Create and push a tag like v0.1.0.
  3. GitHub Actions will build and publish the package.

Resources

| Resource | URL |
| --- | --- |
| Documentation | https://maifeeulasad.github.io/cufile-patcher/ |
| Coverage | https://maifeeulasad.github.io/cufile-patcher/htmlcov/ |
