
Distributed PyTorch file transfer for Baseten - Environment-aware, lock-free file transfer management


B10 Transfer

Accelerate cold starts by loading previous PyTorch compilation artifacts. This library enables caching of torch.compile() results across Baseten deployments, reducing compilation latencies by up to 5x.

Quick Start

For Standard Models (model.py)

import torch

from b10_transfer import load_compile_cache, save_compile_cache, OperationStatus

class Model:
    def load(self):
        # Load your model first
        self.model = YourModel().to("cuda")
        
        # Try to load existing compile cache
        cache_loaded = load_compile_cache()
        
        if cache_loaded == OperationStatus.ERROR:
            print("Cache load failed; running in eager mode and skipping torch.compile")
        else:
            # Compile your model
            self.model = torch.compile(self.model, mode="max-autotune-no-cudagraphs")
            
            # Warm up with representative inputs to trigger compilation
            self.model("dummy input")
            self.model("another dummy input")
            
            # Save cache if it was newly created
            if cache_loaded != OperationStatus.SUCCESS:
                save_compile_cache()

For vLLM Custom Servers

Add to your config.yaml:

requirements:
  - b10-transfer

start_command: "b10-compile-cache & vllm serve ..."

The b10-compile-cache CLI tool automatically handles cache loading and saving for vLLM deployments.

Requirements

Add to your config.yaml:

requirements:
  - b10-transfer

Note: Requires b10cache enabled in your Baseten environment.

API Reference

Core Functions

load_compile_cache() -> OperationStatus

Load previously saved compilation cache for the current model environment.

Returns:

  • OperationStatus.SUCCESS → Cache successfully loaded
  • OperationStatus.SKIPPED → Cache already exists locally
  • OperationStatus.ERROR → General errors (b10fs unavailable, validation failed)
  • OperationStatus.DOES_NOT_EXIST → No cache file found for this environment

save_compile_cache() -> OperationStatus

Save the current model's torch compilation cache for future deployments.

Returns:

  • OperationStatus.SUCCESS → Cache successfully saved
  • OperationStatus.SKIPPED → Cache already exists in shared directory
  • OperationStatus.ERROR → General errors (insufficient space, validation failed)

save_vllm_compile_cache() -> None

Specialized function for vLLM deployments that:

  1. Attempts to load existing cache first
  2. Waits for vLLM server readiness
  3. Automatically saves cache after compilation

Utility Functions

clear_local_cache() -> bool

Clear the local PyTorch compilation cache directory.

Returns:

  • True → Cache cleared successfully or didn't exist
  • False → Failed to clear cache
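For intuition, clearing a local cache directory with these semantics can be sketched using only the standard library. This is a hypothetical approximation, not the library's actual implementation, and the helper name is illustrative:

```python
import shutil
from pathlib import Path

def clear_cache_dir(cache_dir: str) -> bool:
    """Remove a cache directory tree; True if it is gone (or never existed)."""
    path = Path(cache_dir)
    if not path.exists():
        return True  # nothing to clear counts as success
    try:
        shutil.rmtree(path)
        return True
    except OSError:
        return False
```

Note the asymmetry matches the documented contract: a missing directory is not an error, only a failed removal is.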

get_cache_info() -> Dict[str, Any]

Get comprehensive information about current cache state.

Returns:

{
    "environment_key": str,           # Unique environment identifier
    "local_cache_exists": bool,       # Local torch cache status
    "b10fs_enabled": bool,           # B10FS availability
    "b10fs_cache_exists": bool,      # Remote cache status
    "local_cache_size_mb": float,    # Local cache size (if exists)
    "b10fs_cache_size_mb": float     # Remote cache size (if exists)
}

list_available_caches() -> Dict[str, Any]

List all available cache files with metadata.

Returns:

{
    "caches": [                      # List of cache files
        {
            "filename": str,         # Cache file name
            "environment_key": str,  # Environment identifier  
            "size_mb": float,        # File size in MB
            "is_current_environment": bool,  # Matches current env
            "created_time": float    # Creation timestamp
        }
    ],
    "current_environment": str,      # Current environment key
    "total_caches": int,            # Number of cache files
    "current_cache_exists": bool,   # Current env has cache
    "b10fs_enabled": bool          # B10FS availability
}

Constants

OperationStatus Enum

Status codes returned by cache operations:

  • OperationStatus.SUCCESS → Operation completed successfully
  • OperationStatus.ERROR → Operation failed due to error
  • OperationStatus.DOES_NOT_EXIST → Cache file not found (load operations only)
  • OperationStatus.SKIPPED → Operation not needed (cache already exists)
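For reference, the enum presumably resembles the following sketch (member names are taken from this page; the exact definition inside the package may differ). A common pattern is to treat SUCCESS and SKIPPED uniformly, since both mean a usable cache is present:

```python
from enum import Enum, auto

class OperationStatus(Enum):
    SUCCESS = auto()         # operation completed successfully
    ERROR = auto()           # operation failed due to error
    DOES_NOT_EXIST = auto()  # cache file not found (load operations only)
    SKIPPED = auto()         # operation not needed (cache already exists)

def cache_usable(status: OperationStatus) -> bool:
    """True when a compile cache is available, however it got there."""
    return status in (OperationStatus.SUCCESS, OperationStatus.SKIPPED)
```

The `cache_usable` helper is illustrative, not part of the library's API.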

Exceptions

CacheError

Base exception for cache operations.

CacheValidationError

Raised when path validation or security checks fail.

CacheOperationInterrupted

Raised when operations are stopped due to insufficient disk space.
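Since CacheError is the base class, callers can catch the specific cases they care about and fall back to the base for everything else. The hierarchy below is a sketch inferred from this page, and the `handle` wrapper is purely illustrative:

```python
class CacheError(Exception):
    """Base exception for cache operations."""

class CacheValidationError(CacheError):
    """Path validation or security checks failed."""

class CacheOperationInterrupted(CacheError):
    """Operation stopped due to insufficient disk space."""

def handle(op):
    """Run a cache operation, mapping exceptions to a status string."""
    try:
        return op()
    except CacheOperationInterrupted:
        return "retry later: low disk space"
    except CacheError as exc:
        return f"cache failure: {exc}"
```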

Configuration

The library automatically configures itself, but you can override defaults:

# Cache directories
export TORCHINDUCTOR_CACHE_DIR="/tmp/torchinductor_$(whoami)"
export B10FS_CACHE_DIR="/cache/model/compile_cache"
export LOCAL_WORK_DIR="/app"

# Cache limits
export MAX_CACHE_SIZE_MB="1024"        # 1GB max archive size
export MAX_CONCURRENT_SAVES="50"       # Concurrent save operations

# Required for functionality
export BASETEN_FS_ENABLED="1"
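A sketch of how such environment overrides are typically resolved in Python, with the defaults shown above. The variable names come from this page; the `read_config` helper itself is hypothetical, not the library's internal loader:

```python
import os

def read_config() -> dict:
    """Resolve cache settings from the environment, falling back to defaults."""
    user = os.getenv("USER", "app")
    return {
        "torch_cache_dir": os.getenv(
            "TORCHINDUCTOR_CACHE_DIR", f"/tmp/torchinductor_{user}"
        ),
        "b10fs_cache_dir": os.getenv("B10FS_CACHE_DIR", "/cache/model/compile_cache"),
        "work_dir": os.getenv("LOCAL_WORK_DIR", "/app"),
        "max_cache_size_mb": int(os.getenv("MAX_CACHE_SIZE_MB", "1024")),
        "max_concurrent_saves": int(os.getenv("MAX_CONCURRENT_SAVES", "50")),
        "b10fs_enabled": os.getenv("BASETEN_FS_ENABLED", "0") == "1",
    }
```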

How It Works

Environment-Specific Caching

Caches are automatically keyed by GPU environment to ensure compatibility:

  • GPU Device Name: e.g., "NVIDIA GeForce RTX 4090"
  • CUDA Version: e.g., "12.1"

This keying ensures cached artifacts are only reused on hardware configurations where they are valid.
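One plausible way to derive such a key from the two properties above (the real keying scheme is internal to the library; the function and format here are illustrative):

```python
import hashlib

def environment_key(gpu_name: str, cuda_version: str) -> str:
    """Derive a stable, filesystem-safe key from GPU model and CUDA version."""
    digest = hashlib.sha256(f"{gpu_name}|{cuda_version}".encode()).hexdigest()[:12]
    slug = gpu_name.lower().replace(" ", "_")
    return f"{slug}_{cuda_version}_{digest}"
```

The same (GPU, CUDA) pair always yields the same key, so replicas on identical hardware share one cache file, while any difference in either property produces a distinct key.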

Atomic Operations

  1. Load: B10FS → local temp → extract to torch cache directory
  2. Save: Compress torch cache → B10FS temp → atomic rename
  3. Space Monitoring: Operations are interrupted if disk space becomes insufficient
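The compress-then-rename pattern in the save path can be sketched with the standard library. Readers never observe a half-written archive, because the file only appears at its final path once complete. This is an illustrative approximation; the function name and archive format are assumptions:

```python
import os
import shutil
import tempfile

def save_archive_atomically(torch_cache_dir: str, dest_path: str) -> None:
    """Compress a cache directory and publish it via an atomic rename."""
    # Create the temp file in the destination directory so the final
    # rename stays within one filesystem (a requirement for atomicity).
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(dest_path), suffix=".tmp")
    os.close(fd)
    try:
        # make_archive appends ".tar.gz" to the base name it is given.
        archive = shutil.make_archive(tmp, "gztar", root_dir=torch_cache_dir)
        os.replace(archive, dest_path)  # atomic on POSIX within one filesystem
    finally:
        if os.path.exists(tmp):
            os.remove(tmp)  # clean up the mkstemp placeholder
```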

Debugging

# Enable debug logging
import logging
import b10_transfer

logging.getLogger('b10_transfer').setLevel(logging.DEBUG)

# Check cache status
info = b10_transfer.get_cache_info()
print(f"Environment: {info['environment_key']}")
print(f"Local cache: {info['local_cache_exists']}")
print(f"Remote cache: {info['b10fs_cache_exists']}")

# List all available caches
caches = b10_transfer.list_available_caches()
print(f"Total caches: {caches['total_caches']}")
for cache in caches['caches']:
    print(f"  {cache['filename']} ({cache['size_mb']:.1f} MB)")

Baseten Documentation
