torch-continuum

Accelerate any PyTorch workload in one line.

import torch_continuum
torch_continuum.optimize()

That's it. Training and inference run faster, with optimizations tuned automatically for your hardware.

Installation

pip install torch-continuum

# For maximum LLM training speedups:
pip install "torch-continuum[liger]"

Three Optimization Levels

import torch_continuum

torch_continuum.optimize("safe")    # No precision change — pure speed
torch_continuum.optimize("fast")    # ~2x matmul throughput
torch_continuum.optimize("max")     # Maximum speed — fused kernels + compilation

| Level | Precision Impact | Best For |
| --- | --- | --- |
| "safe" | None | Any workload (risk-free speedup) |
| "fast" | Minor (invisible to most models) | Training & inference with heavy linear layers |
| "max" | Mixed precision | LLM training, large transformers |
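
The README does not say which PyTorch settings each level changes. As rough orientation only, the sketch below shows stock PyTorch switches that produce the same kind of trade-off: cuDNN autotuning does not lower precision, while the "high" float32 matmul setting allows TF32-style math on supported GPUs. The helper name `apply_level_sketch` and the mapping itself are assumptions for illustration, not torch-continuum's implementation.

```python
import torch


def apply_level_sketch(level: str) -> dict:
    """Hypothetical mapping from a level name to stock PyTorch switches.

    Illustration only; this is NOT how torch-continuum is known to work.
    """
    if level not in ("safe", "fast", "max"):
        raise ValueError(f"unknown level: {level!r}")

    # All levels: let cuDNN autotune convolution algorithms (dtype untouched).
    torch.backends.cudnn.benchmark = True

    if level in ("fast", "max"):
        # "high" permits reduced-precision (e.g. TF32) float32 matmuls.
        torch.set_float32_matmul_precision("high")
    else:
        # "highest" keeps full float32 precision for matmuls.
        torch.set_float32_matmul_precision("highest")

    return {
        "cudnn_benchmark": torch.backends.cudnn.benchmark,
        "matmul_precision": torch.get_float32_matmul_precision(),
        # For "max", a caller would additionally wrap the forward pass in
        # torch.autocast(..., dtype=torch.bfloat16) and use torch.compile.
        "wants_autocast": level == "max",
    }


print(apply_level_sketch("fast"))
```

These settings are process-wide, so a helper like this would normally be called once, before the model is built or compiled.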

Benchmarks

Measured on an NVIDIA H100 80GB with a real training loop (forward + loss + backward + optimizer step): 5 independent trials of 200 iterations each.

GPT-style Decoder (6 layers, d=768, vocab=32K)

| Config | Time (200 iters) | Time saved vs. baseline |
| --- | --- | --- |
| PyTorch baseline | 9.622s | - |
| torch-continuum "fast" | 3.912s | 59.3% (2.46x faster) |

Large Linear Stack (67M params, batch 256)

| Config | Time (200 iters) | Time saved vs. baseline |
| --- | --- | --- |
| PyTorch baseline | 0.900s | - |
| torch-continuum "fast" | 0.554s | 38.4% (1.62x faster) |

CNN / ConvNet (5 layers, 224x224, batch 64)

| Config | Time (200 iters) | Time saved vs. baseline |
| --- | --- | --- |
| PyTorch baseline | 3.173s | - |
| torch-continuum "fast" | 1.539s | 51.5% (2.06x faster) |

Standard deviations across 5 trials: 0.001–0.004s (highly reproducible).
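
The percentages in the tables above match the reduction in wall-clock time relative to the baseline, (baseline - optimized) / baseline. The equivalent speedup factor is baseline / optimized. A small sketch that recomputes both from the reported timings (the function names are illustrative):

```python
def time_reduction_pct(baseline_s: float, optimized_s: float) -> float:
    """Percentage of wall time saved relative to the baseline."""
    return (baseline_s - optimized_s) / baseline_s * 100.0


def speedup_factor(baseline_s: float, optimized_s: float) -> float:
    """How many times faster the optimized run is than the baseline."""
    return baseline_s / optimized_s


# (baseline seconds, "fast" seconds) for 200 iterations, from the tables above.
results = {
    "GPT-style decoder": (9.622, 3.912),
    "Large linear stack": (0.900, 0.554),
    "CNN / ConvNet": (3.173, 1.539),
}

for name, (base, opt) in results.items():
    print(f"{name}: {time_reduction_pct(base, opt):.1f}% less time, "
          f"{speedup_factor(base, opt):.2f}x faster")
```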

Smart Compilation

import torch_continuum

model = torch_continuum.smart_compile(model)

Automatically selects the best compilation strategy based on your model size and use case.
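
The selection rules are not documented here. To give a sense of what "compilation strategy" can mean in stock PyTorch, the sketch below picks one of the `mode` values accepted by `torch.compile` from the parameter count and use case. The function names and every threshold are hypothetical assumptions, not torch-continuum's actual logic.

```python
import torch
from torch import nn


def choose_compile_mode(num_params: int, inference_only: bool) -> str:
    """Hypothetical heuristic; the thresholds are illustrative assumptions."""
    if inference_only and num_params < 50_000_000:
        # Small inference models: reduce per-call overhead (uses CUDA graphs).
        return "reduce-overhead"
    if num_params >= 1_000_000_000:
        # Very large models: keep compile time manageable.
        return "default"
    # Otherwise spend extra compile time searching for faster kernels.
    return "max-autotune"


def compile_sketch(model: nn.Module, inference_only: bool = False) -> nn.Module:
    """Compile `model` with the mode chosen by the heuristic above."""
    num_params = sum(p.numel() for p in model.parameters())
    mode = choose_compile_mode(num_params, inference_only)
    return torch.compile(model, mode=mode)


print(choose_compile_mode(10_000_000, inference_only=True))
```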

Built-in Benchmarking

Test the speedup on your own model:

import torch_continuum

torch_continuum.benchmark(model, example_input, level="fast")

Outputs a side-by-side comparison of baseline PyTorch vs torch-continuum on your exact workload.
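
To cross-check the numbers independently, a manual harness that follows the methodology from the Benchmarks section (several trials of a fixed number of iterations) might look like the sketch below. `time_trials` and its `sync` parameter are illustrative names, not part of torch-continuum. On a GPU you would pass `torch.cuda.synchronize` as `sync` so that queued kernels are included in the measurement.

```python
import statistics
import time
from typing import Callable, Optional, Tuple


def time_trials(
    step: Callable[[], None],
    iters: int = 200,
    trials: int = 5,
    warmup: int = 10,
    sync: Optional[Callable[[], None]] = None,
) -> Tuple[float, float]:
    """Return (mean, stdev) of wall time in seconds over `trials` runs of `iters` steps."""

    def _sync() -> None:
        if sync is not None:
            sync()

    # Warm-up runs absorb one-time costs (caches, autotuning, compilation).
    for _ in range(warmup):
        step()
    _sync()

    durations = []
    for _ in range(trials):
        _sync()
        start = time.perf_counter()
        for _ in range(iters):
            step()
        _sync()  # wait for asynchronous work before stopping the clock
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)


# Stand-in workload; replace the lambda with one training step of your model.
mean_s, std_s = time_trials(lambda: sum(range(1000)), iters=200, trials=5)
print(f"{mean_s:.4f}s +/- {std_s:.4f}s per 200 iterations")
```

Running the harness once before and once after calling `torch_continuum.optimize(...)` gives two means that can be compared directly.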

Hardware Support

  • NVIDIA GPUs (Ampere, Hopper, Ada): Full acceleration
  • Apple Silicon (M1/M2/M3): Supported
  • CPU: Supported

torch-continuum auto-detects your hardware and applies the right optimizations. No configuration needed.

info = torch_continuum.detect_device()
print(info.summary())
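
The fields of the object returned by `detect_device()` are not listed here. For comparison, the sketch below reports what stock PyTorch itself can see, using only standard `torch` calls. `describe_device` is a hypothetical stand-in, not the library's implementation.

```python
import torch


def describe_device() -> str:
    """Hypothetical stand-in: report the backend that stock PyTorch can see."""
    if torch.cuda.is_available():
        major, minor = torch.cuda.get_device_capability(0)
        name = torch.cuda.get_device_name(0)
        # Compute capability 8.0+ covers Ampere, Ada and Hopper (TF32-capable).
        tier = "TF32-capable" if major >= 8 else "pre-Ampere"
        return f"cuda: {name} (sm_{major}{minor}, {tier})"
    if torch.backends.mps.is_available():
        return "mps: Apple Silicon GPU"
    return "cpu"


print(describe_device())
```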

API

| Function | Description |
| --- | --- |
| optimize(level) | Apply hardware-tuned optimizations |
| smart_compile(model) | Compile with auto-tuned settings |
| benchmark(model, input) | Measure speedup on your model |
| detect_device() | Get hardware capability profile |
| apply_liger_kernels() | Enable fused kernels for LLM training |

License

MIT

Project details

Release 0.2.0 ships as two files:

| File | Type | Size |
| --- | --- | --- |
| torch_continuum-0.2.0.tar.gz | Source distribution | 8.4 kB |
| torch_continuum-0.2.0-py3-none-any.whl | Built distribution (Python 3 wheel) | 9.2 kB |

The source distribution was uploaded via Trusted Publishing using twine/6.1.0 on CPython/3.13.7. Both files carry attestation bundles from the publisher publish.yml on badaramoni/torch-continuum. Attestation values reflect the state when the release was signed and may no longer be current.

File hashes

| File | Algorithm | Hash digest |
| --- | --- | --- |
| torch_continuum-0.2.0.tar.gz | SHA256 | 0105836a6b0ab56c755ebd564a875233a15b31eabd07175a765a5aa9247ea453 |
| torch_continuum-0.2.0.tar.gz | MD5 | c8ea22bee6fbd5c9cb58b0654c87a436 |
| torch_continuum-0.2.0.tar.gz | BLAKE2b-256 | 69de4e6a6107d1d8dc3e583f523574f0fe7f73cccfb83e9ffd43176a6adc2f59 |
| torch_continuum-0.2.0-py3-none-any.whl | SHA256 | 1fb8ae96c8668ecb20674b3eb6075fb693891ce426b78f3800cc08e08ca24929 |
| torch_continuum-0.2.0-py3-none-any.whl | MD5 | 8a0dcebe2fcdf15d79f22b5b1f8ce3e6 |
| torch_continuum-0.2.0-py3-none-any.whl | BLAKE2b-256 | 81bd5ea51cc3259fa0d6d69be08d642186d9f0a9e28f3cb022c8e0d747541da3 |
