High-performance zero-copy tensor protocol
Tenso
Up to 23.8x faster than Apache Arrow. 61x less CPU than SafeTensors.
Zero-copy, SIMD-aligned tensor protocol for high-performance ML infrastructure.
Why Tenso?
Most serialization formats are designed for general data or disk storage. Tenso is focused on network tensor transmission where every microsecond matters.
The Problem
Traditional formats waste CPU cycles:
- SafeTensors: 36.7% CPU usage per deserialization (great for disk, overkill for network)
- Pickle: 41.7% CPU usage + security vulnerabilities
- Arrow: Fast, but 23.8x slower than Tenso for large tensors
The Solution
Tenso achieves true zero-copy with:
- Fixed 8-byte header (no JSON parsing overhead)
- 64-byte memory alignment (SIMD-ready)
- Direct memory mapping (CPU just points, never copies)
Result: 0.6% CPU usage vs 36.7% for SafeTensors
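To make "the CPU just points, never copies" concrete, here is a minimal NumPy sketch of the idea. It is not Tenso's implementation: the `view_body` helper and the assumption that the body starts at byte 64 are illustrative only.

```python
import numpy as np

BODY_OFFSET = 64  # assumed: header, shape and padding fill the first 64 bytes


def view_body(packet: bytes, shape: tuple, dtype=np.float32) -> np.ndarray:
    """Return an array that views the packet body in place, without copying it."""
    count = int(np.prod(shape))
    flat = np.frombuffer(packet, dtype=dtype, count=count, offset=BODY_OFFSET)
    return flat.reshape(shape)


original = np.arange(12, dtype=np.float32).reshape(3, 4)
packet = bytes(BODY_OFFSET) + original.tobytes()  # placeholder prefix + raw body

restored = view_body(packet, original.shape)
```

The resulting array does not own its data and is read-only, because it borrows the memory of the `bytes` object rather than allocating its own.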
Benchmarks
System: Python 3.12.9, NumPy 2.3.5, 12 CPU cores, macOS
Deserialization Speed (8192×8192 Float32 Matrix)
| Format | Time | CPU Usage | Speedup |
|---|---|---|---|
| Tenso | 0.034ms | 0.6% | 1x |
| Arrow | 0.805ms | 1.1% | 23.8x slower |
| SafeTensors | 2.621ms | 36.7% | 77x slower |
| Pickle | 3.293ms | 41.7% | 97x slower |
Stream Reading Performance (95MB Packet)
| Method | Time | Throughput | Speedup |
|---|---|---|---|
| Tenso read_stream | 21ms | 4,500 MB/s | 1x |
| Naive loop | 7,870ms | 12 MB/s | 371x slower |
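The benchmark does not spell out what the "naive loop" does. One common pattern it may resemble is accumulating chunks by concatenation, contrasted below with filling a single preallocated buffer through `readinto()`. Both helper names are hypothetical, and `io.BytesIO` stands in for a real socket or file.

```python
import io


def read_exact_naive(stream, n: int) -> bytes:
    """Accumulate small chunks; each += can copy everything read so far."""
    data = b""
    while len(data) < n:
        chunk = stream.read(min(4096, n - len(data)))
        if not chunk:
            raise EOFError("stream ended early")
        data += chunk
    return data


def read_exact_into(stream, n: int) -> bytearray:
    """Fill one preallocated buffer in place using readinto()."""
    buf = bytearray(n)
    view = memoryview(buf)
    filled = 0
    while filled < n:
        got = stream.readinto(view[filled:])
        if not got:
            raise EOFError("stream ended early")
        filled += got
    return buf


payload = bytes(range(256)) * 1024  # 256 KiB of sample data
naive = read_exact_naive(io.BytesIO(payload), len(payload))
fast = read_exact_into(io.BytesIO(payload), len(payload))
```

Slicing a `memoryview` does not copy, so the second version writes each chunk directly into its final position.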
Network Latency (1KB Tensor over TCP)
| Metric | Value |
|---|---|
| Throughput | 182,940 packets/sec |
| Latency | 5.5 μs/packet |
Real-World Impact
Scenario: Inference API serving 10,000 req/sec with 64MB tensors
| Format | CPU Cores Used | Monthly Cost* |
|---|---|---|
| SafeTensors | 367 cores | ~$15,000 |
| Tenso | 6 cores | ~$245 |
*Based on typical cloud compute pricing; both rows imply roughly $41 per core per month
Installation
pip install tenso
Quick Start
Basic Serialization
import numpy as np
import tenso
# Create tensor
data = np.random.rand(1024, 1024).astype(np.float32)
# Serialize (the quoted 8.5ms is for a 64MB tensor; this 1024x1024 float32 array is 4MB)
packet = tenso.dumps(data)
# Deserialize (the quoted 0.034ms is for a 64MB tensor)
restored = tenso.loads(packet)
Network Communication
import socket
import numpy as np
import tenso
# Server (run in its own process): receive a tensor
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('0.0.0.0', 9999))
server.listen(1)
conn, addr = server.accept()
# Zero-copy read with automatic buffering
tensor = tenso.read_stream(conn)  # Uses readinto() internally
# Client (run in a separate process): send a tensor
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(('localhost', 9999))
data = np.random.rand(256, 256).astype(np.float32)
tenso.write_stream(data, client)  # Atomic write with os.writev
File I/O with Memory Mapping
# Write to disk
with open("model_weights.tenso", "wb") as f:
    tenso.dump(large_tensor, f)
# Instant load (no matter the size)
with open("model_weights.tenso", "rb") as f:
    weights = tenso.load(f, mmap_mode=True)  # Memory-mapped, not loaded into RAM
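For readers unfamiliar with memory mapping, the sketch below shows the general mechanism using NumPy's own `np.memmap`. It does not reproduce what `tenso.load` does internally; the file layout (a 64-byte placeholder prefix followed by the raw body) is an assumption for illustration.

```python
import os
import tempfile

import numpy as np

BODY_OFFSET = 64  # assumed: header, shape and padding fill the first 64 bytes

weights = np.arange(1_000, dtype=np.float32)

path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(bytes(BODY_OFFSET))  # placeholder for header + shape + padding
    f.write(weights.tobytes())   # raw C-contiguous body

# Map the body read-only; the OS loads pages lazily as elements are accessed.
mapped = np.memmap(path, dtype=np.float32, mode="r",
                   offset=BODY_OFFSET, shape=weights.shape)
first_values = mapped[:3].tolist()
```

Creating the mapping takes roughly the same time regardless of file size, since data is only read when it is touched.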
Use Cases
Perfect For
- Model Serving APIs - 23.8x faster deserialization saves CPU cores
- Distributed Training - Efficient gradient/activation passing (Ray, Spark)
- Real-time Robotics - Sub-millisecond latency sensor fusion
- High-Frequency Trading - Microsecond-precision data exchange
- Microservices - Fast tensor exchange between services
- Edge Devices - Minimal dependencies, pure Python
Consider Alternatives For
- Long-term Model Storage - Use SafeTensors (better ecosystem, HuggingFace integration)
- Multi-column Dataframes - Use Arrow (designed for tabular data)
- Arbitrary Python Objects - Use Pickle (if you trust the source)
Protocol Design
Tenso uses a minimalist five-part structure (the footer is optional):
┌─────────────┬──────────────┬──────────────┬────────────────────────┬──────────────┐
│ HEADER │ SHAPE │ PADDING │ BODY (Raw Data) │ FOOTER │
│ 8 bytes │ Variable │ 0-63 bytes │ C-Contiguous Array │ 8 bytes* │
└─────────────┴──────────────┴──────────────┴────────────────────────┴──────────────┘
(*Optional)
Header (8 bytes)
[4 bytes: Magic "TNSO"]
[1 byte: Protocol Version (2)]
[1 byte: Flags (Bit 0: Aligned, Bit 1: Integrity)]
[1 byte: Dtype Code]
[1 byte: Number of Dimensions]
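The header layout above maps naturally onto Python's `struct` module. The following is a sketch under that reading, not Tenso's own code: the helper names are hypothetical and the dtype code `1` is a made-up value, since the actual code table is not listed here.

```python
import struct

HEADER_FORMAT = "<4sBBBB"  # magic, version, flags, dtype code, ndim
MAGIC = b"TNSO"
FLAG_ALIGNED = 0b01    # bit 0
FLAG_INTEGRITY = 0b10  # bit 1


def pack_header(flags: int, dtype_code: int, ndim: int, version: int = 2) -> bytes:
    """Build the fixed 8-byte header."""
    return struct.pack(HEADER_FORMAT, MAGIC, version, flags, dtype_code, ndim)


def unpack_header(header: bytes) -> dict:
    """Decode the first 8 bytes of a packet and validate the magic value."""
    magic, version, flags, dtype_code, ndim = struct.unpack(HEADER_FORMAT, header[:8])
    if magic != MAGIC:
        raise ValueError("not a Tenso packet")
    return {
        "version": version,
        "aligned": bool(flags & FLAG_ALIGNED),
        "integrity": bool(flags & FLAG_INTEGRITY),
        "dtype_code": dtype_code,
        "ndim": ndim,
    }


# dtype_code=1 is a hypothetical value used only for this example.
header = pack_header(flags=FLAG_ALIGNED | FLAG_INTEGRITY, dtype_code=1, ndim=2)
info = unpack_header(header)
```

Because every field has a fixed position and width, decoding is a single `struct.unpack` call with no text parsing.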
Why This Is Fast
SafeTensors: Uses JSON header - 2.621ms deserialization in the benchmark above
Arrow: Complex IPC format with schema validation - 0.805ms overhead
Tenso: Fixed 8-byte struct - 0.034ms (just unpack and memory map)
The padding ensures the data body starts at a 64-byte boundary, enabling:
- AVX-512 vectorization
- Zero-copy memory mapping
- Cache-line alignment
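The padding length follows from simple modular arithmetic. The sketch below assumes the shape section stores each dimension in 4 bytes; that width is a guess for illustration, as the shape encoding is not specified here.

```python
ALIGNMENT = 64
HEADER_SIZE = 8


def padding_for(shape_bytes: int) -> int:
    """Padding (0-63 bytes) so that the body starts on a 64-byte boundary."""
    return -(HEADER_SIZE + shape_bytes) % ALIGNMENT


def body_offset(ndim: int, bytes_per_dim: int = 4) -> int:
    """Byte offset of the body, assuming bytes_per_dim bytes per dimension."""
    shape_bytes = ndim * bytes_per_dim
    return HEADER_SIZE + shape_bytes + padding_for(shape_bytes)


pad_2d = padding_for(2 * 4)   # 8 + 8 = 16 bytes used, 48 bytes of padding
offset_2d = body_offset(2)    # body starts at byte 64
offset_15d = body_offset(15)  # 8 + 60 = 68 bytes used, body starts at byte 128
```

Note that the offset is only aligned in memory if the buffer holding the packet is itself 64-byte aligned.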
Advanced Features
Data Integrity (XXH3)
Protect your tensors against network corruption with ultra-fast XXH3 hashing (adds <2% overhead):
# Serialize with 64-bit checksum
packet = tenso.dumps(data, check_integrity=True)
# Verification is automatic during load
try:
    restored = tenso.loads(packet)
except ValueError:
    print("Detected corrupted data!")
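The general pattern behind an integrity footer can be sketched with the standard library alone. This example deliberately swaps XXH3 for an 8-byte BLAKE2b digest so that it runs without extra dependencies; which bytes Tenso's checksum actually covers is not stated here, so the `seal` and `verify` helpers are assumptions.

```python
import hashlib

FOOTER_SIZE = 8  # 64-bit checksum


def digest64(payload: bytes) -> bytes:
    # Stand-in for XXH3-64: an 8-byte BLAKE2b digest from the standard library.
    return hashlib.blake2b(payload, digest_size=FOOTER_SIZE).digest()


def seal(payload: bytes) -> bytes:
    """Append the 8-byte checksum footer to the payload."""
    return payload + digest64(payload)


def verify(packet: bytes) -> bytes:
    """Return the payload, or raise ValueError if the footer does not match."""
    payload, footer = packet[:-FOOTER_SIZE], packet[-FOOTER_SIZE:]
    if digest64(payload) != footer:
        raise ValueError("checksum mismatch")
    return payload


packet = seal(b"tensor bytes")
corrupted = bytes([packet[0] ^ 0x01]) + packet[1:]  # flip one bit

try:
    verify(corrupted)
    detected = False
except ValueError:
    detected = True
```

The slicing here copies the payload for clarity; a zero-copy variant would hash a `memoryview` of the packet instead.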
Strict Mode
Prevents accidental memory copies:
# Force C-contiguous check
try:
    packet = tenso.dumps(fortran_array, strict=True)
except ValueError:
    print("Array must be C-contiguous!")
    fortran_array = np.ascontiguousarray(fortran_array)
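A check of this kind can be written with NumPy's array flags. The sketch below shows the idea only; `require_c_contiguous` is a hypothetical helper, and how `strict=True` is implemented inside Tenso is not shown in this README.

```python
import numpy as np


def require_c_contiguous(arr: np.ndarray) -> np.ndarray:
    """Raise instead of silently copying when the array is not C-contiguous."""
    if not arr.flags["C_CONTIGUOUS"]:
        raise ValueError("Array must be C-contiguous")
    return arr


# A 2x3 array stored in column-major (Fortran) order.
fortran_array = np.asfortranarray(np.arange(6, dtype=np.float32).reshape(2, 3))

try:
    require_c_contiguous(fortran_array)
    rejected = False
except ValueError:
    rejected = True

# The explicit conversion is where the memory copy happens.
fixed = np.ascontiguousarray(fortran_array)
```

Raising early makes the copy visible in the caller's code, which matters when the whole point of the format is to avoid hidden copies.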
Packet Introspection
Inspect metadata without deserializing:
info = tenso.get_packet_info(packet)
print(f"Shape: {info['shape']}")
print(f"Dtype: {info['dtype']}")
print(f"Size: {info['data_size_bytes']} bytes")
Supported Dtypes
All NumPy numeric types including:
- Floats: float16, float32, float64
- Integers: int8, int16, int32, int64, uint8, uint16, uint32, uint64
- Complex: complex64, complex128
- Boolean: bool
Comparison Table
| Feature | Tenso | Arrow | SafeTensors | Pickle |
|---|---|---|---|---|
| Deserialize Speed (64MB) | 0.034ms | 0.805ms | 2.621ms | 3.293ms |
| CPU Usage | 0.6% | 1.1% | 36.7% | 41.7% |
| Memory Overhead | 0.00% | 0.00% | 0.00% | 0.00% |
| Security | Safe | Safe | Safe | RCE Risk |
| Dependencies | NumPy only | PyArrow (large) | Rust bindings | Python stdlib |
| Best For | Network/IPC | Dataframes | Disk storage | Python objects |
| SIMD Aligned | 64-byte | 64-byte | No | No |
Performance Deep-Dive
Read the full story: Breaking the Speed Limit: Optimizing Python Tensor Serialization to 5 GB/s
Key insights:
- Why JSON headers kill performance
- How memory alignment enables zero-copy
- Why Tenso beats Arrow for single tensors
- Real-world cost savings ($15k/month at scale)
Development
# Clone repository
git clone https://github.com/Khushiyant/tenso.git
cd tenso
# Install with dev dependencies
pip install -e ".[dev]"
# Install with gpu dependencies
pip install -e ".[gpu]"
# Run tests
pytest
# Run comprehensive benchmarks
python benchmark.py all
# Quick benchmark (serialization + Arrow comparison)
python benchmark.py quick
# Benchmark with Integrity
python benchmark.py all --integrity # Requires installation with the [integrity] dependencies
Requirements
- Python >= 3.10
- NumPy >= 1.20
Optional (for benchmarks):
- pyarrow - Compare with Apache Arrow
- safetensors - Compare with SafeTensors
- msgpack - Compare with MessagePack
- psutil - Monitor CPU/memory usage
- xxhash - Integrity check implementation
Contributing
Contributions welcome. Areas we'd love help with:
- Async support (async def aread_stream())
- Compression integration (zstd, lz4)
- gRPC/FastAPI integration examples
- Rust bindings for even faster serialization
- JavaScript/WASM client for browser ML
- CUDA support for GPU-direct transfers
License
MIT License - see License file.
Citation
If you use Tenso in research, please cite:
@software{tenso2025,
author = {Khushiyant},
title = {Tenso: High-Performance Zero-Copy Tensor Protocol},
year = {2025},
url = {https://github.com/Khushiyant/tenso}
}
Acknowledgments
Inspired by the need for faster ML inference infrastructure. Built with care for the ML community.
Star this repo if Tenso saved you CPU cycles.