Safe, fast, pickle-free tensor storage for PyTorch. Rust core, Python interface. By Death Legion.
deathtensors
deathtensors is a real alternative to pickle for storing model weights.
Files are designed to be opened safely even when they come from an
untrusted source: opening a deathtensors file never executes
arbitrary code, because the header is parsed as JSON (no eval, no
__reduce__, no torch.load) and the tensor data blob is treated as
opaque bytes.
- Rust core, Python interface. The file format and I/O are implemented in Rust; PyO3 bindings expose a Pythonic API.
- Lazy loading. Open a file in O(1), list tensors, read metadata, and load only the tensors you need, without pulling the whole file into memory. The reader is memory-mapped, so loading a tensor faults in only the pages you actually touch.
- 15 dtypes. BOOL, U8/I8/U16/I16/F16/BF16/I32/U32/F32/I64/U64/F64, and complex C64/C128.
- Per-tensor and global string metadata. Tag each tensor with layer names, training provenance, licenses, etc., and tag the file itself with model name, framework version, etc.
- Optional SHA-256 footer. Verify file integrity with deathtensors.open(path, verify=True).
- Atomic writes. save() writes to a temp file and renames it into place, so a crash never leaves a half-written file visible to readers.
- Deterministic output. Tensor insertion order is preserved, so two saves of the same dict produce byte-identical files (useful for reproducible research builds and git-tracked weights).
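The write-then-rename pattern behind the "Atomic writes" bullet can be sketched in plain Python. This is only an illustration of the general technique, not the library's Rust implementation; the helper name atomic_write is hypothetical.

```python
import os
import tempfile


def atomic_write(path: str, payload: bytes) -> None:
    """Write payload to path so readers never observe a partial file.

    Hypothetical helper: illustrates temp-file-plus-rename, not deathtensors itself.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the destination so the rename stays
    # on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomically swap the new file into place
    except BaseException:
        # On any failure, remove the temp file and leave the old file untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Creating the temp file in the destination directory matters: os.replace is only atomic when source and destination live on the same filesystem.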
Install
pip install deathtensors # core only
pip install "deathtensors[torch]" # pulls in torch
pip install "deathtensors[numpy]" # pulls in numpy
pip install "deathtensors[dev]" # torch + numpy + pytest
Pre-built wheels are available for CPython 3.8–3.13 on x86_64 Linux. Other platforms fall back to a source build (requires Rust ≥ 1.74).
Quickstart
import torch
import deathtensors as dt
# 1. Save a couple of tensors to one file.
tensors = {
    "weight": torch.randn(128, 128),
    "bias": torch.zeros(128),
}
metadata = {
    "weight": {"layer": "fc1", "init": "kaiming"},
}
global_md = {"model": "mlp-tiny", "license": "MIT"}
dt.save("model.dt", tensors, metadata=metadata,
        global_metadata=global_md, checksum=True)
# 2. Open the file lazily; no tensors are read yet.
with dt.open("model.dt", verify=True) as f:
    print(f.keys())  # ['weight', 'bias']
    print(f.metadata())  # {'model': 'mlp-tiny', ...}
    print(f.info("weight"))  # full dtype/shape/offsets/metadata
    w = f.get_tensor("weight")  # only 'weight' is read
    print(w.shape, w.dtype)  # torch.Size([128, 128]) torch.float32
File format (v1)
+-----------------------+
| Magic (8 bytes) | b"DTLEGION"
+-----------------------+
| Version (4 bytes u32) | 1 (little-endian)
+-----------------------+
| Flags (4 bytes u32) | bit0: zstd (reserved)
| | bit1: SHA-256 footer
| | bit2: encryption (reserved)
+-----------------------+
| Header size (8 u64) | byte length of JSON header
+-----------------------+
| Header (JSON, UTF-8) | see below
+-----------------------+
| Padding (0..8 bytes) | NUL bytes, 8-byte alignment
+-----------------------+
| Tensor data (blob) | raw bytes, offsets are relative to here
+-----------------------+
| Footer (32 bytes) | optional: SHA-256(header + padding + data)
+-----------------------+
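As a rough illustration of the layout above, the following pure-Python sketch assembles and parses a file with the same field order. It is not the library's reader. It assumes that every fixed-width field is little-endian (the diagram only states this for the version), that the padding aligns the tensor blob to an 8-byte file offset, and that the footer hashes exactly header + padding + data. The names build_file and parse_file are hypothetical.

```python
import hashlib
import json
import struct

MAGIC = b"DTLEGION"
FLAG_SHA256 = 1 << 1  # bit1 in the diagram
# magic (8 bytes), version (u32), flags (u32), header size (u64) = 24 bytes
PREFIX = struct.Struct("<8sIIQ")


def build_file(header: dict, blob: bytes, checksum: bool = False) -> bytes:
    """Assemble bytes that follow the v1 diagram (illustrative only)."""
    header_bytes = json.dumps(header).encode("utf-8")
    # Assumption: pad with NULs so the blob starts on an 8-byte file offset.
    pad = (-(PREFIX.size + len(header_bytes))) % 8
    body = header_bytes + b"\x00" * pad + blob
    flags = FLAG_SHA256 if checksum else 0
    out = PREFIX.pack(MAGIC, 1, flags, len(header_bytes)) + body
    if checksum:
        out += hashlib.sha256(body).digest()  # SHA-256(header + padding + data)
    return out


def parse_file(buf: bytes):
    """Return (header dict, blob bytes, checksum result or None)."""
    magic, version, flags, header_size = PREFIX.unpack_from(buf, 0)
    if magic != MAGIC or version != 1:
        raise ValueError("not a v1 file")
    header_end = PREFIX.size + header_size
    header = json.loads(buf[PREFIX.size:header_end].decode("utf-8"))
    data_start = header_end + (-header_end) % 8  # skip the alignment padding
    if flags & FLAG_SHA256:
        body, digest = buf[PREFIX.size:-32], buf[-32:]
        ok = hashlib.sha256(body).digest() == digest
        return header, buf[data_start:-32], ok
    return header, buf[data_start:], None
```

Note that the header is only ever passed through json.loads, so parsing it cannot trigger code execution.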
Header JSON schema:
{
  "format": "deathtensors",
  "format_version": 1,
  "created_by": "deathtensors 0.1.0 (death legion)",
  "global_metadata": {"model": "mlp-tiny", "license": "MIT"},
  "tensors": {
    "weight": {
      "dtype": "F32",
      "shape": [128, 128],
      "data_offsets": [0, 65536],
      "metadata": {"layer": "fc1", "init": "kaiming"}
    },
    "bias": {
      "dtype": "F32",
      "shape": [128],
      "data_offsets": [65536, 66048],
      "metadata": {}
    }
  }
}
data_offsets are [start, end) byte offsets relative to the start of
the tensor data blob (after the alignment padding), not to the start of
the file. This lets the reader memory-map the blob and slice tensors
out without translating offsets.
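The offsets in the schema follow from shape and element size: a 128 × 128 F32 tensor occupies 128 · 128 · 4 = 65536 bytes, and a 128-element F32 tensor occupies 512 bytes, so it ends at 65536 + 512 = 66048. A minimal sketch of slicing by blob-relative offsets, using a tiny hypothetical blob and assuming little-endian element encoding (the helper names are illustrative, not part of the library):

```python
import struct

# Hypothetical header entries mirroring the schema, with tiny shapes.
tensors = {
    "weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]},
    "bias": {"dtype": "F32", "shape": [2], "data_offsets": [16, 24]},
}
# Six little-endian float32 values: four for "weight", two for "bias".
blob = struct.pack("<6f", 1.0, 2.0, 3.0, 4.0, 0.5, -0.5)

ITEMSIZE = {"F32": 4}  # only the dtype this sketch needs


def tensor_bytes(blob: bytes, entry: dict) -> memoryview:
    """Slice one tensor out of the blob without copying."""
    start, end = entry["data_offsets"]  # [start, end), relative to the blob
    count = 1
    for dim in entry["shape"]:
        count *= dim
    if end - start != count * ITEMSIZE[entry["dtype"]]:
        raise ValueError("offsets do not match shape and dtype")
    return memoryview(blob)[start:end]


def read_f32(blob: bytes, entry: dict) -> list:
    """Decode an F32 tensor into a flat list of Python floats."""
    view = tensor_bytes(blob, entry)
    return list(struct.unpack(f"<{len(view) // 4}f", view))
```

Because the offsets are relative to the blob, the same slice works whether the blob is a bytes object, as here, or a memory-mapped region that starts at the first data byte.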
Why not just use pickle / torch.save?
torch.save uses pickle under the hood, which means opening a
.pt file from an untrusted source can run arbitrary Python code.
This has been the cause of several real-world supply-chain attacks on
ML model hubs. deathtensors files are pure data: a fixed binary
prefix followed by JSON metadata followed by raw tensor bytes. There is
no code path in the reader that calls eval, exec, __reduce__, or
any pickle-style reconstruction.
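To make the risk concrete, here is a small, harmless demonstration of the pickle mechanism described above. The Payload class is hypothetical, and int stands in for a dangerous callable; an attacker would name something like os.system instead.

```python
import json
import pickle


class Payload:
    # __reduce__ tells pickle which callable to invoke at load time.
    def __reduce__(self):
        return (int, ("42",))  # harmless stand-in for a malicious callable


data = pickle.dumps(Payload())
loaded = pickle.loads(data)  # loading calls int("42"); no Payload is rebuilt
print(type(loaded), loaded)

# Parsing JSON, by contrast, only ever produces plain data structures.
header = json.loads('{"format": "deathtensors", "format_version": 1}')
print(type(header), header["format_version"])
```

The loaded object is whatever the callable chosen by the file returned, which is why unpickling untrusted bytes amounts to running untrusted code.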
Why not just use safetensors?
safetensors is excellent and we encourage you to use it. deathtensors
exists as a separate, independent implementation because:
- Format diversity is good for the ecosystem. A single point of failure in any one tensor-storage library would be bad; having two independent libraries with different code paths reduces risk.
- deathtensors ships an optional SHA-256 footer for integrity verification, which is useful when files travel through untrusted channels.
- deathtensors ships per-tensor string metadata in addition to global metadata, which safetensors only added later.
- deathtensors exposes a richer dtype set out of the box, including BF16, complex64, complex128, and unsigned 16/32/64-bit integers.
We do not try to be a drop-in replacement. The Python API is similar
in spirit (save, open, keys, get_tensor) but the file format
is not compatible — a .dt file is not a .safetensors file and
vice versa.
Public API
deathtensors.save(path, tensors, metadata=None, global_metadata=None, checksum=False)
deathtensors.open(path, verify=False) # context manager
deathtensors.save_file(path, tensors, global_metadata=None, checksum=False) # lower-level
deathtensors.DtFile(path, verify=False) # the class returned by open()
f = deathtensors.open("model.dt")
f.keys() # list of tensor names
f.info(name) # dict: dtype, shape, data_offsets, nbytes, metadata
f.metadata() # global file metadata dict
f.get_bytes(name) # raw bytes
f.get_buffer(name) # memoryview of raw bytes
f.get_tensor(name, framework="torch") # torch.Tensor (default) or numpy.ndarray
f.get_tensors(framework="torch") # dict of all tensors
f.verify() # verify SHA-256 footer (if any); returns bool
f.has_checksum() # was the file written with checksum=True?
Testing
pip install "deathtensors[dev]"
pytest tests/
License
MIT, © Death Legion.
File details
Details for the file deathtensors-0.1.0.tar.gz.
File metadata
- Download URL: deathtensors-0.1.0.tar.gz
- Upload date:
- Size: 46.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3b705b2561ef4db281b43598d7237c9d865102e456f137f00d68a2e717257f33 |
| MD5 | adf0827ecd04b8f7804ca3b9c4e7487a |
| BLAKE2b-256 | 9d43d15e4838e58ee554d02a2840fcb35590a8ec5ce1d83b798008cdefdcf89f |
File details
Details for the file deathtensors-0.1.0-cp312-cp312-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: deathtensors-0.1.0-cp312-cp312-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 292.5 kB
- Tags: CPython 3.12, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d751e06e3fa491ac6fff93de84604ef38e59cbaf97e73c05b743f0966818aa20 |
| MD5 | 40aac53eed179bd55f63684096943fae |
| BLAKE2b-256 | 958a7140f62ce61de1c2b30d875b8c73655b3f159a42020acfa89f68edfac5ca |