Safe, fast, pickle-free tensor storage for PyTorch. Rust core, Python interface. By Death Legion.

deathtensors

deathtensors is a real alternative to pickle for storing model weights. Files are designed to be opened safely even when they come from an untrusted source: opening a deathtensors file never executes arbitrary code, because the header is parsed as JSON (no eval, no __reduce__, no torch.load) and the tensor data blob is treated as opaque bytes.

  • Rust core, Python interface. The file format and I/O are implemented in Rust; PyO3 bindings expose a Pythonic API.
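  • Lazy loading. Opening a file parses only the fixed prefix and the JSON header, so open time depends on the header size rather than on the size of the tensor data. You can list tensors, read metadata, and load only the tensors you need without pulling the whole file into memory. The reader is memory-mapped, so loading a tensor page-faults only the bytes you actually touch.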
  • 15 dtypes. BOOL, U8/I8/U16/I16/F16/BF16/I32/U32/F32/I64/U64/F64, and complex C64/C128.
  • Per-tensor and global string metadata. Tag each tensor with layer names, training provenance, licenses, etc., and tag the file itself with model name, framework version, etc.
  • Optional SHA-256 footer. Verify file integrity with deathtensors.open(path, verify=True).
  • Atomic writes. save() writes to a temp file and renames into place — a crash never leaves a half-written file visible to readers.
  • Deterministic output. Tensor insertion order is preserved, so two saves of the same dict produce byte-identical files (useful for reproducible research builds and git-tracked weights).
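
The atomic-write behaviour listed above follows a common write-then-rename pattern. The following is a minimal sketch of that pattern, not code from the deathtensors source; atomic_write is a hypothetical helper name, and the sketch assumes the temp file and the destination live on the same filesystem.

```python
import os
import tempfile


def atomic_write(path: str, payload: bytes) -> None:
    """Write payload to path so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file is created next to the destination so that the final
    # rename stays on one filesystem; a cross-device rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".dt")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomically replaces any existing file
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Usage: the second call replaces the first file in a single step.
target = os.path.join(tempfile.mkdtemp(), "model.dt")
atomic_write(target, b"first version")
atomic_write(target, b"second version")
```

Readers that open the path at any moment get one complete version; the partially written temp file is never visible under the final name.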

Install

pip install deathtensors              # core only
pip install "deathtensors[torch]"     # pulls in torch
pip install "deathtensors[numpy]"     # pulls in numpy
pip install "deathtensors[dev]"       # torch + numpy + pytest

Pre-built wheels target x86_64 Linux; the 0.1.0 release ships a CPython 3.12 wheel (manylinux, glibc 2.34+), as listed under Download files. Other platforms and interpreter versions fall back to a source build (requires Rust ≥ 1.74).

Quickstart

import torch
import deathtensors as dt

# 1. Save a couple of tensors to one file.
tensors = {
    "weight": torch.randn(128, 128),
    "bias":   torch.zeros(128),
}
metadata = {
    "weight": {"layer": "fc1", "init": "kaiming"},
}
global_md = {"model": "mlp-tiny", "license": "MIT"}
dt.save("model.dt", tensors, metadata=metadata,
        global_metadata=global_md, checksum=True)

# 2. Open the file lazily — no tensors are read yet.
with dt.open("model.dt", verify=True) as f:
    print(f.keys())                          # ['weight', 'bias']
    print(f.metadata())                      # {'model': 'mlp-tiny', ...}
    print(f.info("weight"))                  # full dtype/shape/offsets/metadata
    w = f.get_tensor("weight")               # only 'weight' is read
    print(w.shape, w.dtype)                  # torch.Size([128, 128]) torch.float32

File format (v1)

+-----------------------+
| Magic (8 bytes)       |   b"DTLEGION"
+-----------------------+
| Version (4 bytes u32) |   1 (little-endian)
+-----------------------+
| Flags (4 bytes u32)   |   bit0: zstd (reserved)
|                       |   bit1: SHA-256 footer
|                       |   bit2: encryption (reserved)
+-----------------------+
| Header size (8 u64)   |   byte length of JSON header
+-----------------------+
| Header (JSON, UTF-8)  |   see below
+-----------------------+
| Padding (0-7 bytes)   |   NUL bytes, 8-byte alignment
+-----------------------+
| Tensor data (blob)    |   raw bytes, offsets are relative to here
+-----------------------+
| Footer (32 bytes)     |   optional: SHA-256(header + padding + data)
+-----------------------+
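
The fixed prefix and padding rule in the layout above can be sketched with the standard struct module. This is an illustration under stated assumptions, not the library's Rust implementation: it assumes all three integer fields are little-endian (the layout states this only for the version) and that alignment is measured from the start of the file. The function names are hypothetical.

```python
import json
import struct

MAGIC = b"DTLEGION"
# magic (8 bytes), version (u32), flags (u32), header size (u64); "<" means
# little-endian with no implicit padding, so the prefix is exactly 24 bytes.
PREFIX = struct.Struct("<8sIIQ")
FLAG_SHA256 = 1 << 1  # bit1 of the flags field


def build_prefix_and_header(header: dict, flags: int = 0) -> bytes:
    """Return prefix + JSON header + NUL padding up to an 8-byte boundary."""
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    unpadded = PREFIX.size + len(header_bytes)
    padding = (-unpadded) % 8  # between 0 and 7 NUL bytes
    prefix = PREFIX.pack(MAGIC, 1, flags, len(header_bytes))
    return prefix + header_bytes + b"\x00" * padding


def parse_prefix_and_header(buf: bytes):
    """Return (flags, header dict, offset where the tensor data blob starts)."""
    if len(buf) < PREFIX.size:
        raise ValueError("file too short")
    magic, version, flags, header_size = PREFIX.unpack_from(buf, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    if version != 1:
        raise ValueError(f"unsupported version {version}")
    header_end = PREFIX.size + header_size
    if header_end > len(buf):
        raise ValueError("header size exceeds file length")
    header = json.loads(buf[PREFIX.size:header_end].decode("utf-8"))
    data_start = header_end + (-header_end) % 8  # skip the alignment padding
    return flags, header, data_start


blob = build_prefix_and_header({"format": "deathtensors", "tensors": {}}, FLAG_SHA256)
flags, header, data_start = parse_prefix_and_header(blob)
```

Every length is checked before it is used to slice the buffer, which is the property that lets a reader handle untrusted input without executing anything.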

Header JSON schema:

{
  "format": "deathtensors",
  "format_version": 1,
  "created_by": "deathtensors 0.1.0 (death legion)",
  "global_metadata": {"model": "mlp-tiny", "license": "MIT"},
  "tensors": {
    "weight": {
      "dtype": "F32",
      "shape": [128, 128],
      "data_offsets": [0, 65536],
      "metadata": {"layer": "fc1", "init": "kaiming"}
    },
    "bias": {
      "dtype": "F32",
      "shape": [128],
      "data_offsets": [65536, 66048],
      "metadata": {}
    }
  }
}

data_offsets are [start, end) byte offsets relative to the start of the tensor data blob (after the alignment padding), not to the start of the file. This lets the reader memory-map the blob and slice tensors out without translating offsets.
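
A reader can slice a tensor straight out of the blob with these offsets, as long as it first checks them against the blob length and the declared shape and dtype. The sketch below shows one way to do that check; slice_tensor and the ITEMSIZE table are hypothetical, and the element widths are the conventional ones for these dtype names rather than values taken from the deathtensors source.

```python
import math
import struct

# Assumed element widths in bytes for a few of the dtype names listed earlier.
ITEMSIZE = {"BOOL": 1, "U8": 1, "I8": 1, "F16": 2, "BF16": 2, "F32": 4, "F64": 8}


def slice_tensor(blob: memoryview, entry: dict) -> memoryview:
    """Return the raw bytes of one tensor after validating its header entry."""
    start, end = entry["data_offsets"]
    expected = math.prod(entry["shape"]) * ITEMSIZE[entry["dtype"]]
    if not (0 <= start <= end <= len(blob)):
        raise ValueError("data_offsets fall outside the data blob")
    if end - start != expected:
        raise ValueError("data_offsets do not match shape and dtype")
    return blob[start:end]  # zero-copy view, no offset translation needed


# Two tiny F32 tensors packed back to back, mirroring the schema above.
blob = memoryview(struct.pack("<4f", 1.0, 2.0, 3.0, 4.0) + struct.pack("<2f", 0.5, 0.25))
tensors = {
    "weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]},
    "bias": {"dtype": "F32", "shape": [2], "data_offsets": [16, 24]},
}
bias = struct.unpack("<2f", slice_tensor(blob, tensors["bias"]))
```

The same arithmetic explains the schema example: a 128 x 128 F32 tensor occupies 128 * 128 * 4 = 65536 bytes, and a 128-element F32 bias occupies 512 bytes.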

Why not just use pickle / torch.save?

torch.save uses pickle under the hood, which means opening a .pt file from an untrusted source can run arbitrary Python code. This has been the cause of several real-world supply-chain attacks on ML model hubs. deathtensors files are pure data: a fixed binary prefix followed by JSON metadata followed by raw tensor bytes. There is no code path in the reader that calls eval, exec, __reduce__, or any pickle-style reconstruction.
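
To make the risk concrete, the following standard-library sketch shows how pickle runs a caller-chosen function during loading. The side effect here is deliberately harmless (creating a directory in a fresh temp folder); the Payload class and marker path are illustrative only.

```python
import os
import pickle
import tempfile

marker = os.path.join(tempfile.mkdtemp(), "created-by-unpickling")


class Payload:
    def __reduce__(self):
        # Tell pickle to "reconstruct" this object by calling os.mkdir(marker).
        # A malicious file could name any importable callable here instead.
        return (os.mkdir, (marker,))


data = pickle.dumps(Payload())
assert not os.path.exists(marker)  # nothing has run yet

pickle.loads(data)  # merely loading the bytes calls os.mkdir
print(os.path.isdir(marker))
```

A format whose header is plain JSON and whose payload is raw bytes has no equivalent hook, because nothing in the file can name a callable.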

Why not just use safetensors?

safetensors is excellent and we encourage you to use it. deathtensors exists as a separate, independent implementation because:

  1. Format diversity is good for the ecosystem. A single point of failure in any one tensor-storage library would be bad; having two independent libraries with different code paths reduces risk.
  2. deathtensors ships an optional SHA-256 footer for integrity verification, which is useful when files travel through untrusted channels.
  3. deathtensors stores per-tensor string metadata in addition to global metadata, whereas the safetensors format has only a single file-level string-to-string metadata map (__metadata__).
  4. deathtensors supports complex dtypes (C64 and C128) out of the box, in addition to BF16 and unsigned 16/32/64-bit integers, which safetensors also supports.

We do not try to be a drop-in replacement. The Python API is similar in spirit (save, open, keys, get_tensor) but the file format is not compatible — a .dt file is not a .safetensors file and vice versa.

Public API

deathtensors.save(path, tensors, metadata=None, global_metadata=None, checksum=False)
deathtensors.open(path, verify=False)  # context manager
deathtensors.save_file(path, tensors, global_metadata=None, checksum=False)  # lower-level
deathtensors.DtFile(path, verify=False)  # the class returned by open()

f = deathtensors.open("model.dt")
f.keys()                   # list of tensor names
f.info(name)               # dict: dtype, shape, data_offsets, nbytes, metadata
f.metadata()               # global file metadata dict
f.get_bytes(name)          # raw bytes
f.get_buffer(name)         # memoryview of raw bytes
f.get_tensor(name, framework="torch")   # torch.Tensor (default) or numpy.ndarray
f.get_tensors(framework="torch")        # dict of all tensors
f.verify()                 # verify SHA-256 footer (if any); returns bool
f.has_checksum()           # was the file written with checksum=True?
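
As a rough picture of what f.verify() has to check, the sketch below recomputes the footer from the file layout described earlier. It assumes the digest covers exactly the bytes between the 24-byte prefix and the 32-byte footer, and that the caller has already confirmed that flag bit1 is set; verify_footer is a hypothetical name, not part of the deathtensors API.

```python
import hashlib
import hmac

PREFIX_SIZE = 24  # magic + version + flags + header size
DIGEST_SIZE = 32  # SHA-256 output length in bytes


def verify_footer(file_bytes: bytes) -> bool:
    """Recompute SHA-256 over header + padding + data and compare with the footer."""
    if len(file_bytes) < PREFIX_SIZE + DIGEST_SIZE:
        return False
    body = file_bytes[PREFIX_SIZE:-DIGEST_SIZE]
    footer = file_bytes[-DIGEST_SIZE:]
    return hmac.compare_digest(hashlib.sha256(body).digest(), footer)


# Build a stand-in file: placeholder prefix, then header + padding + data.
prefix = b"\x00" * PREFIX_SIZE
body = b'{"tensors":{}}' + b"\x00\x00" + b"tensor-bytes"
good = prefix + body + hashlib.sha256(body).digest()
corrupted = good[:30] + b"X" + good[31:]  # flip one byte inside the header

ok_result = verify_footer(good)
bad_result = verify_footer(corrupted)
```

An unkeyed digest stored inside the file detects accidental corruption; it does not authenticate the file against someone who can rewrite both the data and the footer.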

Testing

pip install "deathtensors[dev]"
pytest tests/

License

MIT, © Death Legion.

Download files

Source Distribution

deathtensors-0.1.0.tar.gz (46.1 kB)

Uploaded Source

Built Distribution

deathtensors-0.1.0-cp312-cp312-manylinux_2_34_x86_64.whl (292.5 kB)

Uploaded CPython 3.12, manylinux: glibc 2.34+ x86-64

File details

Details for the file deathtensors-0.1.0.tar.gz.

File metadata

  • Download URL: deathtensors-0.1.0.tar.gz
  • Size: 46.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for deathtensors-0.1.0.tar.gz
Algorithm    Hash digest
SHA256       3b705b2561ef4db281b43598d7237c9d865102e456f137f00d68a2e717257f33
MD5          adf0827ecd04b8f7804ca3b9c4e7487a
BLAKE2b-256  9d43d15e4838e58ee554d02a2840fcb35590a8ec5ce1d83b798008cdefdcf89f

File details

Details for the file deathtensors-0.1.0-cp312-cp312-manylinux_2_34_x86_64.whl.

File hashes

Hashes for deathtensors-0.1.0-cp312-cp312-manylinux_2_34_x86_64.whl
Algorithm    Hash digest
SHA256       d751e06e3fa491ac6fff93de84604ef38e59cbaf97e73c05b743f0966818aa20
MD5          40aac53eed179bd55f63684096943fae
BLAKE2b-256  958a7140f62ce61de1c2b30d875b8c73655b3f159a42020acfa89f68edfac5ca
