
Pure-Python safetensors


pure_safetensors

A safetensors library written in pure, clean Python. Run it on PyPy, IronPython, or wherever.

Dependencies

We try to keep dependencies light:

  • attrs dataclass library (2881 LoC)
  • marshmallow serialization and validation library (2647 LoC)
  • sortedcollections tiny sorted collections library (339 LoC) built on top of sortedcontainers (1493 LoC)
  • (optional) sparsefile sparse file library (191 LoC)
  • (optional) fickle whitelist-based firewall for safe pickle loading, used for PyTorch model conversion (926 LoC)

Optionally, this library integrates with NumPy (if available). PyTorch integration is planned, someday.
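Working without NumPy is feasible because the safetensors container is simple: an 8-byte little-endian header length, a JSON header, then raw tensor bytes. The sketch below reads that header with only the standard library. It is not this library's API; read_header and write_example are hypothetical helpers written for illustration.

```python
import json
import struct


def read_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: header length, unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON mapping each tensor name to
        # {"dtype": ..., "shape": [...], "data_offsets": [begin, end]}.
        return json.loads(f.read(header_len).decode("utf-8"))


def write_example(path):
    """Write a tiny one-tensor file so the reader has something to parse."""
    data = struct.pack("<2f", 1.0, 2.0)  # two float32 values, 8 bytes
    header = {
        "hello": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]}
    }
    blob = json.dumps(header, separators=(",", ":")).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(blob)))  # length prefix
        f.write(blob)                          # JSON header
        f.write(data)                          # raw tensor bytes
```

The data_offsets values are relative to the start of the byte buffer that follows the header, not to the start of the file.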

To run the tests, you'll need pytest, numpy, and optionally hypothesis.

Examples

from pure_safetensors import SafeTensors

with SafeTensors("/path/to/example.safetensors", "r+") as sf:
    arrays = sf.as_numpy()
    arrays["hello"][3, :] += 420.69
    arrays["world"] = arrays["hello"][0:2] * 10

    # assign multiple arrays in one call! much faster than one at a time!
    # (my_array_1..3 stand in for your own NumPy arrays)
    arrays.update(
        {
            "q": my_array_1,
            "k": my_array_2,
            "v": my_array_3,
        }
    )

    # delete arrays! such wonders!
    del arrays["v"]

Conversion

Do you have an existing PyTorch checkpoint model that you would like to convert to safetensors? Then try running:

python3 -m pure_safetensors import-pytorch /path/to/model.ckpt /path/to/model.safetensors

You may also be able to convert the model in-place without making an additional copy! Use this facility at your own risk, or first make a zero-cost copy-on-write backup of your checkpoint file using cp --reflink=always ... if your filesystem supports it.

# convert it
python3 -m pure_safetensors import-pytorch-inplace /data/model.ckpt

# rename
mv /data/model.ckpt /data/model.safetensors
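
The backup-then-convert sequence described above could be scripted roughly as follows. This is only a sketch: convert_inplace_with_backup and the backup path are hypothetical names, not part of pure_safetensors, and it assumes GNU cp is on the PATH. It uses the current interpreter in place of python3.

```python
import subprocess
import sys


def convert_inplace_with_backup(ckpt_path, backup_path, run=subprocess.run):
    """Make a reflink backup, then convert the checkpoint in place."""
    # Copy-on-write backup. cp exits non-zero if the filesystem lacks
    # reflink support, and check=True then raises before the checkpoint
    # is touched.
    run(["cp", "--reflink=always", ckpt_path, backup_path], check=True)
    # The in-place conversion command from this README.
    run(
        [sys.executable, "-m", "pure_safetensors",
         "import-pytorch-inplace", ckpt_path],
        check=True,
    )
```

The run parameter exists only so the command sequence can be inspected or dry-run without touching any files.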

Bugs

The space allocator is a greedy algorithm based on first-fit-decreasing bin packing, so if you add tensors to or remove tensors from an existing file, it may leave excess empty space behind.
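
To see how a first-fit-decreasing strategy can strand space, here is a minimal sketch. It is not the library's actual allocator; the function name, the gap representation, and the example sizes are all assumptions made for illustration.

```python
def first_fit_decreasing(sizes, gaps, end_of_file):
    """Place each block in the first gap that fits, largest blocks first.

    sizes: dict of name -> byte size
    gaps: list of (offset, length) free regions, in file order
    end_of_file: offset at which new space can be appended
    Returns (placements, remaining_gaps, new_end_of_file).
    """
    gaps = [list(g) for g in gaps]  # mutable copies
    placements = {}
    # Largest first; ties broken by name so the result is deterministic.
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        for gap in gaps:
            if gap[1] >= size:
                placements[name] = gap[0]
                gap[0] += size  # shrink the gap from the front
                gap[1] -= size
                break
        else:
            # No gap is large enough: grow the file instead.
            placements[name] = end_of_file
            end_of_file += size
    remaining = [tuple(g) for g in gaps if g[1] > 0]
    return placements, remaining, end_of_file


placements, remaining, eof = first_fit_decreasing(
    {"q": 100, "k": 40, "v": 60},
    gaps=[(100, 50), (300, 120)],
    end_of_file=1000,
)
# placements == {"q": 300, "v": 1000, "k": 100}
# remaining  == [(140, 10), (400, 20)]
# eof        == 1060
```

In this run the file grows by 60 bytes for "v" while 30 bytes stay stranded in two gaps that are each too small to hold it.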

PyTorch integration (as opposed to checkpoint conversion) isn't implemented yet.

Alternatives

Compared: pure_safetensors, safetensors, pure_torch.py, safetensors.cpp

  • Written in pure Python?
  • Supports NumPy (without PyTorch)?
  • Supports PyTorch?
  • Can work without numpy or pytorch?
  • Can write safetensors files?
  • Can modify file in-place to add/remove tensors?
  • Has test suite?
  • Stable API? 🤷
  • Automatically makes files sparse to save space?
  • Works on platforms without mmap?

Download files


Source Distribution

pure_safetensors-0.3.2.tar.gz (19.4 kB)


Built Distribution

pure_safetensors-0.3.2-py3-none-any.whl (18.0 kB)

