Functions to manipulate batches of PyTorch tensors
batchtensor
Overview
batchtensor is a lightweight library built on top of PyTorch to manipulate
nested data structures containing PyTorch tensors.
This library provides functions for tensors where the first dimension is the batch dimension.
It also provides functions for tensors representing a batch of sequences where the first dimension
is the batch dimension and the second dimension is the sequence dimension.
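As a rough illustration of these two layout conventions, the sketch below uses plain PyTorch only (no batchtensor calls). The tensor, its shape and the variable names are made up for the example.

```python
import torch

# Hypothetical batch of 4 sequences, each with 3 steps and 2 features.
# Layout assumed here: (batch, sequence, feature).
x = torch.arange(24).reshape(4, 3, 2)

# Slicing along the batch dimension (dim 0) keeps the first 2 examples.
first_two_examples = x[:2]

# Slicing along the sequence dimension (dim 1) keeps the first 2 steps
# of every example in the batch.
first_two_steps = x[:, :2]

print(first_two_examples.shape)  # torch.Size([2, 3, 2])
print(first_two_steps.shape)  # torch.Size([4, 2, 2])
```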
Motivation
Let's imagine you have a batch which is represented by a dictionary with three tensors, and you want
to take the first 2 items.
batchtensor provides the function slice_along_batch,
which slices all the tensors at once:
>>> import torch
>>> from batchtensor.nested import slice_along_batch
>>> batch = {
... "a": torch.tensor([[2, 6], [0, 3], [4, 9], [8, 1], [5, 7]]),
... "b": torch.tensor([4, 3, 2, 1, 0]),
... "c": torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0]),
... }
>>> slice_along_batch(batch, stop=2)
{'a': tensor([[2, 6], [0, 3]]), 'b': tensor([4, 3]), 'c': tensor([1., 2.])}
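To make the idea concrete, here is a minimal sketch of the same operation written by hand with plain PyTorch. The helper slice_nested is hypothetical and is not part of batchtensor; it only assumes that every tensor uses its first dimension as the batch dimension, and it is not meant to reflect how the library is implemented.

```python
import torch


def slice_nested(data, stop):
    """Apply ``tensor[:stop]`` to every tensor in a nested dict/list/tuple.

    Hypothetical helper written for illustration; not a batchtensor function.
    """
    if isinstance(data, torch.Tensor):
        return data[:stop]
    if isinstance(data, dict):
        return {key: slice_nested(value, stop) for key, value in data.items()}
    if isinstance(data, (list, tuple)):
        return type(data)(slice_nested(item, stop) for item in data)
    return data  # leave non-tensor leaves untouched


batch = {
    "a": torch.tensor([[2, 6], [0, 3], [4, 9], [8, 1], [5, 7]]),
    "b": torch.tensor([4, 3, 2, 1, 0]),
    "c": torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0]),
}
out = slice_nested(batch, stop=2)
```

Writing this recursion once per operation (slice, split, index, and so on) is the kind of repetitive code a dedicated library helps avoid.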
Similarly, a batch can be split into multiple batches with the
function split_along_batch:
>>> import torch
>>> from batchtensor.nested import split_along_batch
>>> batch = {
... "a": torch.tensor([[2, 6], [0, 3], [4, 9], [8, 1], [5, 7]]),
... "b": torch.tensor([4, 3, 2, 1, 0]),
... "c": torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0]),
... }
>>> split_along_batch(batch, split_size_or_sections=2)
({'a': tensor([[2, 6], [0, 3]]), 'b': tensor([4, 3]), 'c': tensor([1., 2.])},
{'a': tensor([[4, 9], [8, 1]]), 'b': tensor([2, 1]), 'c': tensor([3., 4.])},
{'a': tensor([[5, 7]]), 'b': tensor([0]), 'c': tensor([5.])})
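For comparison, a hand-written equivalent for a flat dictionary could look like the sketch below. The helper split_dict is hypothetical and is not part of batchtensor; it relies on torch.split and assumes that all tensors share the same batch size.

```python
import torch


def split_dict(batch, split_size):
    """Split every tensor along dim 0, then regroup the chunks by position.

    Hypothetical helper written for illustration; not a batchtensor function.
    Assumes all tensors in ``batch`` have the same size along dim 0.
    """
    keys = list(batch)
    # One tuple of chunks per key, all tuples having the same length.
    chunks = [torch.split(batch[key], split_size, dim=0) for key in keys]
    # zip(*chunks) groups the i-th chunk of every key together.
    return tuple(dict(zip(keys, group)) for group in zip(*chunks))


batch = {
    "a": torch.tensor([[2, 6], [0, 3], [4, 9], [8, 1], [5, 7]]),
    "b": torch.tensor([4, 3, 2, 1, 0]),
    "c": torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0]),
}
parts = split_dict(batch, split_size=2)
```

With 5 items and a split size of 2, the last batch holds the single remaining item.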
Please check the documentation to see all the implemented functions.
Documentation
- latest (stable): documentation from the latest stable release.
- main (unstable): documentation associated with the main branch of the repo. This documentation may contain many work-in-progress, outdated or missing parts.
Installation
We highly recommend installing batchtensor in a virtual environment.
batchtensor can be installed from pip using the following command:
pip install batchtensor
To make the package as slim as possible, only the minimal packages required to use batchtensor are installed.
To include all the dependencies, you can use the following command:
pip install batchtensor[all]
Please check the get started page to see how
to install only some specific dependencies or other alternatives to install the library.
The following table lists the batchtensor versions and their tested dependencies.
batchtensor | coola | torch | python |
---|---|---|---|
main | >=0.1,<0.4 | >=1.11,<3.0 | >=3.9,<3.13 |
0.0.1 | >=0.1,<0.4 | >=1.11,<3.0 | >=3.9,<3.13 |
* indicates an optional dependency
Contributing
Please check the instructions in CONTRIBUTING.md.
Suggestions and Communication
Everyone is welcome to contribute to the community. If you have any questions or suggestions, you can submit GitHub Issues. We will reply as soon as possible. Thank you very much.
API stability
:warning: While batchtensor is in the development stage, no API is guaranteed to be stable from one
release to the next.
In fact, it is very likely that the API will change multiple times before a stable 1.0.0 release.
In practice, this means that upgrading batchtensor to a new version may break any code
that was using the old version of batchtensor.
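One way to guard against such breakage is to pin an exact version and check it at start-up. The sketch below is only an illustration using the standard library; the helper check_pinned and the constant EXPECTED_VERSION are hypothetical names, and 0.0.1 is used simply because it is the release mentioned in this README.

```python
from importlib.metadata import PackageNotFoundError, version

# Hypothetical pin chosen for illustration.
EXPECTED_VERSION = "0.0.1"


def check_pinned(package: str, expected: str) -> bool:
    """Return True only if ``package`` is installed at exactly ``expected``."""
    try:
        return version(package) == expected
    except PackageNotFoundError:
        # The package is not installed in the current environment.
        return False


if not check_pinned("batchtensor", EXPECTED_VERSION):
    print(f"warning: batchtensor=={EXPECTED_VERSION} is not installed")
```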
License
batchtensor is licensed under the BSD 3-Clause "New" or "Revised" license, available
in the LICENSE file.
Hashes for batchtensor-0.0.1-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | f9b436361fee0c6debe8b6be0bbb300b0367b3c3f16f66dc2fda64f4215d101a |
MD5 | 451cc9cae47ddecb4a3895924ad2451b |
BLAKE2b-256 | 3a4979766b775bbabddaad73df9a8369afe04cc5227875c94879e65328790cfb |