Efficient RaggedBuffer datatype that implements 3D arrays with variable-length 2nd dimension.
ENN Ragged Buffer
This Python package implements an efficient RaggedBuffer datatype that is similar to a 3D numpy array, but which allows for variable sequence length in the second dimension. It was created primarily for use in enn-trainer and currently only supports a small selection of the numpy array methods.
User Guide
Install the package with pip install ragged-buffer.
The package currently supports three RaggedBuffer variants: RaggedBufferF32, RaggedBufferI64, and RaggedBufferBool.
Creating a RaggedBuffer
There are three ways to create a RaggedBuffer:
- RaggedBufferF32(features: int) creates an empty RaggedBuffer with the specified number of features.
- RaggedBufferF32.from_flattened(flattened: np.ndarray, lengths: np.ndarray) creates a RaggedBuffer from a flattened 2D numpy array and a 1D numpy array of lengths.
- RaggedBufferF32.from_array creates a RaggedBuffer (with equal sequence lengths) from a 3D numpy array.
Creating an empty buffer and pushing each row:
import numpy as np
from ragged_buffer import RaggedBufferF32
# Create an empty RaggedBuffer with a feature size of 3
buffer = RaggedBufferF32(3)
# Push sequences with 3, 5, 0, and 1 elements
buffer.push(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32))
buffer.push(np.array([[10, 11, 12], [13, 14, 15], [16, 17, 18], [19, 20, 21], [22, 23, 24]], dtype=np.float32))
buffer.push(np.array([], dtype=np.float32)) # Alternative: `buffer.push_empty()`
buffer.push(np.array([[25, 25, 27]], dtype=np.float32))
Creating a RaggedBuffer from a flat 2D numpy array which combines the first and second dimension, and an array of sequence lengths:
import numpy as np
from ragged_buffer import RaggedBufferF32
buffer = RaggedBufferF32.from_flattened(
    np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18], [19, 20, 21], [22, 23, 24], [25, 25, 27]], dtype=np.float32),
    np.array([3, 5, 0, 1], dtype=np.int64),
)
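The relationship between the flat 2D array and the lengths array can be sketched in plain Python. This is only a conceptual model of the layout, not the package's implementation; `split_sequences` is a hypothetical helper introduced for illustration, and it assumes the lengths must sum to the number of flattened rows.

```python
from itertools import accumulate


def split_sequences(flat_rows, lengths):
    """Split a flat list of feature rows into one list of rows per sequence.

    Hypothetical helper for illustration; it is not part of ragged_buffer.
    """
    if sum(lengths) != len(flat_rows):
        raise ValueError("lengths must sum to the number of flattened rows")
    # Running totals of the lengths give the start and end row of each sequence.
    offsets = [0, *accumulate(lengths)]
    return [flat_rows[start:end] for start, end in zip(offsets, offsets[1:])]


flat_rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
# Three sequences with 1, 0, and 3 rows respectively.
sequences = split_sequences(flat_rows, [1, 0, 3])
```

A length of 0 simply yields an empty sequence, which is why empty sequences need no rows in the flat array.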
Creating a RaggedBuffer from a 3D numpy array (all sequences have the same length):
import numpy as np
from ragged_buffer import RaggedBufferF32
buffer = RaggedBufferF32.from_array(np.zeros((4, 5, 3), dtype=np.float32))
Get size
The size0, size1, and size2 methods return the number of sequences, the number of elements in a sequence, and the number of features, respectively.
import numpy as np
from ragged_buffer import RaggedBufferF32
buffer = RaggedBufferF32.from_flattened(
    np.zeros((9, 64), dtype=np.float32),
    np.array([3, 5, 0, 1], dtype=np.int64),
)
# Get size of the first/batch dimension (the number of sequences).
assert buffer.size0() == 4
# Get size of individual sequences.
assert buffer.size1(1) == 5
assert buffer.size1(2) == 0
# Get size of the last/feature dimension.
assert buffer.size2() == 64
Convert to numpy array
as_array converts a RaggedBuffer to a flat 2D numpy array that combines the first and second dimension.
import numpy as np
from ragged_buffer import RaggedBufferI64
buffer = RaggedBufferI64(1)
buffer.push(np.array([[1], [1], [1]], dtype=np.int64))
buffer.push(np.array([[2], [2]], dtype=np.int64))
assert np.all(buffer.as_array() == np.array([[1], [1], [1], [2], [2]], dtype=np.int64))
Indexing
You can index a RaggedBuffer with a single integer (returning a RaggedBuffer with a single sequence), or with a numpy array of integers selecting/permuting multiple sequences.
import numpy as np
from ragged_buffer import RaggedBufferF32
# Create a new `RaggedBufferF32`
buffer = RaggedBufferF32.from_flattened(
    # 9 rows of 4 features, matching the total of the sequence lengths below.
    np.arange(0, 36, dtype=np.float32).reshape(9, 4),
    np.array([3, 5, 0, 1], dtype=np.int64)
)
# Retrieve the first sequence.
assert np.all(
buffer[0].as_array() ==
np.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], dtype=np.float32)
)
# Get a RaggedBuffer with 2 randomly selected sequences.
buffer[np.random.permutation(4)[:2]]
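What selecting or permuting sequences means for the flattened layout can be modeled in plain Python. This is a sketch of the semantics described above, not the package's code; `gather_sequences` is a hypothetical name, and the model assumes the result contains the chosen sequences in the order given by the indices.

```python
from itertools import accumulate


def gather_sequences(flat_rows, lengths, indices):
    """Return the rows and lengths of the selected sequences, in index order.

    Hypothetical helper for illustration; it is not part of ragged_buffer.
    """
    # offsets[i] and offsets[i + 1] bound the rows of sequence i.
    offsets = [0, *accumulate(lengths)]
    new_rows, new_lengths = [], []
    for i in indices:
        new_rows.extend(flat_rows[offsets[i]:offsets[i + 1]])
        new_lengths.append(lengths[i])
    return new_rows, new_lengths


flat_rows = [[0], [1], [2], [3], [4], [5]]
lengths = [3, 0, 1, 2]
# Select the last sequence followed by the first one.
rows, lens = gather_sequences(flat_rows, lengths, [3, 0])
```

Here `rows` holds the two rows of sequence 3 followed by the three rows of sequence 0, and `lens` is `[2, 3]`.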
Addition
You can add two RaggedBuffers with the + operator if they have the same number of sequences, sequence lengths, and features. You can also add a RaggedBuffer where all sequences have a length of 1 to a RaggedBuffer with variable length sequences, broadcasting along each sequence.
import numpy as np
from ragged_buffer import RaggedBufferI64
# Create ragged buffer with dimensions (3, [1, 3, 2], 1)
rb3 = RaggedBufferI64(1)
rb3.push(np.array([[0]], dtype=np.int64))
rb3.push(np.array([[0], [1], [2]], dtype=np.int64))
rb3.push(np.array([[0], [5]], dtype=np.int64))
# Create ragged buffer with dimensions (3, [1, 1, 1], 1)
rb4 = RaggedBufferI64.from_array(np.array([0, 3, 10], dtype=np.int64).reshape(3, 1, 1))
# Add rb3 and rb4, broadcasting along the sequence dimension.
rb5 = rb3 + rb4
assert np.all(
rb5.as_array() == np.array([[0], [3], [4], [5], [10], [15]], dtype=np.int64)
)
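The broadcasting rule can be spelled out in plain Python on the flattened representation: the single row of each length-1 sequence is added to every row of the matching variable-length sequence. This is an illustrative model under that assumption, not the package's implementation, and `broadcast_add` is a hypothetical name.

```python
def broadcast_add(flat_rows, lengths, per_sequence_rows):
    """Add one row per sequence to every row of the corresponding sequence.

    Hypothetical helper for illustration; it is not part of ragged_buffer.
    """
    if len(per_sequence_rows) != len(lengths):
        raise ValueError("need exactly one addend row per sequence")
    result = []
    row_iter = iter(flat_rows)
    for length, addend in zip(lengths, per_sequence_rows):
        # Reuse the same addend row for each element of this sequence.
        for _ in range(length):
            row = next(row_iter)
            result.append([x + a for x, a in zip(row, addend)])
    return result


# Same data as rb3 and rb4 above, written as flat rows plus lengths.
rb3_rows = [[0], [0], [1], [2], [0], [5]]
rb3_lengths = [1, 3, 2]
rb4_rows = [[0], [3], [10]]
summed = broadcast_add(rb3_rows, rb3_lengths, rb4_rows)
```

The value of `summed` matches the flat array asserted for rb5 in the example above.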
Concatenation
The extend method can be used to mutate a RaggedBuffer by appending another RaggedBuffer to it.
import numpy as np
from ragged_buffer import RaggedBufferF32
rb1 = RaggedBufferF32.from_array(np.zeros((4, 5, 3), dtype=np.float32))
rb2 = RaggedBufferF32.from_array(np.zeros((2, 5, 3), dtype=np.float32))
rb1.extend(rb2)
assert rb1.size0() == 6
Clear
The clear method removes all elements from a RaggedBuffer without deallocating the underlying memory.
import numpy as np
from ragged_buffer import RaggedBufferF32
rb = RaggedBufferF32.from_array(np.zeros((4, 5, 3), dtype=np.float32))
rb.clear()
assert rb.size0() == 0
License
ENN Ragged Buffer is dual-licensed under Apache-2.0 and MIT.