
Efficient RaggedBuffer datatype that implements 3D arrays with variable-length 2nd dimension.


ENN Ragged Buffer


This Python package implements an efficient RaggedBuffer datatype that is similar to a 3D numpy array, but which allows for variable sequence length in the second dimension. It was created primarily for use in enn-trainer and currently only supports a small selection of the numpy array methods.


User Guide

Install the package with pip install ragged-buffer. The package currently supports three RaggedBuffer variants, RaggedBufferF32, RaggedBufferI64, and RaggedBufferBool.

Creating a RaggedBuffer

There are three ways to create a RaggedBuffer:

  • RaggedBufferF32(features: int) creates an empty RaggedBuffer with the specified number of features.
  • RaggedBufferF32.from_flattened(flattened: np.ndarray, lengths: np.ndarray) creates a RaggedBuffer from a flattened 2D numpy array and a 1D numpy array of lengths.
  • RaggedBufferF32.from_array creates a RaggedBuffer (with equal sequence lengths) from a 3D numpy array.

Creating an empty buffer and pushing each row:

import numpy as np
from ragged_buffer import RaggedBufferF32

# Create an empty RaggedBuffer with a feature size of 3
buffer = RaggedBufferF32(3)
# Push sequences with 3, 5, 0, and 1 elements
buffer.push(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.float32))
buffer.push(np.array([[10, 11, 12], [13, 14, 15], [16, 17, 18], [19, 20, 21], [22, 23, 24]], dtype=np.float32))
buffer.push(np.array([], dtype=np.float32))  # Alternative: `buffer.push_empty()`
buffer.push(np.array([[25, 25, 27]], dtype=np.float32))

Creating a RaggedBuffer from a flat 2D numpy array which combines the first and second dimension, and an array of sequence lengths:

import numpy as np
from ragged_buffer import RaggedBufferF32

buffer = RaggedBufferF32.from_flattened(
    np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18], [19, 20, 21], [22, 23, 24], [25, 25, 27]], dtype=np.float32),
    np.array([3, 5, 0, 1], dtype=np.int64)
)
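
The flattened layout can be pictured with plain numpy. The following is only a sketch of the idea, not the library's implementation: `sequence_slices` is a hypothetical helper that derives each sequence's row range from the lengths array with a cumulative sum.

```python
import numpy as np


def sequence_slices(lengths: np.ndarray) -> list:
    """Return one slice per sequence into the flattened row dimension."""
    ends = np.cumsum(lengths)   # exclusive end row of each sequence
    starts = ends - lengths     # first row of each sequence
    return [slice(int(s), int(e)) for s, e in zip(starts, ends)]


flat = np.arange(27, dtype=np.float32).reshape(9, 3)
lengths = np.array([3, 5, 0, 1], dtype=np.int64)
slices = sequence_slices(lengths)

# The lengths must account for every row of the flattened array.
assert int(lengths.sum()) == flat.shape[0]
# The third sequence is empty, the fourth holds the final row.
assert flat[slices[2]].shape == (0, 3)
assert flat[slices[3]].shape == (1, 3)
```

A zero-length sequence simply yields an empty slice, which is why it needs no rows in the flattened array.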

Creating a RaggedBuffer from a 3D numpy array (all sequences have the same length):

import numpy as np
from ragged_buffer import RaggedBufferF32

buffer = RaggedBufferF32.from_array(np.zeros((4, 5, 3), dtype=np.float32))
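
For intuition, a dense 3D array corresponds to the flattened form with all lengths equal. The numpy-only sketch below illustrates that correspondence; it is an assumption about the conceptual layout, not a description of what from_array does internally.

```python
import numpy as np

dense = np.zeros((4, 5, 3), dtype=np.float32)
num_sequences, sequence_length, features = dense.shape

# Merge the first two dimensions; every sequence has the same length.
flat = dense.reshape(num_sequences * sequence_length, features)
lengths = np.full(num_sequences, sequence_length, dtype=np.int64)

assert flat.shape == (20, 3)
assert lengths.tolist() == [5, 5, 5, 5]
```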

Get size

The size0 and size2 methods return the number of sequences and the number of features, respectively. The size1 method takes a sequence index and returns the number of elements in that sequence.

import numpy as np
from ragged_buffer import RaggedBufferF32

buffer = RaggedBufferF32.from_flattened(
    np.zeros((9, 64), dtype=np.float32),
    np.array([3, 5, 0, 1], dtype=np.int64)
)

# Get size of the first/batch dimension.
assert buffer.size0() == 4
# Get size of individual sequences.
assert buffer.size1(1) == 5
assert buffer.size1(2) == 0
# Get size of the last/feature dimension.
assert buffer.size2() == 64

Convert to numpy array

as_array converts a RaggedBuffer to a flat 2D numpy array that combines the first and second dimension.

import numpy as np
from ragged_buffer import RaggedBufferI64

buffer = RaggedBufferI64(1)
buffer.push(np.array([[1], [1], [1]], dtype=np.int64))
buffer.push(np.array([[2], [2]], dtype=np.int64))
assert np.all(buffer.as_array() == np.array([[1], [1], [1], [2], [2]], dtype=np.int64))
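
Because the flat array no longer carries the sequence boundaries, recovering the individual sequences requires the lengths. The numpy-only sketch below assumes you have kept track of the lengths yourself (here `[3, 2]`, matching the two pushes above) and uses `np.split` to cut the rows apart.

```python
import numpy as np

# Flat array in the shape produced above, plus the lengths of the pushed sequences.
flat = np.array([[1], [1], [1], [2], [2]], dtype=np.int64)
lengths = np.array([3, 2], dtype=np.int64)

# np.split expects the row indices at which to cut, so drop the final cumulative sum.
sequences = np.split(flat, np.cumsum(lengths)[:-1])

assert len(sequences) == 2
assert sequences[0].shape == (3, 1)
assert sequences[1].shape == (2, 1)
```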

Indexing

You can index a RaggedBuffer with a single integer (returning a RaggedBuffer with a single sequence), or with a numpy array of integers selecting/permuting multiple sequences.

import numpy as np
from ragged_buffer import RaggedBufferF32

# Create a new `RaggedBufferF32`
buffer = RaggedBufferF32.from_flattened(
    np.arange(0, 36, dtype=np.float32).reshape(9, 4),
    np.array([3, 5, 0, 1], dtype=np.int64)
)

# Retrieve the first sequence.
assert np.all(
    buffer[0].as_array() ==
    np.array([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], dtype=np.float32)
)

# Get a RaggedBuffer with 2 randomly selected sequences.
buffer[np.random.permutation(4)[:2]]
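
To illustrate what selecting or permuting sequences means for the underlying data, here is a numpy-only sketch operating on a flattened array and its lengths. `select_sequences` is a hypothetical helper written for this illustration; it is not part of the package and may differ from how the library implements indexing.

```python
import numpy as np


def select_sequences(flat, lengths, indices):
    """Gather the rows of the chosen sequences, in the order given by `indices`."""
    ends = np.cumsum(lengths)
    starts = ends - lengths
    rows = [np.arange(starts[i], ends[i]) for i in indices]
    row_index = np.concatenate(rows) if rows else np.empty(0, dtype=np.int64)
    return flat[row_index], lengths[indices]


flat = np.arange(36, dtype=np.float32).reshape(9, 4)
lengths = np.array([3, 5, 0, 1], dtype=np.int64)

# Pick the last sequence, then the first one.
picked, picked_lengths = select_sequences(flat, lengths, np.array([3, 0]))
assert picked_lengths.tolist() == [1, 3]
assert picked[:, 0].tolist() == [32.0, 0.0, 4.0, 8.0]
```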

Addition

You can add two RaggedBuffers with the + operator if they have the same number of sequences, sequence lengths, and features. You can also add a RaggedBuffer where all sequences have a length of 1 to a RaggedBuffer with variable length sequences, broadcasting along each sequence.

import numpy as np
from ragged_buffer import RaggedBufferI64

# Create ragged buffer with dimensions (3, [1, 3, 2], 1)
rb3 = RaggedBufferI64(1)
rb3.push(np.array([[0]], dtype=np.int64))
rb3.push(np.array([[0], [1], [2]], dtype=np.int64))
rb3.push(np.array([[0], [5]], dtype=np.int64))

# Create ragged buffer with dimensions (3, [1, 1, 1], 1)
rb4 = RaggedBufferI64.from_array(np.array([0, 3, 10], dtype=np.int64).reshape(3, 1, 1))

# Add rb3 and rb4, broadcasting along the sequence dimension.
rb5 = rb3 + rb4
assert np.all(
    rb5.as_array() == np.array([[0], [3], [4], [5], [10], [15]], dtype=np.int64)
)
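
The broadcasting rule can be reproduced on flattened data with plain numpy, which may help when reasoning about the expected result. This sketch is an illustration of the semantics described above, not the library's implementation: each row of the length-1 operand is repeated once per element of the matching sequence before adding.

```python
import numpy as np

# Flattened rows and sequence lengths of the variable-length operand.
ragged_flat = np.array([[0], [0], [1], [2], [0], [5]], dtype=np.int64)
lengths = np.array([1, 3, 2], dtype=np.int64)
# One row per sequence for the length-1 operand.
per_sequence = np.array([[0], [3], [10]], dtype=np.int64)

# Repeat each per-sequence row once for every element of its sequence, then add.
summed = ragged_flat + np.repeat(per_sequence, lengths, axis=0)

assert summed.tolist() == [[0], [3], [4], [5], [10], [15]]
```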

Concatenation

The extend method can be used to mutate a RaggedBuffer by appending another RaggedBuffer to it.

import numpy as np
from ragged_buffer import RaggedBufferF32


rb1 = RaggedBufferF32.from_array(np.zeros((4, 5, 3), dtype=np.float32))
rb2 = RaggedBufferF32.from_array(np.zeros((2, 5, 3), dtype=np.float32))
rb1.extend(rb2)
assert rb1.size0() == 6
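
In terms of the flattened representation, appending one ragged collection to another amounts to stacking both the rows and the lengths in the same order. The numpy-only sketch below shows that bookkeeping; it is a conceptual illustration and makes no claim about how extend manages memory internally.

```python
import numpy as np

# Two ragged collections in flattened form: rows plus per-sequence lengths.
flat_a, lengths_a = np.zeros((20, 3), dtype=np.float32), np.full(4, 5, dtype=np.int64)
flat_b, lengths_b = np.ones((10, 3), dtype=np.float32), np.full(2, 5, dtype=np.int64)

# Appending means stacking the rows and the lengths in the same order.
flat = np.concatenate([flat_a, flat_b], axis=0)
lengths = np.concatenate([lengths_a, lengths_b])

assert len(lengths) == 6
assert flat.shape == (30, 3)
assert int(lengths.sum()) == flat.shape[0]
```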

Clear

The clear method removes all elements from a RaggedBuffer without deallocating the underlying memory.

import numpy as np
from ragged_buffer import RaggedBufferF32

rb = RaggedBufferF32.from_array(np.zeros((4, 5, 3), dtype=np.float32))
rb.clear()
assert rb.size0() == 0

License

ENN Ragged Buffer is dual-licensed under Apache-2.0 and MIT.

