Random Access Archive
.raa files are essentially a dict header followed by the consecutive bytes of the samples. The format was made to facilitate and accelerate deep learning training on large datasets. It's written in Rust and fast, but easily accessible programmatically from Python. Most importantly, it lets you shuffle the data without sacrificing too much sequential-read performance, by shuffling blocks of contiguous data. It also allows for lazy sharding.
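To make the block-shuffling idea concrete, here is a minimal pure-Python sketch. It does not use rand_archive and is not the library's implementation; `block_shuffled_order`, `shard_blocks`, and their parameters are hypothetical names chosen for illustration. It only shows how shuffling whole blocks keeps reads inside a block sequential, and one possible way block ranges could be assigned to shards from sizes alone, without touching the data.

```python
import random


def block_shuffled_order(num_samples, block_size, seed=0):
    """Return sample indices with blocks shuffled but each block kept contiguous."""
    # Split [0, num_samples) into consecutive blocks; the last one may be short.
    blocks = [
        range(start, min(start + block_size, num_samples))
        for start in range(0, num_samples, block_size)
    ]
    # Shuffle only the order of the blocks, so reads inside a block stay sequential.
    random.Random(seed).shuffle(blocks)
    return [i for block in blocks for i in block]


def shard_blocks(num_samples, block_size, rank, world_size):
    """Give every world_size-th block to this rank (hypothetical assignment rule)."""
    starts = range(0, num_samples, block_size)
    return [
        range(start, min(start + block_size, num_samples))
        for k, start in enumerate(starts)
        if k % world_size == rank
    ]


order = block_shuffled_order(num_samples=10, block_size=4, seed=0)
```

A larger block size means longer sequential runs but a coarser shuffle; a block size of 1 degenerates to a fully random order.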
Comparison
The main advantage of this library is how extensible it is. Other libraries such as WebDataset, FFCV, Streaming Dataset, and TFRecord are very batteries-included, which is great for experimentation but heavily sacrifices extensibility, since they also include data processing. Our philosophy is quite simple: you write string-byte pairs, and you read string-byte pairs. We only implement functionality that NEEDS to be implemented at the reader level for optimization, like shuffling and sharding.
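Because the archive only deals in string-byte pairs, serialization is left to the caller. The sketch below illustrates that division of labour under stated assumptions: a plain dict stands in for the archive, and `encode_sample` and `decode_sample` are hypothetical helpers, not part of rand_archive. JSON is just one possible encoding; any format that produces bytes would do.

```python
import json


def encode_sample(sample):
    """Serialize a JSON-compatible sample to bytes (a user-side choice)."""
    return json.dumps(sample, sort_keys=True).encode("utf-8")


def decode_sample(payload):
    """Inverse of encode_sample: turn stored bytes back into a Python object."""
    return json.loads(payload.decode("utf-8"))


# A plain dict stands in for the archive here: string keys mapped to raw bytes.
archive = {}
archive["sample-0"] = encode_sample({"label": 3, "text": "hello"})

# Reading yields the same (key, bytes) pair; decoding is again up to the caller.
key, payload = next(iter(archive.items()))
sample = decode_sample(payload)
```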
Benchmarks:
!todo
Usage
pip install rand-archive
Writing:
from rand_archive import Writer

# Each entry is a string key paired with raw bytes.
with Writer("test.raa") as w:
    w.write("test", b"test")
Reading:
from rand_archive import Reader

for _ in Reader().open_file("dummy.raa").with_shuffling():
    pass