pychunkbuffers
An open-source Python library for writing large amounts of data to buffers via chunks.
Description
This repository contains the source code for the pychunkbuffers library. I came up with the idea for this library while working on my other project, AdityaIyer2k7/image-file-hider. In that project, I often had to write large amounts of data (hundreds of megabytes) to lists and buffers. Doing this byte by byte took a long time, so I came up with the solution of chunking.
Suppose we have a for loop that has to run 10^8 times, and each iteration adds a value to a list. In a chunked implementation, you would pre-define this list like this:
[0]*10**8
and then create a function that walks from index a to index b and updates each value of the list in that range, like this:
def func(startidx, endidx):
    for i in range(startidx, endidx):
        LIST[i] = SOMEVALUE
However, if we run func(0, 10**8), we are still running 10^8 iterations in sequence. Instead, we can run parts like func(0, 10000), func(10000, 20000) and so on simultaneously on threads. With this library, we can simply use the line
run_chunked(func, 10000, 0, 10**8) # Where 10000 is our chunk size, while 0 and 10**8 are our bounds
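To make the idea concrete, here is a minimal sketch of how a function of this kind could split a range into chunks and start one thread per chunk. It is not the library's actual source: the name run_chunked_sketch and its internals are assumptions for illustration, built only on Python's standard threading module.

```python
import threading

def run_chunked_sketch(func, chunksize, start, end):
    """Hypothetical stand-in for run_chunked; not the library's actual code."""
    # Split [start, end) into consecutive (lo, hi) ranges of at most chunksize items.
    bounds = [(lo, min(lo + chunksize, end)) for lo in range(start, end, chunksize)]
    status = [False] * len(bounds)  # one completion flag per chunk

    def worker(idx, lo, hi):
        func(lo, hi)        # process this chunk
        status[idx] = True  # mark the chunk as finished

    for idx, (lo, hi) in enumerate(bounds):
        threading.Thread(target=worker, args=(idx, lo, hi)).start()
    return status


# Small demonstration: fill a list of 1,000 items in chunks of 100.
data = [0] * 1000

def fill(startidx, endidx):
    for i in range(startidx, endidx):
        data[i] = i * 2

status = run_chunked_sketch(fill, 100, 0, len(data))
while not all(status):
    pass
print(data[:5])
```

In this sketch the last chunk is clipped with min(), so the bounds do not have to be an exact multiple of the chunk size. Whether the library handles that case the same way is not stated here.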
Now, we would like to check when all chunks have completed their tasks. The library implements this using a completion status list. The run_chunked function returns a list of boolean values which are all False when the chunks start. Whenever a chunk finishes its task, that specific chunk's status is set to True in the list. If we want to wait for all the chunks to finish, we can use a line like this:
while not all(STATUS): pass
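The one-line loop above checks the status list as fast as it can. If that is a concern, a short sleep between checks is one possible variation. The sketch below is only an illustration: wait_for_chunks and poll_interval are hypothetical names, not part of the library, and the status list is a hand-made stand-in for the one run_chunked returns.

```python
import threading
import time

def wait_for_chunks(status, poll_interval=0.01):
    """Block until every flag in status is True, sleeping between checks."""
    while not all(status):
        time.sleep(poll_interval)

# Demonstration with a hand-made status list standing in for the one
# returned by run_chunked: a timer thread flips each flag after a short delay.
status = [False, False, False]

def finish(idx):
    status[idx] = True

for idx in range(len(status)):
    threading.Timer(0.05 * (idx + 1), finish, args=(idx,)).start()

wait_for_chunks(status)
print(all(status))
```

The polling interval is a trade-off: a larger value wastes less CPU time on checking, while a smaller one notices completion sooner.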
Example implementation:
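# Task: write the squares of the numbers 1 to 10**8 (inclusive)
# Assumes run_chunked has already been imported from the pychunkbuffers library
squares = [0]*10**8
CHUNKSIZE = 10**5

def func(startidx, endidx):
    # Fill one chunk: index i holds the square of i+1
    for i in range(startidx, endidx):
        squares[i] = (i+1)**2

status = run_chunked(func, CHUNKSIZE, 0, len(squares))
while not all(status): pass  # wait until every chunk reports completion
print("Done")
print(squares[:100])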