Cloud implementation of array for Big Data
Cloud Array
cloud-array is an open-source Python library for storing and streaming large NumPy arrays on local file systems and on the CDNs of major cloud providers. It automatically splits a large array into chunks of an arbitrary, user-chosen shape and uploads them to the target directory.
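To make the chunking idea concrete, here is a minimal sketch of how an array can be cut into a regular grid of chunks. It is an illustration only, not cloud-array's actual implementation; the helper name `chunk_slices` is hypothetical, and edge chunks are assumed to be simply truncated when the shape is not a multiple of the chunk shape.

```python
import itertools
import math

import numpy as np


def chunk_slices(shape, chunk_shape):
    """Yield (grid_index, slices) for every chunk of a regular chunk grid.

    Hypothetical helper for illustration; not part of cloud-array.
    """
    # Number of chunks along each axis, rounding up for a partial last chunk
    counts = [math.ceil(s / c) for s, c in zip(shape, chunk_shape)]
    for index in itertools.product(*(range(n) for n in counts)):
        slices = tuple(
            slice(i * c, min((i + 1) * c, s))
            for i, c, s in zip(index, chunk_shape, shape)
        )
        yield index, slices


# Small demo: a 5x4 array cut into chunks of at most 2x3 elements
data = np.arange(5 * 4).reshape(5, 4)
chunks = {idx: data[sl] for idx, sl in chunk_slices(data.shape, (2, 3))}
```

Each entry in `chunks` could then be written out as a separate file or object keyed by its grid index, which is the general pattern chunked array stores tend to follow.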
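import numpy as np
from cloud_array import CloudArray

# Shape of the full array and of each chunk it is split into
shape = (10000, 100, 100)
chunk_shape = (10, 10, 10)

# Disk-backed array, so the data does not have to fit in memory
f = np.memmap(
    'memmapped.dat',
    dtype=np.float32,
    mode='w+',
    shape=shape
)

# Wrap the array and save its chunks under the target URL
array = CloudArray(
    chunk_shape=chunk_shape,
    array=f,
    url="s3://example_bucket/dataset0"
)
array.save()

# Slice the stored array
print(array[:100, :100, :100])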
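Slicing a chunked array only needs the chunks that overlap the requested region. The sketch below shows one way to work out which chunks those are for the shapes used above. It is an assumption about how such a lookup could work, not a description of cloud-array's internals; `chunks_for_slice` is a hypothetical name, and only step-1 slices are handled.

```python
import itertools


def chunks_for_slice(key, shape, chunk_shape):
    """Return grid indices of all chunks overlapping a tuple of slices.

    Hypothetical helper for illustration; not part of cloud-array.
    """
    ranges = []
    for sl, size, chunk in zip(key, shape, chunk_shape):
        # Normalise None and negative bounds against the axis length
        start, stop, step = sl.indices(size)
        if step != 1:
            raise ValueError("this sketch only handles step 1")
        if stop <= start:
            return []  # empty selection touches no chunks
        # First and last chunk index touched along this axis
        ranges.append(range(start // chunk, (stop - 1) // chunk + 1))
    return list(itertools.product(*ranges))


shape = (10000, 100, 100)
chunk_shape = (10, 10, 10)
key = (slice(None, 100), slice(None, 100), slice(None, 100))  # [:100, :100, :100]
needed = chunks_for_slice(key, shape, chunk_shape)
print(len(needed))  # 1000
```

Under these assumptions, the `[:100, :100, :100]` read touches 1,000 of the 100,000 chunks in the grid, so only a small part of the dataset would need to be fetched.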