
The array_split python package is an enhancement to existing numpy.ndarray functions (such as numpy.array_split) which sub-divide a multi-dimensional array into a number of multi-dimensional sub-arrays (slices)


MIT License. DOI: 10.5281/zenodo.889078. JOSS paper: 10.21105/joss.00373.

The array_split python package is an enhancement to existing numpy.ndarray functions, such as numpy.array_split, skimage.util.view_as_blocks and skimage.util.view_as_windows, which sub-divide a multi-dimensional array into a number of multi-dimensional sub-arrays (slices). Example application areas include:

Parallel Processing

A large (dense) array is partitioned into smaller sub-arrays which can be processed concurrently by multiple processes (multiprocessing or mpi4py) or by other memory-limited hardware (e.g. GPGPU using pyopencl, pycuda, etc). For GPGPU, it is necessary for a sub-array not to exceed the GPU memory and desirable for the sub-array shape to be a multiple of the work-group (OpenCL) or thread-block (CUDA) size.
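
The partition-then-process pattern described above can be sketched with numpy.array_split and the standard library alone. This is a minimal illustration, not part of the array_split package: the partial_sum helper is hypothetical, and a thread pool stands in for the multiprocessing or mpi4py workers that a real application would likely use.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def partial_sum(sub_ary):
    """Hypothetical per-sub-array work: reduce one sub-array to a scalar."""
    return float(sub_ary.sum())


ary = np.arange(36, dtype="float64")

# Partition the array into 4 sub-arrays (views, no data is copied).
sub_arys = np.array_split(ary, 4)

# Hand each sub-array to a worker, then combine the partial results.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(partial_sum, sub_arys))

total = sum(partial_sums)
```

Each worker only ever touches its own sub-array, which is what makes the same pattern applicable to memory-limited devices.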

File I/O

A large (dense) array is partitioned into smaller sub-arrays which can be written to individual files (as, for example, an HDF5 Virtual Dataset). It is often desirable for the individual files not to exceed a specified number of (giga)bytes and, for HDF5, it is desirable for the sub-array shape in each file to be a multiple of the chunk shape. Similarly, out-of-core algorithms for large dense arrays often involve processing the entire data-set as a series of in-core sub-arrays. Again, it is desirable for the individual sub-array shape to be a multiple of the chunk shape.
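
As a rough sketch of the out-of-core idea, the helper below (chunk_aligned_row_blocks, a hypothetical name that is not part of the array_split package) cuts an array along axis 0 into blocks whose row count is a multiple of an assumed chunk height and whose size stays under a byte limit. It only illustrates the arithmetic; the package itself handles the general multi-axis case.

```python
import math

import numpy as np


def chunk_aligned_row_blocks(shape, itemsize, chunk_rows, max_bytes):
    """Return slice tuples that partition axis 0 into chunk-aligned blocks.

    Every block except possibly the last has a row count that is a multiple
    of chunk_rows. If max_bytes is smaller than one chunk of rows, a single
    chunk of rows is used anyway, so the limit can be exceeded in that case.
    """
    row_bytes = itemsize * math.prod(shape[1:])
    max_rows = max_bytes // row_bytes
    block_rows = max(chunk_rows, (max_rows // chunk_rows) * chunk_rows)
    trailing = tuple(slice(0, n) for n in shape[1:])
    return [
        (slice(start, min(start + block_rows, shape[0])),) + trailing
        for start in range(0, shape[0], block_rows)
    ]


ary = np.arange(10 * 4, dtype=np.int64).reshape(10, 4)

# 32 bytes per row, at most 160 bytes per block, chunks of 2 rows:
# the blocks are 4, 4 and 2 rows high.
blocks = chunk_aligned_row_blocks(ary.shape, ary.itemsize, chunk_rows=2, max_bytes=160)

# Process the array one in-core block at a time.
running_total = sum(int(ary[blk].sum()) for blk in blocks)
```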

The array_split package provides the means to partition an array (or array shape) using any of the following criteria:

  • Per-axis indices indicating the cut positions.

  • Per-axis number of sub-arrays.

  • Total number of sub-arrays (with optional per-axis number of sections constraints).

  • Specific sub-array shape.

  • Specification of halo (ghost) elements for sub-arrays.

  • Arbitrary start index for the shape to be partitioned.

  • Maximum number of bytes for a sub-array with constraints:

    • sub-arrays are an integer multiple of a specified sub-tile shape

    • upper limit on the per-axis sub-array shape
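
To make the halo and start-index criteria concrete, here is a small pure-Python sketch for the 1D case. The function split_1d_with_halo is hypothetical and is not the package's implementation; it assumes halos are clipped to the bounds of the array, and it distributes any remainder over the leading sections in the same way as numpy.array_split.

```python
def split_1d_with_halo(length, sections, halo=0, start=0):
    """Split the index range [start, start + length) into `sections` slices.

    Each slice is widened by `halo` elements on both sides, then clipped so
    that it never reaches outside the range being partitioned.
    """
    base, extra = divmod(length, sections)
    slices = []
    begin = start
    for i in range(sections):
        size = base + (1 if i < extra else 0)  # leading sections get the remainder
        end = begin + size
        lo = max(start, begin - halo)
        hi = min(start + length, end + halo)
        slices.append(slice(lo, hi))
        begin = end
    return slices


no_halo = split_1d_with_halo(10, 3)
with_halo = split_1d_with_halo(10, 3, halo=1)
offset = split_1d_with_halo(10, 3, start=100)
```

With a halo of 1, neighbouring slices overlap by two elements, which is the overlap a stencil-style computation would need at the internal cut positions.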

Quick Start Example

>>> from array_split import array_split, shape_split
>>> import numpy as np
>>>
>>> ary = np.arange(0, 4*9)
>>>
>>> array_split(ary, 4) # 1D split into 4 sections (like numpy.array_split)
[array([0, 1, 2, 3, 4, 5, 6, 7, 8]),
 array([ 9, 10, 11, 12, 13, 14, 15, 16, 17]),
 array([18, 19, 20, 21, 22, 23, 24, 25, 26]),
 array([27, 28, 29, 30, 31, 32, 33, 34, 35])]
>>>
>>> shape_split(ary.shape, 4) # 1D split into 4 parts, returns slice objects
array([(slice(0, 9, None),), (slice(9, 18, None),), (slice(18, 27, None),), (slice(27, 36, None),)],
      dtype=[('0', 'O')])
>>>
>>> ary = ary.reshape(4, 9) # Make ary 2D
>>> split = shape_split(ary.shape, axis=(2, 3)) # 2D split into 2*3=6 sections
>>> split.shape
(2, 3)
>>> split
array([[(slice(0, 2, None), slice(0, 3, None)),
        (slice(0, 2, None), slice(3, 6, None)),
        (slice(0, 2, None), slice(6, 9, None))],
       [(slice(2, 4, None), slice(0, 3, None)),
        (slice(2, 4, None), slice(3, 6, None)),
        (slice(2, 4, None), slice(6, 9, None))]],
      dtype=[('0', 'O'), ('1', 'O')])
>>> sub_arys = [ary[tup] for tup in split.flatten()] # Create sub-array views from slice tuples.
>>> sub_arys
[array([[ 0,  1,  2], [ 9, 10, 11]]),
 array([[ 3,  4,  5], [12, 13, 14]]),
 array([[ 6,  7,  8], [15, 16, 17]]),
 array([[18, 19, 20], [27, 28, 29]]),
 array([[21, 22, 23], [30, 31, 32]]),
 array([[24, 25, 26], [33, 34, 35]])]
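
For readers who want to see what the 2D split above amounts to, the following sketch rebuilds the same 2*3 grid of slice tuples with numpy and itertools only. The axis_slices helper is hypothetical and only covers an even per-axis number of sections; it is an illustration of the idea, not the package's code.

```python
from itertools import product

import numpy as np


def axis_slices(length, sections):
    """Cut range(length) into `sections` contiguous slices (1D case)."""
    base, extra = divmod(length, sections)
    bounds = [0]
    for i in range(sections):
        bounds.append(bounds[-1] + base + (1 if i < extra else 0))
    return [slice(b, e) for b, e in zip(bounds[:-1], bounds[1:])]


ary = np.arange(36).reshape(4, 9)

# A per-axis split is the Cartesian product of the 1D splits of each axis.
grid = list(product(axis_slices(4, 2), axis_slices(9, 3)))

# Indexing with a tuple of slices yields views that share memory with ary.
sub_arys = [ary[tup] for tup in grid]
```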

The latest Sphinx documentation (including more examples) is at http://array-split.readthedocs.io/en/latest/.

Installation

Using pip (root access required):

pip install array_split

or local user install (no root access required):

pip install --user array_split

or local user install from latest github source:

pip install --user git+https://github.com/array-split/array_split.git#egg=array_split

Requirements

Requires numpy version >= 1.6, python-2 version >= 2.6 or python-3 version >= 3.2.
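
A quick way to check an environment against these requirements is sketched below. It is only a convenience snippet and assumes that numpy.__version__ begins with numeric major and minor components, as released versions do.

```python
import sys

import numpy as np

# python-2 >= 2.6 or python-3 >= 3.2
meets_python = sys.version_info >= (3, 2) or (2, 6) <= sys.version_info < (3, 0)

# numpy >= 1.6, comparing only the (major, minor) components
numpy_version = tuple(int(part) for part in np.__version__.split(".")[:2])
meets_numpy = numpy_version >= (1, 6)

print("python ok:", meets_python, "numpy ok:", meets_numpy)
```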

Testing

Run tests (unit-tests and doctest module docstring tests) using:

python -m array_split.tests

or, from the source tree, run:

python setup.py test

Travis CI at:

https://travis-ci.org/array-split/array_split/

and AppVeyor at:

https://ci.appveyor.com/project/array-split/array-split

Documentation

Latest sphinx generated documentation is at:

http://array-split.readthedocs.io/en/latest

and at github gh-pages:

https://array-split.github.io/array_split/

Sphinx documentation can be built from the source:

python setup.py build_sphinx

with the HTML generated in docs/_build/html.

Latest source code

Source at github:

https://github.com/array-split/array_split

Bug Reports

To search for bugs or report them, please use the bug tracker at:

https://github.com/array-split/array_split/issues

Contributing

Check out the CONTRIBUTING doc.

License information

See the file LICENSE.txt for the terms and conditions of usage and a DISCLAIMER OF ALL WARRANTIES.
