Help the Python Software Foundation raise \$60,000 USD by December 31st!

## Project description

A minimal task scheduling abstraction and parallel arrays.

• dask.array is a drop-in NumPy replacement (for a subset of NumPy) that encodes blocked algorithms in dask dependency graphs.
• dask.async is a shared-memory asynchronous scheduler that efficiently executes dask dependency graphs on multiple cores.

## Install

```python setup.py install
```

Consider the following simple program:

```def inc(i):
return i + 1

return a + b

x = 1
y = inc(x)
```

We encode this as a dictionary in the following way:

```d = {'x': 1,
'y': (inc, 'x'),
```

While less aesthetically pleasing this dictionary may now be analyzed, optimized, and computed on by other Python code, not just the Python interpreter.

The dask.array module creates these graphs from NumPy-like operations

```>>> import dask.array as da
>>> x = da.random.random((4, 4), blockshape=(2, 2))
{('x', 0, 0): (np.random.random, (2, 2)),
('x', 0, 1): (np.random.random, (2, 2)),
('x', 1, 0): (np.random.random, (2, 2)),
('x', 1, 1): (np.random.random, (2, 2)),
('y', 0, 0): (np.transpose, ('x', 0, 0)),
('y', 0, 1): (np.transpose, ('x', 1, 0)),
('y', 1, 0): (np.transpose, ('x', 0, 1)),
('y', 1, 1): (np.transpose, ('x', 1, 1)),
('z',): (getitem, ('y', 0, 1), (0, 1))}
```

Finally, a scheduler executes these graphs to achieve the intended result. The dask.async module contains a shared memory scheduler that efficiently leverages multiple cores.

## Dependencies

dask.core supports Python 2.6+ and Python 3.3+ with a common codebase. It is pure Python and requires no dependencies beyond the standard library. It is a light weight dependency.

dask.bag depends on toolz and dill.