Manipulate JSON-like data with NumPy-like idioms.

## Project description

Awkward Array is a library for nested, variable-sized data, including arbitrary-length lists, records, mixed types, and missing data, using NumPy-like idioms.

Arrays are dynamically typed, but operations on them are compiled and fast. Their behavior coincides with NumPy when array dimensions are regular and generalizes when they're not.

# Motivating example

Given an array of lists of objects with x, y fields (with nested lists in the y field),

import awkward as ak
import numpy as np

array = ak.Array([
    [{"x": 1.1, "y": [1]}, {"x": 2.2, "y": [1, 2]}, {"x": 3.3, "y": [1, 2, 3]}],
    [],
    [{"x": 4.4, "y": [1, 2, 3, 4]}, {"x": 5.5, "y": [1, 2, 3, 4, 5]}]
])


the following slices out the y values, drops the first element from each inner list, and runs NumPy's np.square function on everything that is left:

output = np.square(array["y", ..., 1:])


The result is

[
    [[], [4], [4, 9]],
    [],
    [[4, 9, 16], [4, 9, 16, 25]]
]


The equivalent using only Python is

output = []
for sublist in array:
    tmp1 = []
    for record in sublist:
        tmp2 = []
        for number in record["y"][1:]:
            tmp2.append(np.square(number))
        tmp1.append(tmp2)
    output.append(tmp1)


The expression using Awkward Arrays is more concise, using idioms familiar from NumPy, and it also has NumPy-like performance. For a similar problem 10 million times larger than the one above (single-threaded on a 2.2 GHz processor),

• the Awkward Array one-liner takes 1.5 seconds to run and uses 2.1 GB of memory,
• the equivalent using Python lists and dicts takes 140 seconds to run and uses 22 GB of memory.

Awkward Array is even faster when used in Numba's JIT-compiled functions.

# Installation

Awkward Array can be installed from PyPI using pip:

pip install awkward


The awkward package is pure Python, and it will download the awkward-cpp compiled components as a dependency. If there is no awkward-cpp binary package (wheel) for your platform and Python version, pip will attempt to compile it from source (which has additional dependencies, such as a C++ compiler).

Awkward Array is also available on conda-forge:

conda install -c conda-forge awkward

