Transform directed acyclic graphs using map-reduce and groupby operations

License: BSD 3-Clause

Cyclebane

About

This library is an attempt to merge the concepts of directed acyclic graphs (DAGs) with array-like objects such as NumPy arrays, Pandas DataFrames, or Xarray/Scipp DataArrays. This could be useful for describing task graphs, e.g., when a series of tasks is applied to chunks of an array. These tasks also have an array structure. After a reduction operation over the chunks, the graph loses this structure, i.e., only a subset of the graph's nodes has array structure. What if we could work with this structure, even though only parts of the graph follow it? And what if we could use the power of array slicing with named dimensions, or select by label? This is what Cyclebane tries to do.

Our initial goal is to support:

  • map operations of a DAG's source nodes over an array-like. Cyclebane will effectively copy all descendants of those nodes, once for each array element. Cyclebane will support joint mappings of multiple source nodes by mapping over, e.g., a DataFrame with multiple columns, as well as chaining independent map operations at different source nodes. In the latter case this will effectively broadcast at descendant nodes that depend on multiple such source nodes.
  • reduce operations at descendants of mapped nodes. This will add a new node with edges to all copies of the mapped node being reduced. Cyclebane will support reducing only individual axes or all axes, similar to NumPy.
  • groupby operations similar to Pandas and Xarray (albeit more limited).
  • Positional and label-based indexing. Cyclebane will support selecting branches that were created during map (or groupby) operations based on their indices. The graph structure will be left untouched, i.e., nodes after a reduce operation will be preserved, but fewer edges will lead to the reduce node.
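
The map, reduce, and indexing items above can be illustrated with a small, library-free sketch. This is not Cyclebane's API: the dict-of-parents graph representation and the helper names `descendants`, `map_source`, `reduce_node`, and `select` are hypothetical and exist only to show what each operation does to the graph structure.

```python
from collections import deque

# A DAG stored as {node: [parents]}; "a" and "b" are source nodes.
# (Assumed representation for illustration, not Cyclebane's own.)
graph = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"]}


def descendants(graph, source):
    """Return `source` together with every node that depends on it."""
    found = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for node, parents in graph.items():
            if current in parents and node not in found:
                found.add(node)
                queue.append(node)
    return found


def map_source(graph, source, labels):
    """Copy `source` and all its descendants once per label."""
    mapped = descendants(graph, source)
    out = {}
    for node, parents in graph.items():
        if node not in mapped:
            out[node] = list(parents)  # unaffected nodes are kept once
            continue
        for label in labels:
            # Copies point at the matching copy of each mapped parent.
            out[(node, label)] = [
                (p, label) if p in mapped else p for p in parents
            ]
    return out


def reduce_node(graph, node, name):
    """Add a new node `name` with an edge from every copy of `node`."""
    copies = [n for n in graph if isinstance(n, tuple) and n[0] == node]
    return {**graph, name: copies}


def select(graph, labels):
    """Keep only the mapped branches whose label is in `labels`."""

    def keep(n):
        return not isinstance(n, tuple) or n[1] in labels

    return {n: [p for p in ps if keep(p)] for n, ps in graph.items() if keep(n)}


mapped = map_source(graph, "a", labels=["x", "y", "z"])
reduced = reduce_node(mapped, "d", name="total")
subset = select(reduced, labels={"x", "z"})
```

In this sketch, mapping over "a" triplicates "a", "c", and "d" but leaves "b" as a single shared node. The reduce step adds "total" with three incoming edges, and selecting the labels "x" and "z" keeps "total" in place with only two incoming edges, which matches the behaviour described in the indexing item.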

See also Dask's High Level Graphs (https://docs.dask.org/en/latest/high-level-graphs.html) for a related concept (without direct support for any such operations).

Installation

python -m pip install cyclebane
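
To confirm that the installation succeeded, one option is to query the installed version through the standard library. This is a minimal sketch: the helper name `installed_version` is hypothetical, and it only assumes that the distribution is registered under the name `cyclebane`, as in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if cyclebane is installed, otherwise None.
print(installed_version("cyclebane"))
```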
