Analytics library

Python Data Streams

Tributary is a library for constructing dataflow graphs in Python. Unlike many other DAG libraries in Python (airflow, luigi, prefect, dagster, dask, kedro, etc.), tributary is not designed with data/ETL pipelines or scheduling in mind. Instead, tributary is more similar to libraries like mdf, pyungo, streamz, or pyfunctional, in that it is designed to be used as the implementation for a data model. One such example is the greeks library, which leverages tributary to build data models for options pricing.

Installation

Install from pip:

pip install tributary

or from source:

python setup.py install

Stream Types

Tributary offers several kinds of streams:

Streaming

These are synchronous, reactive data streams, built using asynchronous python generators. They are designed to mimic complex event processors in terms of event ordering.
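
As a rough illustration of the idea (not tributary's own API), the sketch below chains plain asynchronous generators into a small source, transform, and sink pipeline. The names `source`, `apply`, `collect`, and `run` are hypothetical helpers invented for this example; the point is only that each event flows through the graph in the order it was produced.

```python
import asyncio


async def source(values):
    # Emit each value in order, yielding control to the event loop between items.
    for v in values:
        await asyncio.sleep(0)
        yield v


async def apply(fn, upstream):
    # Transform each upstream event as it arrives, preserving event order.
    async for v in upstream:
        yield fn(v)


async def collect(upstream):
    # Sink: drain the stream into a list.
    return [v async for v in upstream]


def run(stream):
    # Drive the whole pipeline to completion on a fresh event loop.
    return asyncio.run(collect(stream))


result = run(apply(lambda x: x * 2, source([1, 2, 3])))
print(result)  # [2, 4, 6]
```

Although the generators are asynchronous, the caller of `run` sees a synchronous result, which is one way to read "synchronous streams built on asynchronous generators".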

Functional

These are functional streams, built by currying python functions (callbacks).
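
A minimal sketch of the callback style, assuming a stream is simply a function that accepts a callback and pushes values into it. The helpers `source` and `pipe` are hypothetical and are not taken from tributary; they only show how wrapping one callback in another composes a pipeline.

```python
def source(values):
    # A stream here is a function that pushes each value into a callback.
    def stream(callback):
        for v in values:
            callback(v)
    return stream


def pipe(fn, upstream):
    # Wrap `upstream` so that `fn` is applied before the downstream callback runs.
    # The result is again a stream, so calls to `pipe` can be nested.
    def stream(callback):
        upstream(lambda v: callback(fn(v)))
    return stream


received = []
doubled_plus_one = pipe(lambda x: x + 1, pipe(lambda x: x * 2, source([1, 2, 3])))
doubled_plus_one(received.append)
print(received)  # [3, 5, 7]
```

Nothing runs until the outermost stream is handed a final callback, at which point every value is pushed through the whole chain.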

Lazy

These are lazily-evaluated Python streams, where outputs are propagated only as inputs change. They are implemented as directed acyclic graphs.
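
The sketch below shows one way such lazy evaluation can work, using a dirty flag per node. `LazyNode` and its methods are hypothetical names for this example and do not reflect tributary's actual classes; for simplicity the sketch marks dependents dirty eagerly when an input changes, which may differ from how tributary tracks state.

```python
class LazyNode:
    def __init__(self, value=None, fn=None, inputs=()):
        self._value = value
        self._fn = fn
        self._inputs = list(inputs)
        self._dependents = []
        self._dirty = fn is not None  # computed nodes start out stale
        self.evaluations = 0          # counts how often fn actually ran
        for node in self._inputs:
            node._dependents.append(self)

    def set(self, value):
        # Changing an input only flags downstream nodes; nothing is recomputed yet.
        if value != self._value:
            self._value = value
            for dep in self._dependents:
                dep._mark_dirty()

    def _mark_dirty(self):
        if not self._dirty:
            self._dirty = True
            for dep in self._dependents:
                dep._mark_dirty()

    def __call__(self):
        # Recompute only when stale; otherwise return the cached value.
        if self._dirty:
            self._value = self._fn(*(n() for n in self._inputs))
            self._dirty = False
            self.evaluations += 1
        return self._value


a = LazyNode(value=1)
b = LazyNode(value=2)
total = LazyNode(fn=lambda x, y: x + y, inputs=[a, b])

first = total()    # computed
second = total()   # served from cache
a.set(10)
third = total()    # recomputed because an input changed
print(first, second, third, total.evaluations)  # 3 3 12 2
```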

Examples

  • Streaming: In this example, we construct a variety of forward-propagating reactive graphs.
  • Lazy: In this example, we construct a variety of lazily-evaluated directed acyclic computation graphs.
  • Automatic Differentiation: In this example, we use tributary to perform automatic differentiation on both lazy and streaming graphs.

Graph Visualization

You can visualize the graph with Graphviz. All streaming and lazy nodes support a graphviz method.

Streaming and lazy nodes also support ipydagred3 for live update monitoring.

Streaming

Here green indicates executing, yellow indicates stalled for backpressure, and red indicates that StreamEnd has been propagated (i.e. the stream has ended).

Lazy

Here green indicates executing, and red indicates that the node is dirty. Note that the determination of whether a node is dirty is also done lazily (we can check with isDirty, which will update the node's graph state).

Sources and Sinks

Sources

  • Python Function/Generator/Async Function/Async Generator
  • Random - generates a random dictionary of values
  • File - streams data from a file, optionally loading each line as a json
  • Kafka - streams data from kafka
  • Websocket - streams data from a websocket
  • Http - polls a url with GET requests, streams data out
  • SocketIO - streams data from a socketIO connection

Sinks

  • File - streams data to a file
  • Kafka - streams data to kafka
  • Http - POSTs data to a url
  • Websocket - streams data to a websocket
  • SocketIO - streams data to a socketIO connection

Transforms

Modulate

  • Delay - Streaming wrapper to delay a stream
  • Apply - Streaming wrapper to apply a function to an input stream
  • Window - Streaming wrapper to collect a window of values
  • Unroll - Streaming wrapper to unroll an iterable stream
  • UnrollDataFrame - Streaming wrapper to unroll a dataframe into a stream
  • Merge - Streaming wrapper to merge 2 inputs into a single output
  • ListMerge - Streaming wrapper to merge 2 input lists into a single output list
  • DictMerge - Streaming wrapper to merge 2 input dicts into a single output dict. Preference is given to the second input (e.g. if keys overlap)
  • Reduce - Streaming wrapper to merge any number of inputs
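
To make a few of these wrappers concrete, here is a sketch of the behaviour described for Window, Unroll, and DictMerge, written with ordinary Python generators. The function names and signatures are assumptions made for this example, not tributary's API, and details such as whether a window is emitted before it is full may differ in the library.

```python
from collections import deque


def window(stream, size):
    # Emit the most recent `size` values (fewer while warming up) on every input.
    buf = deque(maxlen=size)
    for v in stream:
        buf.append(v)
        yield list(buf)


def unroll(stream):
    # Flatten a stream of iterables into a stream of their items.
    for iterable in stream:
        yield from iterable


def dict_merge(left, right):
    # Pair up two streams of dicts; on overlapping keys the second input wins.
    for a, b in zip(left, right):
        yield {**a, **b}


windows = list(window([1, 2, 3, 4], size=2))
flat = list(unroll([[1, 2], [3]]))
merged = list(dict_merge([{"a": 1, "b": 2}], [{"b": 20, "c": 3}]))

print(windows)  # [[1], [1, 2], [2, 3], [3, 4]]
print(flat)     # [1, 2, 3]
print(merged)   # [{'a': 1, 'b': 20, 'c': 3}]
```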

Calculations

Arithmetic Operators

  • Noop (unary) - Pass input to output
  • Negate (unary) - -1 * input
  • Invert (unary) - 1/input
  • Add (binary) - add 2 inputs
  • Sub (binary) - subtract second input from first
  • Mult (binary) - multiply 2 inputs
  • Div (binary) - divide first input by second
  • RDiv (binary) - divide second input by first
  • Mod (binary) - first input % second input
  • Pow (binary) - first input^second input
  • Sum (n-ary) - sum all inputs
  • Average (n-ary) - average of all inputs

Boolean Operators

  • Not (unary) - Not input
  • And (binary) - And inputs
  • Or (binary) - Or inputs

Comparators

  • Equal (binary) - inputs are equal
  • NotEqual (binary) - inputs are not equal
  • Less (binary) - first input is less than second input
  • LessOrEqual (binary) - first input is less than or equal to second input
  • Greater (binary) - first input is greater than second input
  • GreaterOrEqual (binary) - first input is greater than or equal to second input

Math

  • Log (unary)
  • Sin (unary)
  • Cos (unary)
  • Tan (unary)
  • Arcsin (unary)
  • Arccos (unary)
  • Arctan (unary)
  • Sqrt (unary)
  • Abs (unary)
  • Exp (unary)
  • Erf (unary)

Converters

  • Int (unary)
  • Float (unary)
  • Bool (unary)
  • Str (unary)

Python Builtins

  • Len (unary)

Rolling

  • RollingCount - Node to count inputs
  • RollingMin - Node to take rolling min of inputs
  • RollingMax - Node to take rolling max of inputs
  • RollingSum - Node to take rolling sum of inputs
  • RollingAverage - Node to take the running average
  • SMA - Node to take the simple moving average over a window
  • EMA - Node to take an exponential moving average over a window
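
The difference between the running average, the simple moving average, and the exponential moving average can be sketched with plain generators. The function names below are hypothetical, and the EMA smoothing factor `alpha = 2 / (window + 1)` is a common convention assumed here; tributary's own parameters and defaults may differ.

```python
from collections import deque


def rolling_average(stream):
    # Running mean over every value seen so far.
    total, count = 0.0, 0
    for v in stream:
        total += v
        count += 1
        yield total / count


def sma(stream, window):
    # Mean of the last `window` values (fewer while warming up).
    buf = deque(maxlen=window)
    for v in stream:
        buf.append(v)
        yield sum(buf) / len(buf)


def ema(stream, window):
    # Exponentially weighted mean; alpha is an assumed, conventional choice.
    alpha = 2 / (window + 1)
    avg = None
    for v in stream:
        avg = v if avg is None else alpha * v + (1 - alpha) * avg
        yield avg


running = list(rolling_average([2, 4, 6]))
simple = list(sma([2, 4, 6, 8], window=2))
exponential = list(ema([1, 2, 3], window=3))

print(running)      # [2.0, 3.0, 4.0]
print(simple)       # [2.0, 3.0, 5.0, 7.0]
print(exponential)  # [1, 1.5, 2.25]
```

The running average weights all history equally, the SMA forgets values older than the window, and the EMA discounts older values geometrically.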

Node Type Converters

  • Lazy->Streaming
