Mrocklin streamz library. It's the latest version

Streamz
=======

Streamz helps you build pipelines to manage continuous streams of data. It is
simple to use in simple cases, but also supports complex pipelines that involve
branching, joining, flow control, feedback, back pressure, and so on.

Optionally, Streamz can also work with Pandas dataframes to provide sensible
streaming operations on continuous tabular data.

To learn more about how to use streams, visit :doc:`Core documentation <core>`.


Motivation
----------

Continuous data streams arise in many applications like the following:

1. Log processing from web servers
2. Scientific instrument data like telemetry or image processing pipelines
3. Financial time series
4. Machine learning pipelines for real-time and on-line learning
5. ...

Sometimes these pipelines are very simple, with a linear sequence of processing
steps:

.. image:: docs/source/images/simple.svg
   :alt: a simple streamz pipeline

And sometimes these pipelines are more complex, involving branching, look-back
periods, feedback into earlier stages, and more.

.. image:: docs/source/images/complex.svg
   :alt: a more complex streamz pipeline

Streamz endeavors to be simple in simple cases, while also being powerful
enough to let you define custom pipelines for your application.

Why not Python generator expressions?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Python users often manage continuous sequences of data with iterators or
generator expressions.

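.. code-block:: python

   def fib():
       """Yield the Fibonacci numbers indefinitely."""
       a, b = 0, 1
       while True:
           yield a
           a, b = b, a + b

   # f stands for any function applied to each element
   sequence = (f(n) for n in fib())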

However, iterators become challenging when you want to fork them or control the
flow of data. Typically people rely on tools like ``itertools.tee`` and
``zip``.

.. code-block:: python

   import itertools

   # x is any iterable; f and g are functions applied to each branch
   x1, x2 = itertools.tee(x, 2)
   y1 = map(f, x1)
   y2 = map(g, x2)

However, this quickly becomes cumbersome, especially when building complex
pipelines.
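
As a rough illustration of the iterator approach, the following self-contained
sketch forks a bounded Fibonacci sequence with ``itertools.tee``, maps a
different function over each branch, and joins the results with ``zip``. The
functions ``double`` and ``square`` are hypothetical stand-ins for ``f`` and
``g``, and the ``islice`` bound is an assumption added only so that the example
terminates.

```python
import itertools


def fib():
    """Yield the Fibonacci numbers indefinitely."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def double(n):
    # Hypothetical stand-in for f
    return 2 * n


def square(n):
    # Hypothetical stand-in for g
    return n * n


# Bound the infinite generator so the example terminates.
x = itertools.islice(fib(), 6)

# Fork the iterator into two independent branches.
x1, x2 = itertools.tee(x, 2)
y1 = map(double, x1)
y2 = map(square, x2)

# Join the branches again; zip pulls one item from each branch in turn.
joined = list(zip(y1, y2))
print(joined)
# [(0, 0), (2, 1), (2, 1), (4, 4), (6, 9), (10, 25)]
```

Even in this small case the fork and the join must be wired by hand, and
``itertools.tee`` has to buffer items whenever one branch is consumed ahead of
the other.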


Related Work
------------

Streamz is similar to reactive programming systems like
`RxPY <https://github.com/ReactiveX/RxPY>`_ or big data streaming systems like
`Apache Flink <https://flink.apache.org/>`_,
`Apache Beam <https://beam.apache.org/get-started/quickstart-py/>`_ or
`Apache Spark Streaming <https://spark.apache.org/streaming/>`_.
