Mrocklin streamz library. It's the latest version
Streamz
=======
Streamz helps you build pipelines to manage continuous streams of data. It is
simple to use in simple cases, but also supports complex pipelines that involve
branching, joining, flow control, feedback, back pressure, and so on.
Optionally, Streamz can also work with Pandas dataframes to provide sensible
streaming operations on continuous tabular data.
To learn more about how to use streams, visit :doc:`Core documentation <core>`.
Motivation
----------
Continuous data streams arise in many applications like the following:
1. Log processing from web servers
2. Scientific instrument data like telemetry or image processing pipelines
3. Financial time series
4. Machine learning pipelines for real-time and on-line learning
5. ...
Sometimes these pipelines are very simple, with a linear sequence of processing
steps:
.. image:: docs/source/images/simple.svg
   :alt: a simple streamz pipeline
And sometimes these pipelines are more complex, involving branching, look-back
periods, feedback into earlier stages, and more.
.. image:: docs/source/images/complex.svg
   :alt: a more complex streamz pipeline
Streamz endeavors to be simple in simple cases, while remaining powerful
enough to let you define custom pipelines for your application.
Why not Python generator expressions?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Python users often manage continuous sequences of data with iterators or
generator expressions.
.. code-block:: python

   def fib():
       # Infinite generator of Fibonacci numbers
       a, b = 0, 1
       while True:
           yield a
           a, b = b, a + b

   # ``f`` stands for any function applied to each element
   sequence = (f(n) for n in fib())
However, iterators become challenging when you want to fork them or control the
flow of data. Typically people rely on tools like ``itertools.tee`` and
``zip``.
.. code-block:: python

   import itertools

   # ``x`` is any iterator; ``f`` and ``g`` are arbitrary functions
   x1, x2 = itertools.tee(x, 2)
   y1 = map(f, x1)
   y2 = map(g, x2)
However, this quickly becomes cumbersome, especially when building complex
pipelines.
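To make the fork-and-rejoin pattern above concrete, here is a minimal, self-contained sketch that uses only the standard library. The functions ``double`` and ``increment`` are hypothetical stand-ins for ``f`` and ``g``; they are not part of Streamz.

```python
import itertools


def fib():
    # Infinite generator of Fibonacci numbers
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def double(n):
    # Hypothetical stand-in for ``f``
    return n * 2


def increment(n):
    # Hypothetical stand-in for ``g``
    return n + 1


# Fork the single iterator into two independent branches
x1, x2 = itertools.tee(fib(), 2)
y1 = map(double, x1)
y2 = map(increment, x2)

# Rejoin the branches pairwise and take the first five results
pairs = list(itertools.islice(zip(y1, y2), 5))
```

Even in this small case the fork, the two branches, and the rejoin each need their own bookkeeping. ``itertools.tee`` also has to buffer items whenever one branch is consumed ahead of the other, which is part of why the approach scales poorly to larger pipelines.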
Related Work
------------
Streamz is similar to reactive
programming systems like `RxPY <https://github.com/ReactiveX/RxPY>`_ or big
data streaming systems like `Apache Flink <https://flink.apache.org/>`_,
`Apache Beam <https://beam.apache.org/get-started/quickstart-py/>`_ or
`Apache Spark Streaming <https://spark.apache.org/streaming/>`_.