Spout
=====

Spout is a small and simple framework that makes it easy to work with data
streams in Python. In particular, Spout was designed with the processing and
consumption of live data sources in mind.


How it works
------------

At the heart of Spout is the concept of a Stream (which is defined in an
abstract `Stream` class). This defines the basic operations that can be
performed upon a data stream:

Mapping
The items in one stream can be "mapped" to another stream. This is done by
applying a supplied `Function` to each item in the input stream to produce
a new output stream.

stream.map(Function)

Filtering
The items in a stream can be "filtered" so that the resultant stream only
contains items that match a given criterion. This is done by using a supplied
`Predicate` to test each item in the input stream; an item is copied to the
output stream only if it passes the test.

stream.filter(Predicate)

Processing (Consuming)
The items in a stream are consumed by some calculation or functionality that
produces no further output stream. This is done by applying the supplied
`Operation` to each item in the stream.

stream.for_each(Operation)
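
The three operations above can be illustrated with a toy stream built on plain Python iterators. This is only a sketch of the idea, not Spout's actual `Stream` class: the name `ToyStream` is hypothetical, and plain callables stand in for Spout's `Function`, `Predicate` and `Operation` objects.

```python
class ToyStream:
    """A toy stream over any iterable (hypothetical, not part of Spout)."""

    def __init__(self, items):
        self._items = iter(items)

    def map(self, function):
        # Apply `function` to every item, producing a new stream.
        return ToyStream(function(item) for item in self._items)

    def filter(self, predicate):
        # Keep only the items for which `predicate` returns True.
        return ToyStream(item for item in self._items if predicate(item))

    def for_each(self, operation):
        # Consume the stream; nothing is passed further downstream.
        for item in self._items:
            operation(item)


collected = []
ToyStream([1, 2, 3, 4, 5, 6]) \
    .filter(lambda n: n % 2 == 0) \
    .map(lambda n: n * 10) \
    .for_each(collected.append)
print(collected)  # [20, 40, 60]
```

Because `map` and `filter` each return a new stream, calls can be chained, and no work happens until `for_each` consumes the items.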


Usage
-----

To use Spout, you first need to create an input data stream. A data stream is simply an
instance of `Stream` or any of its subclasses (which can be found in the
streams.py file). The `Stream` class has been specifically designed so that it
is easy to extend and wrap around existing data sources that you might
have, such as files or databases.

Some existing examples of stream data sources can be found in sources.py.

For example, to create a Stream out of the lines in a plain text file:

from spout.sources import FileInputStream
s = FileInputStream("test.txt")
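
Conceptually, a file-backed source hands over one line at a time. The sketch below shows that idea with a plain generator. The helper `line_source` is hypothetical and not part of Spout, an in-memory `io.StringIO` stands in for a real file, and stripping the trailing newline is an assumption of this sketch (whether `FileInputStream` does the same is not stated here).

```python
import io


def line_source(handle):
    """Lazily yield each line of an open text handle (hypothetical helper)."""
    for line in handle:
        # Assumption: drop the trailing newline before passing the line on.
        yield line.rstrip("\n")


# An in-memory stand-in for a text file such as test.txt.
fake_file = io.StringIO("1 apple\nbanana\n2 cherry\n")
lines = list(line_source(fake_file))
print(lines)  # ['1 apple', 'banana', '2 cherry']
```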

Now that you have your data in a stream, you simply have to process it! This can
be done by creating and using your own Functions, Predicates or Operations
(see above).

For example, to print out all the lines in a text file that start with a digit,
but with the digit stripped, we can create our own `Predicate` and `Function`
and pass these to the `.filter()` and `.map()` methods:

from spout.sources import FileInputStream
from spout.structs import Function, Predicate
from spout.utils import PrintOperation


class StartsWithDigit(Predicate):
    def test(self, obj):
        # Slicing keeps an empty line from raising IndexError.
        return obj[:1].isdigit()


class StripFirstChar(Function):
    def apply(self, input):
        # Drop the leading digit.
        return input[1:]


s = FileInputStream("test.txt")
s \
    .filter(StartsWithDigit()) \
    .map(StripFirstChar()) \
    .for_each(PrintOperation())
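
To see what this pipeline is expected to produce without installing Spout, the same filter-then-map logic can be written with Python's built-in `filter` and `map`. This is a sketch only: the sample lines and the function names `starts_with_digit` and `strip_first_char` are made up for illustration.

```python
# Made-up sample input standing in for the lines of test.txt.
sample_lines = ["1apple", "banana", "2cherry", ""]


def starts_with_digit(line):
    # Slicing means an empty line simply fails the test.
    return line[:1].isdigit()


def strip_first_char(line):
    return line[1:]


# Filter first, then map, mirroring the order of the chained calls.
result = list(map(strip_first_char, filter(starts_with_digit, sample_lines)))
for line in result:
    print(line)
# apple
# cherry
```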


Installation
------------

Spout is available in the Python Package Index (PyPI), and so the easiest way to
install it is through `pip`:

$ pip install spout

However, it is also possible to install Spout from source, using the
`setup.py` script in the repository:

$ python setup.py install
