
A simple framework that makes it easy to work with data streams in Python.


Spout is a small and simple framework that makes it easy to work with data streams in Python. In particular, Spout was designed with the processing and consumption of live data sources in mind.

How it works

At the heart of Spout is the concept of a Stream (which is defined in an abstract Stream class). This defines the basic operations that can be performed upon a data stream:

Mapping

The items in one stream can be “mapped” to another stream. This is done by applying a supplied Function to each item in the input stream, to produce another output stream.

stream.map(Function)
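The mapping idea above can be sketched without Spout installed. This is a minimal illustration, not Spout's implementation: the `Function` base class here is a stand-in that only mirrors the `apply` method used in this description, and `Double` and `map_items` are hypothetical names introduced for the example.

```python
class Function:
    """Stand-in for a Spout-style Function; subclasses override apply()."""

    def apply(self, item):
        raise NotImplementedError


class Double(Function):
    """Hypothetical example function that doubles each item."""

    def apply(self, item):
        return item * 2


def map_items(items, function):
    """Lazily yield function.apply(item) for every input item, in order."""
    for item in items:
        yield function.apply(item)


doubled = list(map_items([1, 2, 3], Double()))
print(doubled)  # [2, 4, 6]
```

The output has exactly one item per input item; only the values change.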

Filtering

The items in a stream can be “filtered”, so that the resulting stream contains only the items that match a given criterion. This is done by testing each item in the input stream against a supplied Predicate and copying it to the output stream only if it passes the test.

stream.filter(Predicate)
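Filtering can be sketched in the same spirit. Again this is an assumption-laden illustration rather than Spout's own code: the `Predicate` base class is a stand-in that mirrors the `test` method used in this description, and `IsEven` and `filter_items` are hypothetical names.

```python
class Predicate:
    """Stand-in for a Spout-style Predicate; subclasses override test()."""

    def test(self, item):
        raise NotImplementedError


class IsEven(Predicate):
    """Hypothetical example predicate that accepts even integers."""

    def test(self, item):
        return item % 2 == 0


def filter_items(items, predicate):
    """Lazily yield only the items for which predicate.test(item) is true."""
    for item in items:
        if predicate.test(item):
            yield item


evens = list(filter_items(range(6), IsEven()))
print(evens)  # [0, 2, 4]
```

Unlike mapping, filtering never changes an item; it only decides whether the item is passed on.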

Processing (Consuming)

The items in a stream can be “processed” (consumed) by a calculation or other functionality that produces no further output stream. This is done by applying a supplied Operation to each item in the stream.

stream.for_each(Operation)
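Consuming a stream can be sketched as follows. The interface of Spout's Operation class is not shown in this description, so the method name `perform` is an assumption, as are the `Collect` class and the `for_each` helper; the point is only that the operation is run for its side effects and nothing is returned.

```python
class Operation:
    """Stand-in for a Spout-style Operation; the name perform() is assumed."""

    def perform(self, item):
        raise NotImplementedError


class Collect(Operation):
    """Hypothetical operation that records every item it is given."""

    def __init__(self):
        self.seen = []

    def perform(self, item):
        self.seen.append(item)


def for_each(items, operation):
    """Apply the operation to every item; there is no output stream."""
    for item in items:
        operation.perform(item)


collector = Collect()
result = for_each(["a", "b"], collector)
print(collector.seen)  # ['a', 'b']
print(result)  # None
```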

Usage

To use Spout, you first need to create an input data stream. A data stream is simply an instance of Stream or of one of its subclasses (which can be found in the streams.py file). The Stream class has been designed to be easy to extend, so that you can wrap it around data sources you already have, such as files or databases.

Some existing examples of stream data sources can be found in sources.py.

For example, to create a Stream out of the lines in a plain text file:

from spout.sources import FileInputStream

s = FileInputStream("test.txt")
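To make the idea of wrapping an existing data source more concrete, here is a self-contained sketch of a stream built around any Python iterable. It does not subclass or reproduce Spout's Stream class, whose internals are not described here: `IterableStream`, `NonEmpty`, `Upper` and `Collect` are hypothetical, and the `perform` method name for operations is an assumption.

```python
class IterableStream:
    """Hypothetical fluent stream over any iterable (not Spout's Stream)."""

    def __init__(self, items):
        self._items = items

    def map(self, function):
        # Each call returns a new stream, which is what allows chaining.
        return IterableStream(function.apply(i) for i in self._items)

    def filter(self, predicate):
        return IterableStream(i for i in self._items if predicate.test(i))

    def for_each(self, operation):
        # Terminal step: pulls items through the chain and returns nothing.
        for item in self._items:
            operation.perform(item)


class NonEmpty:
    def test(self, item):
        return bool(item)


class Upper:
    def apply(self, item):
        return item.upper()


class Collect:
    def __init__(self):
        self.items = []

    def perform(self, item):
        self.items.append(item)


sink = Collect()
IterableStream(["alpha", "", "beta"]).filter(NonEmpty()).map(Upper()).for_each(sink)
print(sink.items)  # ['ALPHA', 'BETA']
```

Because `map` and `filter` build generators, no item is touched until `for_each` runs, which suits live sources that produce data over time.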

Now that you have your data in a stream, you simply have to process it! This can be done by creating and using your own Functions, Predicates or Operations (see above).

For example, to print out all the lines in a text file that start with a digit, but with the digit stripped, we can create our own Predicate and Function and pass these to the .filter() and .map() methods:

from spout.sources import FileInputStream
from spout.structs import Function, Predicate
from spout.outputs import PrintOperation


class StartsWithDigit(Predicate):
    # Keep only lines whose first character is a digit.
    def test(self, obj):
        # Slicing avoids an IndexError on an empty string.
        return obj[:1].isdigit()


class StripFirstChar(Function):
    # Drop the leading character (the digit) from each line.
    def apply(self, obj):
        return obj[1:]


s = FileInputStream("test.txt")
s.filter(StartsWithDigit()).map(StripFirstChar()).for_each(PrintOperation())

Installation

Spout is available in the Python Package Index (PyPI), and so the easiest way to install it is through pip:

$ pip install spout

However, it is also possible to install Spout from source, using the setup.py script:

$ python setup.py install

Credits

The inspiration for Spout’s fluent interface came largely from the OpenIMAJ streaming framework. http://www.openimaj.org/
