Skip to main content

Streamly is a very simple yet powerful wrapper for streams.

Project description

Streamly is a very simple yet powerful wrapper for streams (file-like objects). It is primarily designed to help with the cleaning up of flat data during on the fly read operations.

A typical use case that is especially prevalent within digital marketing, is wanting to download/upload a web stream to some target location that expects clean, flat delimited data but the stream includes unwanted header and footer data. Developers often deal with this by loading the data as-is into an interim location and then opening the file and culling the unwanted leading and trailing lines. This approach works but limitations include: not easily reproducible; increases the complexity of the solution; assumes a storage component; inefficient with large data sets.

Streamly solves this problem by handling the unwanted headers and footers on the fly in a highly efficient manner.

Documentation: https://streamly.readthedocs.io

Installation

Requires Python 3.1+

With pipenv

Install:

pipenv install streamly

OR Update:

pipenv update streamly

With pip

Install & Update:

pip install streamly --upgrade

Example Usage

The below example writes a byte stream to a file, removing the unwanted header and footer details on the fly.

import io

import streamly


my_stream = io.BytesIO(
b"""Header
Metadata
Unwanted
=
Garabage

Report Fields:
col1,col2,col3,col4,col5
data,that,we,actually,want
and,potentially,loads,of,it,
foo,bar,baz,lorem,ipsum
foo,bar,baz,lorem,ipsum
foo,bar,baz,lorem,ipsum
...,...,...,...,...
Grand Total:,0,0,1000,0
More
Footer
Garbage
"""
)

wrapped_stream = streamly.Streamly(my_stream, header_row_identifier=b"Report Fields:\n",
                                   footer_identifier=b"Grand")

data = wrapped_stream.read(50)
while data:
    print(data)
    data = wrapped_stream.read(50)

Features

Includes the following functionality during on the fly read operations:

  • Adjoining of multiple streams

  • Removal of header and footer data, identified by a value (e.g. byte string or string)

  • Logging of read progress

  • Guaranteed read size (where the data is not yet exhausted)

  • Consistent API for streams returning byte strings or strings

Project details


Release history Release notifications | RSS feed

This version

0.3

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

Streamly-0.3.tar.gz (6.5 kB view hashes)

Uploaded Source

Built Distribution

Streamly-0.3-py3-none-any.whl (6.7 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page