
Streamly is a very simple yet powerful wrapper for streams.

Project description

Streamly is a very simple yet powerful wrapper for streams (file-like objects). It is primarily designed to help clean up flat data on the fly during read operations.

A typical use case, especially prevalent in digital marketing, is downloading a web stream and uploading it to a target location that expects clean, flat, delimited data when the stream also contains unwanted header and footer data. Developers often deal with this by loading the data as-is into an interim location, then opening the file and culling the unwanted leading and trailing lines. This approach works, but it has limitations: it is not easily reproducible, it increases the complexity of the solution, it assumes a storage component, and it is inefficient with large data sets.
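For comparison, the interim-file workaround described above might look like the following sketch. It uses only the standard library; the function name `cull_via_interim_file` and the fixed line counts are hypothetical and exist only for illustration.

```python
import os
import tempfile


def cull_via_interim_file(raw: bytes, header_lines: int, footer_lines: int) -> bytes:
    """Land the data in a temporary file, re-read it, then drop the
    unwanted leading and trailing lines (hypothetical helper)."""
    # Step 1: write the stream as-is to interim storage.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(raw)
        path = tmp.name
    try:
        # Step 2: re-open the file and load every line into memory.
        with open(path, "rb") as fh:
            lines = fh.readlines()
    finally:
        os.remove(path)
    # Step 3: cull the leading and trailing lines.
    end = len(lines) - footer_lines
    return b"".join(lines[header_lines:end])


raw = b"Header\nMetadata\ncol1,col2\na,b\nc,d\nGrand Total:,0\nFooter\n"
cleaned = cull_via_interim_file(raw, header_lines=2, footer_lines=2)
print(cleaned)
```

Note that the whole payload is written to disk and then held in memory at once, and the line counts must be known in advance, which is where the limitations listed above come from.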

Streamly solves this problem by removing the unwanted headers and footers on the fly, as the stream is read, so no interim copy is needed.
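To illustrate the general idea of on-the-fly trimming, here is a minimal standard-library sketch. It is not Streamly's implementation: the function `trim_on_the_fly`, its parameters, and its semantics (drop everything up to and including a start marker, and everything from an end marker onward) are assumptions made for this example. Both markers are assumed to be non-empty.

```python
import io
from typing import BinaryIO, Iterator


def trim_on_the_fly(stream: BinaryIO, start_marker: bytes, end_marker: bytes,
                    chunk_size: int = 16) -> Iterator[bytes]:
    """Yield only the bytes between start_marker and end_marker (hypothetical helper)."""
    buffer = b""
    in_body = False
    while True:
        chunk = stream.read(chunk_size)
        buffer += chunk
        if not in_body:
            pos = buffer.find(start_marker)
            if pos == -1:
                if not chunk:
                    return  # stream ended before the start marker appeared
                # Keep only a tail that could hold the start of a split marker.
                keep = len(start_marker) - 1
                buffer = buffer[-keep:] if keep else b""
                continue
            buffer = buffer[pos + len(start_marker):]
            in_body = True
        pos = buffer.find(end_marker)
        if pos != -1:
            if pos:
                yield buffer[:pos]
            return  # everything from the end marker onward is discarded
        if not chunk:
            if buffer:
                yield buffer  # stream ended without an end marker
            return
        # Emit all but a tail that could hold the start of a split end marker.
        safe = len(buffer) - (len(end_marker) - 1)
        if safe > 0:
            yield buffer[:safe]
            buffer = buffer[safe:]


raw = (b"Header\nJunk\nReport Fields:\n"
       b"col1,col2\na,b\nc,d\n"
       b"Grand Total:,0\nFooter\n")
result = b"".join(trim_on_the_fly(io.BytesIO(raw), b"Report Fields:\n", b"Grand", chunk_size=8))
print(result)
```

Only about one chunk plus one marker length is buffered at any time, so memory use does not grow with the size of the stream.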

Documentation: https://streamly.readthedocs.io

Installation

Requires Python 3.1+

With pipenv

Install:

pipenv install streamly

OR Update:

pipenv update streamly

With pip

Install & Update:

pip install streamly --upgrade

Example Usage

The example below reads a byte stream in 50-byte chunks and prints each chunk, removing the unwanted header and footer details on the fly.

import io

import streamly


my_stream = io.BytesIO(
b"""Header
Metadata
Unwanted
=
Garbage

Report Fields:
col1,col2,col3,col4,col5
data,that,we,actually,want
and,potentially,loads,of,it,
foo,bar,baz,lorem,ipsum
foo,bar,baz,lorem,ipsum
foo,bar,baz,lorem,ipsum
...,...,...,...,...
Grand Total:,0,0,1000,0
More
Footer
Garbage
"""
)

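# Wrap the raw stream, passing the byte strings that identify the header row and the footer.
wrapped_stream = streamly.Streamly(my_stream, header_row_identifier=b"Report Fields:\n",
                                   footer_identifier=b"Grand")

# Read in 50-byte chunks until the wrapped stream is exhausted.
data = wrapped_stream.read(50)
while data:
    print(data)
    data = wrapped_stream.read(50)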
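To send the chunks to a file instead of printing them, the same read loop can feed a file handle. The sketch below uses a plain `io.BytesIO` as a stand-in for the wrapped stream so that it runs without Streamly installed; the target path and the 8-byte chunk size are arbitrary choices for illustration.

```python
import io
import os
import tempfile

# Stand-in for the wrapped stream: any object with a read(size) method works here.
source = io.BytesIO(b"col1,col2\na,b\nc,d\n")

# Hypothetical target location inside a fresh temporary directory.
target_path = os.path.join(tempfile.mkdtemp(), "clean.csv")

with open(target_path, "wb") as target:
    chunk = source.read(8)
    while chunk:  # an empty bytes object signals that the source is exhausted
        target.write(chunk)
        chunk = source.read(8)

# Read the file back to confirm what was written.
with open(target_path, "rb") as check:
    written = check.read()
print(written)
```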

Features

Streamly provides the following functionality during on-the-fly read operations:

  • Adjoining of multiple streams
  • Removal of header and footer data, identified by a value (e.g. byte string or string)
  • Logging of read progress
  • Guaranteed read size (where the data is not yet exhausted)
  • Consistent API for streams returning byte strings or strings
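Two of these features, adjoining multiple streams and a guaranteed read size, can be illustrated with a short standard-library sketch. This is not Streamly's code or API: `JoinedReader` and `TrickleStream` are hypothetical names, and the sketch assumes that an empty read means the underlying stream is exhausted.

```python
import io


class TrickleStream:
    """Test double whose read() returns at most 3 bytes per call,
    mimicking a source that delivers short reads."""

    def __init__(self, data: bytes) -> None:
        self._buf = io.BytesIO(data)

    def read(self, size: int = -1) -> bytes:
        return self._buf.read(min(size, 3) if size >= 0 else 3)


class JoinedReader:
    """Read several streams back to back, returning exactly `size` bytes
    per call until all of the data is exhausted (hypothetical helper)."""

    def __init__(self, *streams) -> None:
        self._streams = list(streams)

    def read(self, size: int) -> bytes:
        parts = []
        remaining = size
        while remaining > 0 and self._streams:
            data = self._streams[0].read(remaining)
            if data:
                parts.append(data)
                remaining -= len(data)
            else:
                # Current stream is exhausted; continue with the next one.
                self._streams.pop(0)
        return b"".join(parts)


reader = JoinedReader(TrickleStream(b"abcdefg"), TrickleStream(b"hijkl"))
chunks = []
chunk = reader.read(5)
while chunk:
    chunks.append(chunk)
    chunk = reader.read(5)
print(chunks)  # [b'abcde', b'fghij', b'kl']
```

Every chunk except the last is exactly 5 bytes, even though each underlying read returns at most 3 bytes and the second chunk spans both streams.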

Project details


This version

0.3
