pyStream
pyStream is a Python implementation of the stream library. It enables stream processing of protobuf messages: multiple protobuf messages can be written to, or read from, a single stream. It can parse any file encoded by the stream library and write protobuf instances to a file using the same encoding. Refer to the library GitHub page for more information about the format.
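To make the idea of "multiple messages in one stream" concrete, here is a minimal standard-library sketch of a group-based framing. It assumes each group starts with a varint message count, followed by each message prefixed with its varint byte length; that layout is an assumption made for illustration, and the library GitHub page remains the authoritative description. The helper names (encode_varint, encode_group, decode_groups) are hypothetical and not part of pyStream.

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf-style base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf, pos):
    """Decode a varint from buf starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_group(payloads):
    """Frame serialized messages as one group (assumed layout)."""
    out = bytearray(encode_varint(len(payloads)))
    for payload in payloads:
        out += encode_varint(len(payload))
        out += payload
    return bytes(out)


def decode_groups(buf):
    """Yield each group in buf as a list of raw message byte strings."""
    pos = 0
    while pos < len(buf):
        count, pos = decode_varint(buf, pos)
        group = []
        for _ in range(count):
            size, pos = decode_varint(buf, pos)
            group.append(buf[pos:pos + size])
            pos += size
        yield group
```

In this sketch the payloads are plain byte strings; with real protobuf objects they would be the result of SerializeToString().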
Installation
You can install pyStream using pip:
pip install pystream-protobuf
Usage
Reading
The following sample reads a file containing a set of protobuf messages (here, a set of VG's Alignment objects, a so-called GAM file, defined here). The Alignment class is just an example; any protobuf message class can be used. The parse function yields the protobuf objects stored in the file:
import stream
import vg_pb2
alns = [a for a in stream.parse('test.gam', vg_pb2.Alignment)]
Instead of a file path, an input stream can be passed to the parse method:
import stream
import vg_pb2
# ... an already existing file-like object `f` as input stream.
alns = [a for a in stream.parse(f, vg_pb2.Alignment)]
For more control over opening the stream and reading data, the lower-level open method can be used:
import stream
import vg_pb2
alns_list = []
with stream.open('test.gam', 'rb') as istream:
    for data in istream:
        aln = vg_pb2.Alignment()
        aln.ParseFromString(data)
        alns_list.append(aln)
If the stream is opened without a with statement, it should be closed explicitly by calling its close method (see more examples in the test package).
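As an illustration of closing the stream explicitly, the sketch below wraps the reading loop in try/finally so that close is always called. It only relies on the calls shown above (stream.open, iteration, close, ParseFromString). The function name read_messages and its opener parameter are hypothetical; opener exists only so the function can be tried without the library installed.

```python
def read_messages(path, message_cls, opener=None):
    """Read all messages from path without using a with statement."""
    if opener is None:
        import stream  # provided by the pystream-protobuf package
        opener = stream.open
    istream = opener(path, 'rb')
    messages = []
    try:
        for data in istream:
            msg = message_cls()
            msg.ParseFromString(data)  # data is one serialized message
            messages.append(msg)
    finally:
        # Runs even if parsing fails, so the stream is never left open.
        istream.close()
    return messages
```

With the library installed, a call such as read_messages('test.gam', vg_pb2.Alignment) would mirror the with-based example above.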
The open method is not restricted to files; it can be used with any binary stream by passing a file object, rather than a file name, to open or directly to the Stream class:
# ... an already existing file-like object `f` as input stream.
with stream.open(fileobj=f, mode='rb') as istream:
    pass  # ... continue using istream
Writing
Multiple protobuf objects can be written to a file or any output stream by calling the dump function. The following example writes a list of Alignment objects to a file named test.gam:
import stream
stream.dump('test.gam', *objects_list, buffer_size=10)
To write to an existing output stream, pass any file-like object to dump instead of a file name:
import stream
# ... an already existing file-like object `f` as output stream.
stream.dump(f, *objects_list, buffer_size=10)
The open method offers lower-level control. This example appends a set of messages to the output stream:
import stream
with stream.open('test.gam', 'ab') as ostream:
    ostream.write(*objects_list)
    ostream.write(*another_objects_list)
As with reading, the open method accepts the fileobj argument for any output stream, and the stream can be closed by calling close explicitly, which is needed when the stream is opened without a with statement.
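A possible way to combine write with an explicit close is sketched below: messages from any iterable are appended in batches, one write call per batch, and the stream is closed in a finally clause. Only stream.open, write and close from the examples above are used. The function name append_in_batches and the opener and batch_size parameters are hypothetical, and opener is there purely so the sketch can be exercised without the library.

```python
from itertools import islice


def append_in_batches(path, messages, batch_size, opener=None):
    """Append messages to path, passing batch_size objects per write call."""
    if opener is None:
        import stream  # provided by the pystream-protobuf package
        opener = stream.open
    iterator = iter(messages)
    ostream = opener(path, 'ab')
    written = 0
    try:
        while True:
            batch = list(islice(iterator, batch_size))
            if not batch:
                break
            ostream.write(*batch)
            written += len(batch)
    finally:
        # Close explicitly because no with statement is used here.
        ostream.close()
    return written
```

Because the input is consumed lazily, this pattern also works when the messages come from a generator rather than a list.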
More features
Optional GZip compression
Streams encoded by the Stream library are GZip compressed. The compression can be disabled by passing gzip=False when opening a stream.
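When it is not known in advance whether an input is compressed, one option is to check for the two-byte GZip magic number (0x1f 0x8b) before opening the stream. The following is a minimal sketch using only the standard library; the helper name looks_gzipped is hypothetical, and it assumes a seekable binary file object.

```python
import gzip
import io

GZIP_MAGIC = b'\x1f\x8b'  # first two bytes of every GZip member


def looks_gzipped(fileobj):
    """Return True if the seekable binary fileobj starts with GZip magic."""
    start = fileobj.tell()
    head = fileobj.read(2)
    fileobj.seek(start)  # restore the position for the real reader
    return head == GZIP_MAGIC


compressed = io.BytesIO(gzip.compress(b'payload'))
plain = io.BytesIO(b'payload')
print(looks_gzipped(compressed), looks_gzipped(plain))
```

The result could then be passed as the gzip keyword argument described above when opening the stream.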
Buffered write
By default, all protobuf message objects provided on each call are written as one group of messages (see the Stream library for encoding details). Alternatively, messages can be buffered and written to the stream in groups of a fixed size whenever possible. The buffer size can be set with the keyword argument buffer_size of the open and dump methods, or when the Stream class is constructed (the default size is 0, which means no buffering).
Grouping messages
Messages can be written in groups of varying size by setting the buffer size sufficiently large or to infinity (-1) and calling the flush method of the Stream class whenever a group should end.
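The interaction between buffer_size and flush can be pictured with a small toy model. This is not the library's implementation: the GroupBuffer class below is hypothetical, it stores groups in a list instead of writing to a stream, and how a partially filled buffer is handled is an assumption based on the description above.

```python
class GroupBuffer:
    """Toy model of how buffer_size could shape the written groups."""

    def __init__(self, buffer_size=0):
        self.buffer_size = buffer_size
        self.pending = []  # objects waiting to be grouped
        self.groups = []   # stand-in for groups written to the stream

    def write(self, *objects):
        self.pending.extend(objects)
        if self.buffer_size == 0:
            # No buffering: everything from this call forms one group.
            self.flush()
        elif self.buffer_size > 0:
            # Fixed-size groups: emit full groups, keep the remainder.
            while len(self.pending) >= self.buffer_size:
                self.groups.append(self.pending[:self.buffer_size])
                self.pending = self.pending[self.buffer_size:]
        # buffer_size == -1: keep everything until flush() is called.

    def flush(self):
        """End the current group, if it contains anything."""
        if self.pending:
            self.groups.append(self.pending)
            self.pending = []
```

In this model, a buffer size of -1 combined with flush calls at chosen points yields groups of arbitrary, caller-defined sizes.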
Group delimiter
When reading from a stream, groups of objects can be separated by a delimiter of your choice (None by default). This helps identify the end of a group, which is otherwise hidden from the library user. The feature can be enabled by setting group_delimiter to True when constructing a Stream instance or opening a stream. The delimiter class can also be specified with delimiter_cls.
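One way to consume a delimited stream is to regroup the items into lists. The sketch below works on any iterable and assumes the delimiter (None by default) is yielded after the last item of each group; the exact position of the delimiter in the iteration is an assumption, and the helper name split_groups is hypothetical.

```python
def split_groups(items, delimiter=None):
    """Yield lists of items, cutting at every occurrence of delimiter."""
    group = []
    for item in items:
        if item is delimiter:
            yield group
            group = []
        else:
            group.append(item)
    if group:
        # Trailing items without a closing delimiter form a last group.
        yield group


# Plain byte strings stand in for the data yielded by an opened stream.
sample = [b'a', b'b', None, b'c', None]
print(list(split_groups(sample)))
```

With the library, the same function could be applied to a stream opened with group_delimiter=True, parsing each non-delimiter item with ParseFromString as in the reading examples.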
Development
If you work with the source code and need to build the package, run:
python setup.py build
The proto file in the test module must be compiled before running the test cases. This requires the Google protobuf compiler (>=3.0.2). After installing the protobuf compiler, run:
make init
to compile the proto files required by the test module. Then use the nosetests command of the setup script to execute the test cases:
python setup.py nosetests