Incrementally decode bytes into strings and lines
# An incremental decoder of bytes into characters and lines
## The DecodeAccumulator class
The `DecodeAccumulator` class implements an incremental decoder: an object
that may be fed bytes (one or several at a time) as they are read, e.g.
from a network stream or a subprocess's output, and that appends to a result
string as soon as enough bytes have accumulated to form a character
in the specified encoding.
Note that `DecodeAccumulator` objects are immutable value objects:
the `add()` method does not modify the object it is called on, but returns
a new `DecodeAccumulator` object instead.
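The two ideas above (holding bytes back until they form a whole character, and returning a new object from `add()` instead of modifying the old one) can be sketched with the standard library alone. The `TinyAccumulator` class below is hypothetical and is not the `decode_acc` implementation; it assumes UTF-8 and relies on the `codecs` incremental decoder to report which trailing bytes are still undecoded.

```python
import codecs
from typing import NamedTuple


class TinyAccumulator(NamedTuple):
    """Hypothetical immutable accumulator; NOT the decode_acc implementation."""

    buf: bytes = b""   # bytes that do not yet form a complete character
    decoded: str = ""  # text decoded so far

    def add(self, data: bytes) -> "TinyAccumulator":
        """Return a new accumulator with `data` taken into account."""
        decoder = codecs.getincrementaldecoder("utf-8")()
        text = decoder.decode(self.buf + data, final=False)
        pending = decoder.getstate()[0]  # undecoded trailing bytes
        return TinyAccumulator(buf=pending, decoded=self.decoded + text)


start = TinyAccumulator()
euro = "\u20ac".encode("utf-8")    # the euro sign takes three bytes in UTF-8
partial = start.add(euro[:2])      # not enough bytes for a character yet
complete = partial.add(euro[2:])   # the third byte completes the character
```

Because every call returns a fresh object, `start` and `partial` remain valid snapshots of the earlier states after `complete` has been produced.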
Sample usage, assuming that `acc` is an existing `DecodeAccumulator` object
and that `proc` is a `subprocess.Popen` object started with
`stdout=subprocess.PIPE`:

    import sys

    while True:
        bb = proc.stdout.read(1024)
        if len(bb) == 0:
            # end of the subprocess's output
            break
        acc = acc.add(bb)
        assert(not acc.done)
        if acc.splitter.lines:
            # at least one full line was produced
            (acc, lines) = acc.pop_lines()
            print('\n'.join(lines))

    if acc.buf:
        print('Leftover bytes left in the buffer!', file=sys.stderr)

    if acc.splitter.buf:
        print('Incomplete line: ' + acc.splitter.buf)

    # signal the end of the input
    final = acc.add(None)
    assert(final.splitter.buf == '')
    assert(final.splitter.done)
    assert(final.done)
    if acc.splitter.buf:
        assert(len(final.splitter.lines) == len(acc.splitter.lines) + 1)
## The splitter classes: UniversalNewlines, FixedEOLSplitter, NullSplitter
The `decode_acc.newlines` module provides three classes that may be used to
split a text string into lines in different ways. The `UniversalNewlines`
class does its best to simulate the "universal newlines" behavior of `file`
objects. The `FixedEOLSplitter` class uses a specified string as a line
terminator to split on. The `NullSplitter` class does not do any splitting.
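As an illustration of splitting on a fixed terminator while text arrives in pieces, here is a minimal sketch. The `TinySplitter` class is hypothetical and is not the `FixedEOLSplitter` implementation; it only borrows the attribute names `buf`, `lines`, and `done` and the `add(None)` end-of-input convention that appear in the samples in this document.

```python
from typing import NamedTuple, Optional, Tuple


class TinySplitter(NamedTuple):
    """Hypothetical fixed-terminator splitter; NOT the decode_acc code."""

    eol: str = "\n"
    buf: str = ""                # text of the current, still incomplete line
    lines: Tuple[str, ...] = ()  # complete lines with the terminator removed
    done: bool = False

    def add(self, text: Optional[str]) -> "TinySplitter":
        """Return a new splitter; `None` signals the end of the input."""
        if text is None:
            # Flush a trailing incomplete line, if there is one.
            lines = self.lines + ((self.buf,) if self.buf else ())
            return self._replace(buf="", lines=lines, done=True)
        # Everything before the last terminator is a complete line;
        # the final piece is kept until more text arrives.
        *full, rest = (self.buf + text).split(self.eol)
        return self._replace(buf=rest, lines=self.lines + tuple(full))


spl = TinySplitter(eol="\r\n")
for chunk in ("first\r", "\nsecond\r\nthi", "rd"):
    spl = spl.add(chunk)
before_end = spl
spl = spl.add(None)
```

Keeping the unfinished tail in `buf` is what allows a terminator split across two chunks (the `"\r"` and `"\n"` above) to be recognized once both halves have arrived.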
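Sample usage, assuming that `input_string` holds the text to split:

    from decode_acc import newlines

    spl = newlines.UniversalNewlines()
    for char in input_string:
        spl = spl.add(char)
    # keep the object returned for the end-of-input marker
    spl = spl.add(None)
    for (idx, line) in enumerate(spl.lines):
        print('line {idx}: {line}'.format(idx=idx, line=line))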
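For comparison, the "universal newlines" behavior that `UniversalNewlines` is described as simulating can be observed with the standard library's text I/O layer. This sketch does not use `decode_acc` at all; it only shows that, when reading with `newline=None`, the terminators `"\n"`, `"\r\n"`, and a bare `"\r"` are all translated to `"\n"`.

```python
import io

# Input mixing all three conventions: "\n", "\r\n", and a bare "\r".
raw = io.BytesIO(b"one\ntwo\r\nthree\rfour")

# newline=None turns on universal newlines translation when reading.
with io.TextIOWrapper(raw, encoding="utf-8", newline=None) as text:
    translated = text.read()

lines = translated.split("\n")
```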