Incrementally decode bytes into strings and lines
# An incremental decoder of bytes into characters and lines
## The DecodeAccumulator class
The `DecodeAccumulator` class implements an incremental decoder: an object
that may be fed bytes (one or several at a time) as they are read, e.g.
from a network stream or a subprocess's output, and that extends a result
string as soon as enough bytes have been accumulated to produce a character
in the specified encoding.
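
The general idea of incremental decoding can be illustrated without this package, using the standard library's `codecs` module. This is only a sketch of the concept and makes no claim about how `DecodeAccumulator` is implemented; the sample string is an arbitrary choice. Feeding a UTF-8 byte sequence one byte at a time produces no output for the first byte of a two-byte character, and the whole character once the second byte arrives:

```python
import codecs

# Standard-library incremental decoder; not part of decode_acc.
decoder = codecs.getincrementaldecoder('utf-8')()

# 'h', e-acute, 'l', 'l', 'o': the e-acute takes two bytes in UTF-8.
data = 'h\u00e9llo'.encode('utf-8')

# Feed the bytes one at a time and record what each call returns.
pieces = [decoder.decode(bytes([b])) for b in data]

# The second call yields an empty string: the character is still incomplete.
print(pieces)
```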
Note that `DecodeAccumulator` objects are immutable value objects:
the `add()` method does not modify its invocant, but returns a new
`DecodeAccumulator` object instead.
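
The "immutable value object" pattern described above can be sketched with a small stand-in class. `TinyAccumulator` and its `decoded` field are hypothetical names invented for this illustration (only `buf` and `add()` echo names used in this document), the encoding is fixed to UTF-8 for simplicity, and the real `DecodeAccumulator` may well work differently:

```python
import codecs
from typing import NamedTuple


class TinyAccumulator(NamedTuple):
    """Hypothetical stand-in; not the real DecodeAccumulator."""

    buf: bytes = b''      # bytes that do not yet form a full character
    decoded: str = ''     # text decoded so far

    def add(self, data: bytes) -> 'TinyAccumulator':
        # Decode as much as possible and keep the incomplete tail as bytes.
        dec = codecs.getincrementaldecoder('utf-8')()
        text = dec.decode(self.buf + data, final=False)
        leftover = dec.getstate()[0]
        # Return a new object; `self` is left untouched.
        return TinyAccumulator(buf=leftover, decoded=self.decoded + text)


empty = TinyAccumulator()
first = empty.add(b'\xc3')    # first half of a two-byte character
second = first.add(b'\xa9')   # second half completes it
```

Because each call returns a fresh object, `empty` and `first` remain valid snapshots of earlier states, which is why the result of `add()` must always be assigned to a name.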
Sample usage:

    import sys

    # Assumptions: `proc` is a subprocess.Popen object started with
    # stdout=subprocess.PIPE, and `acc` is a DecodeAccumulator created earlier.
    while True:
        bb = proc.stdout.read(1024)
        if len(bb) == 0:
            break
        acc = acc.add(bb)
        assert(not acc.done)
        if acc.splitter.lines:
            # at least one full line was produced
            (acc, lines) = acc.pop_lines()
            print('\n'.join(lines))

    if acc.buf:
        print('Leftover bytes left in the buffer!', file=sys.stderr)

    if acc.splitter.buf:
        print('Incomplete line: ' + acc.splitter.buf)

    # Passing None signals the end of the input.
    final = acc.add(None)
    assert(final.splitter.buf == '')
    assert(final.splitter.done)
    assert(final.done)
    if acc.splitter.buf:
        assert(len(final.splitter.lines) == len(acc.splitter.lines) + 1)

## The splitter classes: UniversalNewlines, FixedEOLSplitter, NullSplitter
The `decode_acc.newlines` module provides three classes that may be used to
split a text string into lines in different ways. The `UniversalNewlines`
class does its best to simulate the "universal newlines" behavior of `file`
objects. The `FixedEOLSplitter` class uses a specified string as a line
terminator to split on. The `NullSplitter` class does not do any splitting.
Sample usage:
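
    from decode_acc import newlines

    # `input_string` is assumed to be a str holding the text to split.
    spl = newlines.UniversalNewlines()
    for char in input_string:
        spl = spl.add(char)
    # As in the loop above, keep the object returned by the final add() call.
    spl = spl.add(None)
    for (idx, line) in enumerate(spl.lines):
        print('line {idx}: {line}'.format(idx=idx, line=line))
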
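
The difference between the three splitting strategies can be approximated with plain string operations. This sketch does not use the `decode_acc.newlines` classes at all: they work incrementally and may treat edge cases differently, and the sample text and the `'\r\n'` terminator are arbitrary choices made for the illustration.

```python
text = 'one\r\ntwo\rthree\nfour'

# Universal-newlines style: str.splitlines() treats '\n', '\r' and '\r\n'
# (and a few other separators) as line ends.
universal = text.splitlines()

# Fixed-terminator style: only the exact string '\r\n' ends a line.
fixed = text.split('\r\n')

# No splitting at all: the whole text stays in one piece.
unsplit = [text]

print(universal)
print(fixed)
print(unsplit)
```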