Facilities to do with buffers, particularly CornuCopyBuffer, an automatically refilling buffer to support parsing of data streams.
Latest release 20191230: CornuCopyBuffer: accept a size of Ellipsis in .take and .extend methods, indicating "all the remaining data". CornuCopyBuffer: refactor the buffering, replacing .buf with .bufs as an array of chunks; this enables support for the new .push method and reduces memory copying.
Function chunky(bfr_func)
Decorator for a function accepting a leading CornuCopyBuffer parameter.
Returns a function accepting a leading data chunks parameter (bytes instances) and optional offset and copy_offsets keyword parameters.
Example::
    @chunky
    def func(bfr, ...):
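The wrapping described above could look roughly like the following. This is a sketch under assumptions, not the library's code: `StubBuffer` and `chunky_sketch` are hypothetical stand-ins, and the default of 0 for `offset` is assumed from the CornuCopyBuffer constructor signature.

```python
from functools import wraps


class StubBuffer:
    """Hypothetical stand-in for CornuCopyBuffer, just enough for this demo."""

    def __init__(self, chunks, offset=0, copy_offsets=None):
        self.chunks = iter(chunks)
        self.offset = offset
        self.copy_offsets = copy_offsets

    def __iter__(self):
        # Iterating the buffer yields the remaining input chunks.
        return self.chunks


def chunky_sketch(bfr_func):
    """Turn a function taking a buffer into one taking an iterable of chunks."""

    @wraps(bfr_func)
    def chunks_func(chunks, *a, offset=0, copy_offsets=None, **kw):
        # Build the buffer from the chunks, then call the wrapped function.
        bfr = StubBuffer(chunks, offset=offset, copy_offsets=copy_offsets)
        return bfr_func(bfr, *a, **kw)

    return chunks_func


@chunky_sketch
def total_length(bfr):
    # The decorated function is written against the buffer interface.
    return sum(len(chunk) for chunk in bfr)


n = total_length([b"abc", b"de"])
```

The caller passes plain chunks while the function body only ever sees a buffer, which is the convenience the decorator is described as providing.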
Class CopyingIterator
Wrapper for an iterator that copies every item retrieved to a callable.
Method CopyingIterator.__init__(self, I, copy_to)
Initialise with the iterator I and the callable copy_to.
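As an illustration of the idea only, a wrapper of this kind might be written as below. `CopyingIteratorSketch` is a hypothetical name, not the class shipped by the package.

```python
class CopyingIteratorSketch:
    """Hypothetical stand-in: pass each item to copy_to as it is retrieved."""

    def __init__(self, I, copy_to):
        self.I = iter(I)
        self.copy_to = copy_to

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self.I)  # StopIteration propagates when the source ends
        self.copy_to(item)   # hand a reference to the observer callable
        return item


seen = []
chunks = list(CopyingIteratorSketch([b"ab", b"cd"], seen.append))
```

Here `seen` ends up holding the same items the consumer received, in the same order.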
Class CornuCopyBuffer
An automatically refilling buffer intended to support parsing of data streams.
Attributes:
buf: a buffer of unparsed data from the input, available for direct inspection by parsers
offset: the logical offset of the buffer; this excludes unconsumed input data and .buf
The primary methods supporting parsing of data streams are extend() and take(). Calling .extend(min_size) arranges that .buf contains at least min_size bytes. Calling .take(size) fetches exactly size bytes from .buf and the input source if necessary and returns them, adjusting .buf.
len(CornuCopyBuffer) returns the length of .buf.
bool(CornuCopyBuffer) tests whether len() > 0.
Indexing a CornuCopyBuffer accesses .buf.
A CornuCopyBuffer is also iterable, yielding data in whatever sizes come from its input_data source, preceded by the current .buf if not empty.
A CornuCopyBuffer also supports the file methods .read, .tell and .seek, supporting drop-in use of the buffer in many file contexts. Backward seeks are not supported. .seek will take advantage of the input_data's .seek method if it has one, otherwise it will use reads.
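The extend/take behaviour described above can be illustrated with a toy class. This is a minimal sketch, not the cs.buffer implementation: `MiniBuffer` is a hypothetical name, it keeps a single bytes object rather than a list of chunks, and raising EOFError on short input is an assumption made for the demo.

```python
class MiniBuffer:
    """Toy illustration of extend/take semantics; not the real class."""

    def __init__(self, input_data):
        self.input_data = iter(input_data)
        self.buf = b""
        self.offset = 0  # logical offset of the start of .buf

    def __len__(self):
        return len(self.buf)

    def extend(self, min_size):
        """Pull chunks from the source until .buf holds at least min_size bytes."""
        while len(self.buf) < min_size:
            try:
                chunk = next(self.input_data)
            except StopIteration:
                raise EOFError(
                    "wanted %d bytes, only %d available"
                    % (min_size, len(self.buf))
                ) from None
            self.buf += chunk

    def take(self, size):
        """Return exactly size bytes, refilling as needed, and advance offset."""
        self.extend(size)
        taken, self.buf = self.buf[:size], self.buf[size:]
        self.offset += size
        return taken


bfr = MiniBuffer([b"abc", b"defg", b"hi"])
header = bfr.take(2)  # served entirely from the first chunk
body = bfr.take(4)    # spans the first and second chunks
```

A parser written this way never needs to know where the chunk boundaries of the input fall; it asks for the sizes its format requires.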
Method CornuCopyBuffer.__init__(self, input_data, buf=None, offset=0, seekable=None, copy_offsets=None, copy_chunks=None)
Prepare the buffer.
Parameters:
input_data: an iterable of data chunks (bytes instances); if your data source is a file see the .from_file factory; if your data source is a file descriptor see the .from_fd factory.
buf: if not None, the initial state of the parse buffer
offset: logical offset of the start of the buffer, default 0
seekable: whether input_data has a working .seek method; the default is None meaning that it will be attempted on the first skip or seek
copy_offsets: if not None, a callable for parsers to report pertinent offsets via the buffer's .report_offset method
copy_chunks: if not None, every fetched data chunk is copied to this callable
The input_data is an iterable whose iterator may have some optional additional properties:
seek: if present, this is a seek method after the fashion of file.seek; the buffer's seek, skip and skipto methods will take advantage of this if available.
offset: the current byte offset of the iterator; this is used during the buffer initialisation to compute input_data_displacement, the difference between the buffer's logical offset and the input data's logical offset; if unavailable during initialisation this is presumed to be 0.
end_offset: the end offset of the iterator if known.
Class FDIterator
MRO: _Iterator
An iterator over the data of a file descriptor.
Note: the iterator works with an os.dup() of the file descriptor so that it can close it with impunity; this requires the caller to close their descriptor.
Method FDIterator.__init__(self, fd, offset=None, readsize=None, align=True)
Initialise the iterator.
Parameters:
fd: file descriptor
offset: the initial logical offset, kept up to date by iteration; the default is the current file position.
readsize: a preferred read size; if omitted then DEFAULT_READSIZE will be stored
align: whether to align reads by default: if true then the iterator will do a short read to bring the offset into alignment with readsize; the default is True
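The combination of a private os.dup() and aligned reads can be sketched as follows. This is an illustration under assumptions rather than the package's code: `fd_chunks` is a hypothetical helper, it takes an explicit starting offset, and it uses os.pread (Unix only) so that the caller's file position is left alone.

```python
import os
import tempfile


def fd_chunks(fd, offset=0, readsize=8, align=True):
    """Yield (offset, chunk) pairs read from a private dup of fd."""
    fd2 = os.dup(fd)  # our own descriptor, safe to close when finished
    try:
        while True:
            size = readsize
            if align and offset % readsize:
                # Short read to bring the offset onto a readsize boundary.
                size = readsize - offset % readsize
            chunk = os.pread(fd2, size, offset)
            if not chunk:
                break
            yield offset, chunk
            offset += len(chunk)
    finally:
        os.close(fd2)  # the caller's fd is untouched and remains theirs to close


with tempfile.TemporaryFile() as f:
    f.write(b"0123456789abcdefghij")  # 20 bytes
    f.flush()
    reads = list(fd_chunks(f.fileno(), offset=3, readsize=8))
```

Starting at offset 3 with a read size of 8, the first read is 5 bytes long, after which every read begins on a multiple of 8.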
Class FileIterator
MRO: _Iterator, SeekableIteratorMixin
An iterator over the data of a file object.
Note: the iterator closes the file on __del__ or if its .close method is called.
Method FileIterator.__init__(self, fp, offset=None, readsize=None, align=False)
Initialise the iterator.
Parameters:
fp: file object
offset: the initial logical offset, kept up to date by iteration; the default is 0.
readsize: a preferred read size; if omitted then DEFAULT_READSIZE will be stored
align: whether to align reads by default: if true then the iterator will do a short read to bring the offset into alignment with readsize; the default is False
Class SeekableFDIterator
MRO: FDIterator, _Iterator, SeekableIteratorMixin
An iterator over the data of a seekable file descriptor.
Note: the iterator works with an os.dup() of the file descriptor so that it can close it with impunity; this requires the caller to close their descriptor.
Class SeekableFileIterator
MRO: FileIterator, _Iterator, SeekableIteratorMixin
An iterator over the data of a seekable file object.
Note: the iterator closes the file on __del__ or if its .close method is called.
Method SeekableFileIterator.__init__(self, fp, offset=None, **kw)
Initialise the iterator.
Parameters:
fp: file object
offset: the initial logical offset, kept up to date by iteration; the default is the current file position.
readsize: a preferred read size; if omitted then DEFAULT_READSIZE will be stored
align: whether to align reads by default: if true then the iterator will do a short read to bring the offset into alignment with readsize; the default is False
Class SeekableIteratorMixin
Mixin supplying a logical offset with a seek method.
Class SeekableMMapIterator
MRO: _Iterator, SeekableIteratorMixin
An iterator over the data of a mappable file descriptor.
Note: the iterator works with an mmap of an os.dup() of the file descriptor so that it can close it with impunity; this requires the caller to close their descriptor.
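The arrangement in the note, a memory map over a duplicated descriptor, can be sketched with the standard library alone. This is an assumption-laden illustration and not the iterator's actual code; the file contents and slice bounds are invented for the demo.

```python
import mmap
import os
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"header:payload")
    f.flush()
    fd2 = os.dup(f.fileno())  # private descriptor owned by this sketch
    mm = mmap.mmap(fd2, 0, access=mmap.ACCESS_READ)  # length 0 maps the whole file
    try:
        # Slicing a memoryview of the map does not copy the data;
        # bytes() makes the single copy kept after the map is closed.
        with memoryview(mm) as view:
            payload = bytes(view[7:])
    finally:
        mm.close()
        os.close(fd2)  # the caller's descriptor is still theirs to close
```

Slices of the map are served from the mapped pages rather than by explicit read calls, which is the property that makes this approach attractive for large files.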
Method SeekableMMapIterator.__init__(self, fd, offset=None, readsize=None, align=True)
Initialise the iterator.
Parameters:
offset: the initial logical offset, kept up to date by iteration; the default is the current file position.
readsize: a preferred read size; if omitted then DEFAULT_READSIZE will be stored
align: whether to align reads by default: if true then the iterator will do a short read to bring the offset into alignment with readsize; the default is True
Release Log
Release 20191230: CornuCopyBuffer: accept a size of Ellipsis in .take and .extend methods, indicating "all the remaining data". CornuCopyBuffer: refactor the buffering, replacing .buf with .bufs as an array of chunks; this enables support for the new .push method and reduces memory copying.
Release 20181231: Small bugfix.
Release 20181108: New at_eof() method. Python 2 tweak to support incidental import by python 2 even if unused.
Release 20180823: Better handling of seekable and unseekable input data. Tiny bugfix for from_bytes sanity check.
Release 20180810:
Refactor SeekableFDIterator and SeekableFileIterator to subclass new SeekableIterator.
New SeekableMMapIterator to process a memory mapped file descriptor, intended for large files.
New CornuCopyBuffer.hint method to pass a length hint through to the input_data iterator if it has a hint method, causing it possibly to make a differently sized fetch.
SeekableIterator: new __del__ method calling self.close(); subclasses must provide a .close, which should be safe to call multiple times.
CornuCopyBuffer: add support for .offset and .end_offset optional attributes on the input_data iterator.
_BoundedBufferIterator: add .offset property plumbed to the underlying buffer offset.
New CornuCopyBuffer.from_mmap to make a mmap backed buffer so that large data can be returned without penalty.
Assorted fixes and doc improvements.
Release 20180805: Bugfixes for at_eof method and end_offset initialisation.
Release 20180726.1: Improve docstrings and release with better long_description.
Release 20180726: First PyPI release: CornuCopyBuffer and friends.