Stuff to do with counters, sequences and iterables.
Latest release 20251231.1: range: support range[type] returning a GenericAlias.
Note that any function accepting an iterable will consume some or all of the derived iterator in the course of its function.
Short summary:
- `ClonedIterator`: A thread safe clone of some original iterator.
- `common_prefix_length`: Return the length of the common prefix of sequences `seqs`.
- `common_suffix_length`: Return the length of the common suffix of sequences `seqs`.
- `first`: Return the first item from an iterable; raise `IndexError` on empty iterables.
- `get0`: Return first element of an iterable, or the default.
- `greedy`: A decorator or function for greedy computation of iterables.
- `imerge`: Merge an iterable of ordered iterables in order.
- `infill`: A generator accepting an iterable of objects which yields `(obj,missing_keys)` 2-tuples indicating missing records requiring infill for each object.
- `infill_from_batches`: A batched version of `infill(objs)` accepting an iterable of batches of objects which yields `(obj,obj_key)` 2-tuples indicating missing records requiring infill for each object.
- `isordered`: Test whether an iterable is ordered. Note that the iterable is iterated, so this is a destructive test for nonsequences.
- `last`: Return the last item from an iterable; raise `IndexError` on empty iterables.
- `not_none`: Filter the iterables for items which are not `None`.
- `onetomany`: A decorator for a method of a sequence to merge the results of passing every element of the sequence to the function, expecting multiple values back.
- `onetoone`: A decorator for a method of a sequence to merge the results of passing every element of the sequence to the function, expecting a single value back.
- `range`: A class like the builtin `range` except that it will accept `...` as the stop value, indicating an unbound range. Note that if initialised like a normal range it returns a builtin `range` instance.
- `Seq`: A numeric sequence implemented as a thread safe wrapper for `itertools.count()`.
- `seq`: Return a new sequential value.
- `skip_map`: A version of `map()` which will skip items where `func(item)` raises an exception in `except_types`, a tuple of exception types. If a skipped exception occurs a warning will be issued unless `quiet` is true (default `False`).
- `splitoff`: Split a sequence into (usually short) prefixes and a tail, for example to construct subdirectory trees based on a UUID.
- `StatefulIterator`: A trivial iterator which wraps another iterator to expose some tracking state.
- `tee`: A generator yielding the items from an iterable which also copies those items to a series of queues.
- `the`: Returns the first element of an iterable, but requires there to be exactly one.
- `TrackingCounter`: A wrapper for a counter which can be incremented and decremented.
- `unrepeated`: A generator yielding items from the iterable `it` with no repetitions.
Module contents:
- class `ClonedIterator(collections.abc.Iterable, typing.Generic)`: A thread safe clone of some original iterator.

  `next()` of this yields the next item from the supplied iterator.
  `iter()` of this returns a generator yielding from the historic items and then from the original iterator.

  Note that this accrues all of the items from the original iterator in memory.
ClonedIterator.__init__(self, it: Iterable):
Initialise the clone with the iterable it.
ClonedIterator.__iter__(self):
Iterate over the clone, returning a new iterator.
In mild violation of the iterator protocol, instead of
returning self, iter(self) returns a generator yielding
the historic and then current contents of the original iterator.
ClonedIterator.__next__(self):
Return the next item from the original iterator.
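The replay behaviour described above can be sketched without the library. This is a minimal illustration, not the `cs.seq` implementation: the class name `HistoryClone` and its attributes are hypothetical, and it assumes a single lock is enough to protect both the history and the source iterator.

```python
import threading


class HistoryClone:
    """Hypothetical sketch: remember every item drawn from a source iterator."""

    def __init__(self, it):
        self._it = iter(it)           # the original iterator
        self._seen = []               # every item fetched so far
        self._lock = threading.Lock()

    def __next__(self):
        # Fetch a fresh item from the source and record it.
        with self._lock:
            item = next(self._it)
            self._seen.append(item)
            return item

    def __iter__(self):
        # Return a new generator: historic items first, then fresh ones.
        def replay():
            i = 0
            while True:
                with self._lock:
                    if i < len(self._seen):
                        item = self._seen[i]
                    else:
                        try:
                            item = next(self._it)
                        except StopIteration:
                            return
                        self._seen.append(item)
                i += 1
                yield item

        return replay()


clone = HistoryClone("abc")
first_item = next(clone)      # consumes "a" from the source
first_pass = list(clone)      # replays "a", then fetches "b" and "c"
second_pass = list(clone)     # replays everything from history
```

Because every item is kept in the history list, memory use grows with the length of the source, which is the trade-off the note above warns about.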
- `common_prefix_length(*seqs)`: Return the length of the common prefix of sequences `seqs`.
- `common_suffix_length(*seqs)`: Return the length of the common suffix of sequences `seqs`.
- `first(iterable)`: Return the first item from an iterable; raise `IndexError` on empty iterables.
- `get0(iterable, default=None)`: Return first element of an iterable, or the default.
- `greedy(g=None, queue_depth=0)`: A decorator or function for greedy computation of iterables.

  If `g` is omitted or callable this is a decorator for a generator function causing it to compute greedily, capacity limited by `queue_depth`.

  If `g` is iterable this function dispatches it in a `Thread` to compute greedily, capacity limited by `queue_depth`.

  Example with an iterable:

      for packet in greedy(parse_data_stream(stream)):
          ... process packet ...

  which does some readahead of the stream.

  Example as a function decorator:

      @greedy
      def g(n):
          for item in range(n):
              yield item

  This can also be used directly on an existing iterable:

      for item in greedy(range(n)):
          yield item

  Normally a generator runs on demand. This function dispatches a `Thread` to run the iterable (typically a generator), putting yielded values onto a queue, and returns a new generator yielding from the queue.

  The `queue_depth` parameter specifies the depth of the queue and therefore how many values the original generator can compute before blocking at the queue's capacity.

  The default `queue_depth` is `0` which creates a `Channel` as the queue - a zero storage buffer - which lets the generator compute only a single value ahead of time.

  A larger `queue_depth` allocates a `Queue` with that much storage, allowing the generator to compute as many as `queue_depth+1` values ahead of time.

  Here's a comparison of the behaviour:
  Example without `@greedy` where the "yield 1" step does not occur until after the "got 0":

      >>> from time import sleep
      >>> def g():
      ...     for i in range(2):
      ...         print("yield", i)
      ...         yield i
      ...     print("g done")
      ...
      >>> G = g(); sleep(0.1)
      >>> for i in G:
      ...     print("got", i)
      ...     sleep(0.1)
      ...
      yield 0
      got 0
      yield 1
      got 1
      g done

  Example with `@greedy` where the "yield 1" step computes before the "got 0":

      >>> from time import sleep
      >>> @greedy
      ... def g():
      ...     for i in range(2):
      ...         print("yield", i)
      ...         yield i
      ...     print("g done")
      ...
      >>> G = g(); sleep(0.1)
      yield 0
      >>> for i in G:
      ...     print("got", repr(i))
      ...     sleep(0.1)
      ...
      yield 1
      got 0
      g done
      got 1

  Example with `@greedy(queue_depth=1)` where the "yield 1" step computes before the "got 0":

      >>> from time import sleep
      >>> @greedy(queue_depth=1)
      ... def g():
      ...     for i in range(3):
      ...         print("yield", i)
      ...         yield i
      ...     print("g done")
      ...
      >>> G = g(); sleep(2)
      yield 0
      yield 1
      >>> for i in G:
      ...     print("got", repr(i))
      ...     sleep(0.1)
      ...
      yield 2
      got 0
      g done
      got 1
      got 2
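The thread-and-queue mechanism described above can be approximated with the standard library alone. The following is a rough sketch under stated assumptions, not the `cs.seq` code: the name `readahead` is hypothetical, it uses a plain `queue.Queue` rather than a `Channel`, so `depth` must be at least 1, and an exception in the source simply ends the output instead of being reported.

```python
import threading
from queue import Queue

_DONE = object()  # sentinel marking the end of the source iterable


def readahead(iterable, depth=1):
    """Hypothetical sketch: run `iterable` in a thread, buffering `depth` items."""
    q = Queue(maxsize=depth)

    def worker():
        try:
            for item in iterable:
                q.put(item)      # blocks while the queue is full
        finally:
            q.put(_DONE)         # always tell the consumer we have finished

    # Start computing immediately, before the consumer asks for anything.
    threading.Thread(target=worker, daemon=True).start()

    def drain():
        while True:
            item = q.get()
            if item is _DONE:
                return
            yield item

    return drain()


squares = list(readahead((n * n for n in range(5)), depth=2))
```

The worker blocks on `q.put()` once the buffer is full, which is what bounds how far ahead of the consumer the source can run.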
- `imerge(*iters, **kw)`: Merge an iterable of ordered iterables in order.

  Parameters:
  - `iters`: an iterable of iterators
  - `reverse`: keyword parameter: if true, yield items in reverse order. This requires the iterables themselves to also be in reversed order.

  This function relies on the source iterables being ordered and their elements being comparable, though slightly misordered iterables (for example, as extracted from web server logs) will produce only slightly misordered results, as the merging is done on the basis of the front elements of each iterable.
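For comparison, the standard library's `heapq.merge` performs the same kind of front-element merge. The snippet below uses it only to illustrate the behaviour described above, including the requirement that reversed merging needs reversed inputs; it makes no claim about how `imerge` itself is implemented, and the sample data is made up.

```python
from heapq import merge

# Two already-ordered inputs, for example timestamps from two log files.
log_a = [1, 4, 9]
log_b = [2, 3, 10]
merged = list(merge(log_a, log_b))

# For a descending merge the inputs must themselves be descending.
merged_desc = list(merge([9, 4, 1], [10, 3, 2], reverse=True))
```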
- `infill(objs: Iterable[~_infill_T], *, obj_keys: Callable[[~_infill_T], ~_infill_K], existing_keys: Callable[[~_infill_T], ~_infill_K], all: Optional[bool] = False) -> Iterable[Tuple[~_infill_T, ~_infill_K]]`: A generator accepting an iterable of objects which yields `(obj,missing_keys)` 2-tuples indicating missing records requiring infill for each object.

  Parameters:
  - `objs`: an iterable of objects
  - `obj_keys`: a callable accepting an object and returning an iterable of the expected keys
  - `existing_keys`: a callable accepting an object and returning an iterable of the existing keys
  - `all`: optional flag, default `False`: if true then yield `(obj,())` for objects with no missing records

  Example:

      for obj, missing_key in infill(objs,...):
          ... infill a record for missing_key ...
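The comparison of expected keys against existing keys might look something like the sketch below. It is illustrative only: the name `find_missing` is hypothetical, and it assumes the `(obj, missing_keys)` reading of the description, yielding one tuple of missing keys per object.

```python
def find_missing(objs, *, obj_keys, existing_keys, all=False):
    """Hypothetical sketch: yield (obj, missing_keys) for each object."""
    for obj in objs:
        have = set(existing_keys(obj))
        missing = tuple(k for k in obj_keys(obj) if k not in have)
        # Objects with nothing missing are skipped unless `all` is true.
        if missing or all:
            yield obj, missing


# Made-up data: every object should have records "x" and "y".
records = {"a": {"x"}, "b": {"x", "y"}}
gaps = list(
    find_missing(
        ["a", "b"],
        obj_keys=lambda obj: ("x", "y"),
        existing_keys=lambda obj: records[obj],
    )
)
everything = list(
    find_missing(
        ["a", "b"],
        obj_keys=lambda obj: ("x", "y"),
        existing_keys=lambda obj: records[obj],
        all=True,
    )
)
```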
- `infill_from_batches(objss: Iterable[Iterable[~_infill_T]], *, obj_keys: Callable[[~_infill_T], ~_infill_K], existing_keys: Callable[[~_infill_T], ~_infill_K], all: Optional[bool] = False, amend_batch: Optional[Callable[[Iterable[~_infill_T]], Iterable[~_infill_T]]] = <function <lambda> at 0x10dd61120>)`: A batched version of `infill(objs)` accepting an iterable of batches of objects which yields `(obj,obj_key)` 2-tuples indicating missing records requiring infill for each object.

  This is aimed at processing batches of objects where it is more efficient to prepare each batch as a whole, such as a Django `QuerySet` which lets the caller make single database queries for a batch of `Model` instances. Thus this function can be used with `cs.djutils.model_batches_qs` for more efficient infill processing.

  Parameters:
  - `objss`: an iterable of iterables of objects
  - `obj_keys`: a callable accepting an object and returning an iterable of the expected keys
  - `existing_keys`: a callable accepting an object and returning an iterable of the existing keys
  - `all`: optional flag, default `False`: if true then yield `(obj,())` for objects with no missing records
  - `amend_batch`: optional callable to amend the batch of objects, for example to amend a `QuerySet` with `.select_related()` or similar
- `isordered(items, reverse=False, strict=False)`: Test whether an iterable is ordered. Note that the iterable is iterated, so this is a destructive test for nonsequences.
- `last(iterable)`: Return the last item from an iterable; raise `IndexError` on empty iterables.
- `not_none(*iterables)`: Filter the iterables for items which are not `None`.
- `onetomany(func)`: A decorator for a method of a sequence to merge the results of passing every element of the sequence to the function, expecting multiple values back.

  Example:

      class X(list):
          @onetomany
          def chars(self, item):
              return item
      strs = X(['Abc', 'Def'])
      all_chars = strs.chars()

- `onetoone(func)`: A decorator for a method of a sequence to merge the results of passing every element of the sequence to the function, expecting a single value back.

  Example:

      class X(list):
          @onetoone
          def lower(self, item):
              return item.lower()
      strs = X(['Abc', 'Def'])
      lower_strs = strs.lower()
- class `range`: A class like the builtin `range` except that it will accept `...` as the stop value, indicating an unbound range. Note that if initialised like a normal range it returns a builtin `range` instance.

  Examples:

  Normal instantiation returns a builtin `range`:

      >>> r = range(9)
      >>> type(r)
      <class 'range'>
      >>> r
      range(0, 9)

  The basic unbound range:

      >>> r0 = range(...)
      >>> type(r0)
      <class 'cs.seq.range'>
      >>> r0
      range(0:...:1)
      >>> str(r0)
      '0...'
      >>> list(r0[:10])
      [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
      >>> 99999999999999999 in r0
      True
      >>> -10 in r0
      False

  An unbound instantiation starting at 9:

      >>> r = range(9,...)
      >>> type(r)
      <class 'cs.seq.range'>
      >>> r
      range(9:...:1)
      >>> r[:10]
      range(9, 19)
      >>> r2 = r[:10]
      >>> type(r2)
      <class 'range'>
      >>> r2
      range(9, 19)
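Where only iteration over an open-ended range is needed, a much cruder stand-in can be built from `itertools.count`. This is a sketch of the idea rather than the class above: the helper `open_range` is hypothetical, and unlike `cs.seq.range` the unbounded result supports neither slicing nor `in` tests.

```python
from itertools import count, islice


def open_range(start=0, stop=..., step=1):
    """Hypothetical sketch: an endless counter when `stop` is `...`."""
    if stop is ...:
        return count(start, step)       # unbounded: iterate forever
    return range(start, stop, step)     # bounded: the builtin range


first_three = list(islice(open_range(9, ...), 3))
bounded = open_range(0, 9)
```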
- class `Seq`: A numeric sequence implemented as a thread safe wrapper for `itertools.count()`.

  A `Seq` is iterable and both iterating and calling it return the next number in the sequence.
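A lock around `itertools.count()` is enough to sketch this behaviour. The class below is an illustration with a hypothetical name, `LockedCounter`, and is not the library's `Seq`.

```python
import itertools
import threading


class LockedCounter:
    """Hypothetical sketch: a counter safe to share between threads."""

    def __init__(self, start=0):
        self._count = itertools.count(start)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance the underlying counter.
        with self._lock:
            return next(self._count)

    # Calling the object behaves the same as next() on it.
    __call__ = __next__


ids = LockedCounter()
a = ids()        # 0
b = next(ids)    # 1
c = ids()        # 2
```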
- `skip_map(func, *iterables, except_types, quiet=False)`: A version of `map()` which will skip items where `func(item)` raises an exception in `except_types`, a tuple of exception types. If a skipped exception occurs a warning will be issued unless `quiet` is true (default `False`).
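The skipping behaviour can be sketched as a small generator. The name `map_skipping` is hypothetical, and issuing the warning through the `warnings` module is an assumption; the library may report skipped items differently.

```python
import warnings


def map_skipping(func, *iterables, except_types, quiet=False):
    """Hypothetical sketch: like map(), but drop items whose call raises."""
    for args in zip(*iterables):
        try:
            result = func(*args)
        except except_types as e:
            if not quiet:
                warnings.warn(f"skipping {args!r}: {e}")
            continue
        # Yield outside the try block so only func() errors are skipped.
        yield result


numbers = list(
    map_skipping(int, ["1", "x", "3"], except_types=(ValueError,), quiet=True)
)
```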
- `splitoff(sq, *sizes)`: Split a sequence into (usually short) prefixes and a tail, for example to construct subdirectory trees based on a UUID.

  Example:

      >>> from uuid import UUID
      >>> uuid = 'd6d9c510-785c-468c-9aa4-b7bda343fb79'
      >>> uu = UUID(uuid).hex
      >>> uu
      'd6d9c510785c468c9aa4b7bda343fb79'
      >>> splitoff(uu, 2, 2)
      ['d6', 'd9', 'c510785c468c9aa4b7bda343fb79']
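The split itself amounts to successive slicing. Here is a minimal sketch using a hypothetical helper, `split_prefixes`, which reproduces the documented result for the UUID above and then joins the parts into a relative path.

```python
from uuid import UUID


def split_prefixes(sq, *sizes):
    """Hypothetical sketch: slice off prefixes of the given sizes, then the tail."""
    parts = []
    offset = 0
    for size in sizes:
        parts.append(sq[offset:offset + size])
        offset += size
    parts.append(sq[offset:])   # whatever remains is the tail
    return parts


uu = UUID('d6d9c510-785c-468c-9aa4-b7bda343fb79').hex
parts = split_prefixes(uu, 2, 2)
subpath = "/".join(parts)       # a two-level directory layout
```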
- class `StatefulIterator`: A trivial iterator which wraps another iterator to expose some tracking state.

  This has 2 attributes:
  - `.it`: the internal iterator which should yield `(item,new_state)`
  - `.state`: the last state value from the internal iterator

  The originating use case is reuse of an iterator by independent calls that are typically sequential, specifically the `.read` method of file like objects. Naive sequential reads require the underlying storage to locate the data on every call, even though the previous call has just performed this task for the previous read. Saving the iterator used by the preceding call allows the next call to pick up directly if the file offset hasn't been modified in the meantime.
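The read-offset use case can be sketched as follows. Both names here are hypothetical: `StateTracker` stands in for the wrapper, and `chunks_with_offsets` is a made-up inner iterator whose state is the offset just past each chunk.

```python
class StateTracker:
    """Hypothetical sketch: unwrap (item, new_state) pairs, keeping the state."""

    def __init__(self, it, state=None):
        self.it = iter(it)
        self.state = state

    def __iter__(self):
        return self

    def __next__(self):
        item, self.state = next(self.it)
        return item


def chunks_with_offsets(data, size):
    # Yield (chunk, offset_after_chunk) pairs.
    offset = 0
    while offset < len(data):
        chunk = data[offset:offset + size]
        offset += len(chunk)
        yield chunk, offset


reader = StateTracker(chunks_with_offsets(b"abcdef", 4), state=0)
chunk1 = next(reader)
offset1 = reader.state
chunk2 = next(reader)
offset2 = reader.state
```

A caller can compare `reader.state` with the offset it is about to read from and reuse the saved iterator when they match.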
- `tee(iterable, *Qs)`: A generator yielding the items from an iterable which also copies those items to a series of queues.

  Parameters:
  - `iterable`: the iterable to copy
  - `Qs`: the queues, objects with a `.put` method.

  Note: the item is `.put` onto every queue before being yielded from this generator.
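The ordering in the note above (put first, yield second) can be seen in a minimal sketch. The name `tee_to_queues` is hypothetical, and standard `queue.Queue` objects stand in for whatever queue type the caller supplies.

```python
from queue import Queue


def tee_to_queues(iterable, *queues):
    """Hypothetical sketch: copy each item to every queue, then yield it."""
    for item in iterable:
        for q in queues:
            q.put(item)
        yield item


q1, q2 = Queue(), Queue()
seen = list(tee_to_queues([1, 2], q1, q2))
copied_1 = [q1.get_nowait(), q1.get_nowait()]
copied_2 = [q2.get_nowait(), q2.get_nowait()]
```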
- `the(iterable, context=None)`: Returns the first element of an iterable, but requires there to be exactly one.
- class `TrackingCounter`: A wrapper for a counter which can be incremented and decremented.

  A facility is provided to wait for the counter to reach a specific value. The `.inc` and `.dec` methods also accept a `tag` argument to keep individual counts based on the tag to aid debugging.

  TODO: add `strict` option to error and abort if any counter tries to go below zero.
TrackingCounter.__init__(self, value=0, name=None, lock=None):
Initialise the counter to value (default 0) with the optional name.
TrackingCounter.check(self):
Internal consistency check.
TrackingCounter.dec(self, tag=None):
Decrement the counter.
Wake up any threads waiting for its new value.
TrackingCounter.inc(self, tag=None):
Increment the counter.
Wake up any threads waiting for its new value.
TrackingCounter.wait(self, value):
Wait for the counter to reach the specified value.
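The wake-on-change behaviour of `inc`, `dec` and `wait` is the classic condition variable pattern. The sketch below shows that pattern with a hypothetical class, `WaitableCounter`; it omits the `tag` bookkeeping and consistency check, and is not the library's implementation.

```python
import threading


class WaitableCounter:
    """Hypothetical sketch: a counter whose value other threads can wait for."""

    def __init__(self, value=0):
        self.value = value
        self._cond = threading.Condition()

    def inc(self):
        with self._cond:
            self.value += 1
            self._cond.notify_all()   # wake any waiters to recheck the value

    def dec(self):
        with self._cond:
            self.value -= 1
            self._cond.notify_all()

    def wait(self, value):
        # Block until the counter equals `value`.
        with self._cond:
            self._cond.wait_for(lambda: self.value == value)


counter = WaitableCounter()
counter.inc()                                   # one task outstanding
worker = threading.Thread(target=counter.dec)   # the task finishing
worker.start()
counter.wait(0)                                 # returns once the task is done
worker.join()
```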
- `unrepeated(it, seen=None, signature=None)`: A generator yielding items from the iterable `it` with no repetitions.

  Parameters:
  - `it`: the iterable to process
  - `seen`: an optional setlike container supporting `in` and `.add()`
  - `signature`: an optional signature function for items from `it` which produces the value to compare to recognise repeated items; its values are stored in the `seen` set

  The default `signature` function is equality; the items themselves are stored in `seen` and compared. This requires the items to be hashable and support equality tests. The same applies to whatever values the `signature` function produces.

  Another common signature is identity: `id`, useful for traversing a graph which may have cycles.

  Since `seen` accrues all the signature values for yielded items, it will generally grow monotonically as iteration proceeds. If the items are complex or large it is well worth providing a signature function even if the items themselves can be used in a set.
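The role of `seen` and `signature` can be sketched in a few lines. The generator name `without_repeats` is hypothetical and the sample data is made up; the parameters mirror the ones documented above.

```python
def without_repeats(it, seen=None, signature=None):
    """Hypothetical sketch: yield each item whose signature is new."""
    if seen is None:
        seen = set()
    for item in it:
        # By default the item itself is its own signature.
        sig = item if signature is None else signature(item)
        if sig in seen:
            continue
        seen.add(sig)
        yield item


plain = list(without_repeats([1, 2, 1, 3, 2]))
caseless = list(without_repeats(["a", "A", "b"], signature=str.lower))
```

Passing `signature=id` gives the identity-based variant mentioned above, which only needs to store integers regardless of how large the items are.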
Release Log
Release 20251231.1: range: support range[type] returning a GenericAlias.
Release 20251231: range: support making a range like range[1::2] for the odd numbers.
Release 20251230:
- New not_none(*iterables) generator yielding items which are not none, handy in * expansions.
- New range() class accepting ... for the stop.
Release 20250914: ClonedIterator: it may be an Iterable, not just an Iterator.
Release 20250801: ClonedIterator: bugfix direct iteration, noticed by @Matiss.
Release 20250724: New ClonedIterator class to provide a reiterable clone of an iterator.
Release 20250306: New infill() and infill_from_batches() generators for identifying missing records requiring an infill.
Release 20250103: New skip_map(func, *iterables, except_types, quiet=False) generator function, like map() but skipping certain exceptions.
Release 20221118: Small doc improvement.
Release 20220530: Seq: calling a Seq is like next(seq).
Release 20210924: New greedy(iterable) or @greedy(generator_function) to let generators precompute.
Release 20210913: New unrepeated() generator removing duplicates from an iterable.
Release 20201025: New splitoff() function to split a sequence into (usually short) prefixes and a tail.
Release 20200914: New common_prefix_length and common_suffix_length for comparing prefixes and suffixes of sequences.
Release 20190103: Documentation update.
Release 20190101:
- New and UNTESTED class StatefulIterator to associate some externally visible state with an iterator.
- Seq: accept optional `lock` parameter.
Release 20171231:
- Python 2 backport for imerge().
- New tee function to duplicate an iterable to queues.
- Function isordered() is now a test instead of an assertion.
- Drop NamedTuple, NamedTupleClassFactory (unused).
Release 20160918:
- New function isordered() to test ordering of a sequence.
- imerge: accept new `reverse` parameter for merging reversed iterables.
Release 20160828: Modify DISTINFO to say "install_requires", fixes pypi requirements.
Release 20160827: TrackingCounter: accept presupplied lock object. Python 3 exec fix.
Release 20150118: metadata update
Release 20150111: Initial PyPI release.
File details
Details for the file cs_seq-20251231.1.tar.gz.
File metadata
- Download URL: cs_seq-20251231.1.tar.gz
- Upload date:
- Size: 14.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3ed3d4b1abf9c83305da0f1a169511f6bcd071139c5c878b4fb8263b144da0bf |
| MD5 | edc4c1225c14d5d9758cb50f47661fc1 |
| BLAKE2b-256 | b4c52209304167e877fc9357c245b5ad6266c757e9716da14377116cae2f1d70 |
File details
Details for the file cs_seq-20251231.1-py3-none-any.whl.
File metadata
- Download URL: cs_seq-20251231.1-py3-none-any.whl
- Upload date:
- Size: 14.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5e22b0ff04bbb38c6d941a26c30155d35f8bfe8e1842ac5ab3d282b2c8613929 |
| MD5 | febc63a837c8dfcf9c9ae153ecb30237 |
| BLAKE2b-256 | 7f22b8f4bd78f46bf772b13b997a8fdaa042426ccfbb610d23b4455ed0f0bbfd |