Skip to main content

A library for transparent transformation of indexable containers (lists, etc.)

Project description

PyPi package CircleCI Continuous integration Documentation Code quality analysis Tests coverage Citable paper

SeqTools

SeqTools extends the functionalities of itertools to indexable (list-like) objects. Some of the provided functionalities include: element-wise function mapping, reordering, reindexing, concatenation, joining, slicing, minibatching, etc.

SeqTools functions implement on-demand evaluation under the hood: operations and transformations are only applied to individual items when they are actually accessed. A simple but powerful prefetch function is also provided to quickly evaluate elements.

SeqTools originally targets data science, more precisely the data preprocessing stages. Being aware of the experimental nature of this usage, on-demand execution is made as transparent as possible by providing fault-tolerant functions and insightful error message.

Example

Example

>>> def f1(x):
...     return x + 1
...
>>> def f2(x):  # slow and memory heavy transformation
...     time.sleep(.01)
...     return [x for _ in range(500)]
...
>>> def f3(x):
...     return sum(x) / len(x)
...
>>> data = list(range(1000))

Without seqtools, defining the pipeline and reading values looks like so:

>>> tmp1 = [f1(x) for x in data]
>>> tmp2 = [f2(x) for x in tmp1]  # takes 10 seconds and a lot of memory
>>> res = [f3(x) for x in tmp2]
>>> print(res[2])
3.0
>>> print(max(tmp2[2]))  # requires to store 499 500 useless values along
3

With seqtools:

>>> tmp1 = seqtools.smap(f1, data)
>>> tmp2 = seqtools.smap(f2, tmp1)
>>> res = seqtools.smap(f3, tmp2)  # no computations so far
>>> print(res[2])  # takes 0.01 seconds
3.0
>>> print(max(tmp2[2]))  # easy access to intermediate results
3

Batteries included!

The library comes with a set of functions to manipulate sequences:

concatenate
batch
gather
prefetch
interleaving
uniter

and others (suggestions are also welcome).

Installation

pip install seqtools

Documentation

The documentation is hosted at https://seqtools-doc.readthedocs.io.

Contributing and Support

Use the issue tracker to request features, propose improvements or report issues. For questions regarding usage, please send an email.

Project details


Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Huawei Huawei PSF Sponsor Microsoft Microsoft PSF Sponsor NVIDIA NVIDIA PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page