A library for transparent transformation of indexable containers (lists, etc.)

Project description

SeqTools

SeqTools facilitates the manipulation of datasets and the evaluation of a transformation pipeline. Some of the provided functionalities include: mapping element-wise operations, reordering, reindexing, concatenation, joining, slicing, minibatching, etc.

To improve ease of use, SeqTools manipulates list-like objects, otherwise known as sequences (objects with a length that support integer- or slice-based indexing).

Manipulating a dataset as a whole can be slow and resource/memory intensive. To circumvent this issue, SeqTools implements on-demand evaluation under the hood: operations and transformations on a dataset are only applied to individual items when they are actually accessed. This is particularly convenient for prototyping.
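The idea of on-demand evaluation can be illustrated with a minimal sketch in plain Python. The `LazyMap` class and the `square` function below are hypothetical names chosen for this illustration; this is not SeqTools' actual implementation, only a small model of the behaviour described above: the wrapped function runs when an item is read, not when the pipeline is defined.

```python
from collections.abc import Sequence


class LazyMap(Sequence):
    """Sequence view that applies `func` to an item only when it is read.

    Hypothetical illustration, not part of SeqTools.
    """

    def __init__(self, func, source):
        self.func = func
        self.source = source

    def __len__(self):
        return len(self.source)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing returns another lazy view; nothing is computed yet.
            return LazyMap(self.func, self.source[index])
        return self.func(self.source[index])


calls = []


def square(x):
    calls.append(x)  # record which items were actually evaluated
    return x * x


squares = LazyMap(square, list(range(1000)))
value = squares[3]    # evaluates a single item
tail = squares[990:]  # still lazy: a view over the last 10 items
```

After these statements only one call to `square` has been made, even though the view nominally covers 1000 items.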

When the time comes to move from prototyping to execution, the list-like container interface makes serial evaluation straightforward. SeqTools also provides simple helpers to dispatch work between multiple workers (threads or processes).
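As a rough illustration of the difference between serial evaluation and dispatching work to several workers, the sketch below uses only the Python standard library (`concurrent.futures`). It does not use SeqTools' own helpers, whose interface is described in the library documentation; `slow_square` is a hypothetical stand-in for an expensive transformation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(x):
    time.sleep(0.01)  # stand-in for an expensive transformation
    return x * x


data = list(range(20))

# Serial evaluation: read the items one after the other by index.
serial = [slow_square(data[i]) for i in range(len(data))]

# Threaded evaluation: items are computed by 4 worker threads;
# Executor.map returns the results in the order of the inputs.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(slow_square, data))
```

Both approaches yield the same values in the same order; threads mainly help when the transformation spends its time waiting (I/O, sleeping) rather than running pure Python code.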

SeqTools was originally designed for data science, more precisely for the data preprocessing stages. Given the experimental nature of this usage, on-demand execution is made as transparent as possible to users through fault-tolerant functions and insightful error reporting. Moreover, the internal code is kept concise and clearly commented to facilitate error tracing through a failing transformation pipeline.

Example

>>> import time
>>> import seqtools
>>> def f1(x):
...     return x + 1
...
>>> def f2(x):  # slow and memory heavy transformation
...     time.sleep(.01)
...     return [x for _ in range(500)]
...
>>> def f3(x):
...     return sum(x) / len(x)
...
>>> data = list(range(1000))

Without delayed evaluation, defining the pipeline and reading values looks like so:

>>> tmp1 = [f1(x) for x in data]
>>> tmp2 = [f2(x) for x in tmp1] # takes 10 seconds and a lot of memory
>>> res = [f3(x) for x in tmp2]
>>> print(res[2])
3.0
>>> print(max(tmp2[2]))  # requires storing 499,500 unneeded values alongside
3

With seqtools:

>>> tmp1 = seqtools.smap(f1, data)
>>> tmp2 = seqtools.smap(f2, tmp1)
>>> res = seqtools.smap(f3, tmp2) # no computations so far
>>> print(res[2]) # takes 0.01 seconds
3.0
>>> print(max(tmp2[2])) # easy access to intermediate results
3

Batteries included!

The library comes with a set of functions to manipulate sequences:

concatenate
batch
gather
prefetch
interleaving

and others (suggestions are also welcome).
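To give a feel for how such helpers can stay lazy, here is a minimal pure-Python sketch of a batching view. The `LazyBatches` class is a hypothetical illustration and makes no claim about the signature or behaviour of the library's own `batch` function; it only assumes that the source supports `len()` and slicing.

```python
from collections.abc import Sequence


class LazyBatches(Sequence):
    """View that groups consecutive items of `source` into batches on access.

    Hypothetical illustration, not part of SeqTools.
    """

    def __init__(self, source, size):
        if size <= 0:
            raise ValueError("size must be positive")
        self.source = source
        self.size = size

    def __len__(self):
        # Ceiling division: the last batch may be shorter than `size`.
        return -(-len(self.source) // self.size)

    def __getitem__(self, index):
        if not isinstance(index, int):
            raise TypeError("only integer indexing is supported in this sketch")
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.size
        # The slice is taken only now, when the batch is requested.
        return self.source[start:start + self.size]


batches = LazyBatches(list(range(10)), 4)
```

Here `batches[0]` gives `[0, 1, 2, 3]` and `batches[-1]` gives the shorter final batch `[8, 9]`; no batch is materialised before it is indexed.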

Installation

pip install seqtools

Documentation

The documentation is hosted at https://seqtools-doc.readthedocs.io.

Contributing and Support

Use the issue tracker to request features, propose improvements or report issues. For questions regarding usage, please send an email.

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

SeqTools-1.1.0-cp39-cp39-manylinux2014_x86_64.whl (45.6 kB), CPython 3.9
SeqTools-1.1.0-cp39-cp39-manylinux1_x86_64.whl (45.6 kB), CPython 3.9
SeqTools-1.1.0-cp38-cp38-manylinux2014_x86_64.whl (46.3 kB), CPython 3.8
SeqTools-1.1.0-cp38-cp38-manylinux1_x86_64.whl (46.3 kB), CPython 3.8
SeqTools-1.1.0-cp37-cp37m-manylinux2014_x86_64.whl (46.2 kB), CPython 3.7m
SeqTools-1.1.0-cp37-cp37m-manylinux1_x86_64.whl (46.2 kB), CPython 3.7m
SeqTools-1.1.0-cp36-cp36m-manylinux2014_x86_64.whl (45.3 kB), CPython 3.6m
SeqTools-1.1.0-cp36-cp36m-manylinux1_x86_64.whl (45.3 kB), CPython 3.6m
