convtools

convtools is a python library to declaratively define fast conversions from python objects to python objects, including processing collections and doing complex aggregations.

Description

convtools gets its speed from generating source code and compiling conversion functions at runtime, so the resulting functions contain no generic overhead such as superfluous loops or conditionals.

This lets you follow the DRY principle by storing and reusing logic at the level of Python expressions, while gen_converter produces compiled code that makes no attempt to be DRY and is highly specialized for the specific need.
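To make the code-generation idea concrete, here is a minimal standard-library sketch. It is not how convtools is implemented; the helper `make_getter` and the generated function name `converter` are hypothetical names chosen for illustration. It builds source text specialized for a fixed list of keys and compiles it with `exec`, so the resulting function contains no loop over keys at call time.

```python
def make_getter(keys):
    """Generate and compile a function that extracts fixed keys from a dict.

    Illustrative only: assumes ``keys`` is a non-empty list of dict keys.
    """
    # Unroll the keys into one tuple expression instead of looping at call time.
    items = ", ".join(f"data_[{key!r}]" for key in keys)
    source = f"def converter(data_):\n    return ({items},)\n"
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["converter"], source


get_name_age, generated_source = make_getter(["name", "age"])
print(generated_source)  # the specialized source that was compiled
print(get_name_age({"name": "Ann", "age": 30, "city": "Oslo"}))
```

The definition (`make_getter`) stays generic and reusable, while each compiled function is specific to one set of keys.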

Pipes and labels make it possible to define multiple data-processing pipelines, including branching and merging them.

Conversions are not limited to simple data transformations: there are GroupBy and Aggregate conversions with many useful reducers:

from the common Sum and Max

to the less widely supported First/Last and Array/ArrayDistinct

to DictSum-like ones (for nested aggregation) and MaxRow/MinRow (for finding the object with the max/min value for further processing)
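As a rough illustration of what reducers like these compute, the following plain-Python sketch builds the same kinds of results by hand. It does not use the convtools API; the sample rows, the field names and the `summarize` helper are made up for the example, and the exact semantics of the library's reducers should be checked against its documentation.

```python
rows = [
    {"dept": "a", "name": "x", "amount": 10},
    {"dept": "a", "name": "y", "amount": 30},
    {"dept": "b", "name": "z", "amount": 20},
    {"dept": "a", "name": "x", "amount": 5},
]


def summarize(rows):
    """Group rows by ``dept`` and compute several aggregates in one pass."""
    result = {}
    for row in rows:
        group = result.setdefault(
            row["dept"],
            {
                "sum": 0,               # Sum-like
                "first": row["name"],   # First-like
                "last": None,           # Last-like
                "names_distinct": [],   # ArrayDistinct-like
                "sum_by_name": {},      # DictSum-like (nested aggregation)
                "max_row": row,         # MaxRow-like (whole row with max amount)
            },
        )
        group["sum"] += row["amount"]
        group["last"] = row["name"]
        if row["name"] not in group["names_distinct"]:
            group["names_distinct"].append(row["name"])
        by_name = group["sum_by_name"]
        by_name[row["name"]] = by_name.get(row["name"], 0) + row["amount"]
        if row["amount"] > group["max_row"]["amount"]:
            group["max_row"] = row
    return result


summary = summarize(rows)
print(summary["a"])
```

Writing this by hand for every report is repetitive, which is the kind of code a declarative GroupBy/Aggregate definition is meant to replace.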

Every conversion:

contains the information of how to transform an input
can be piped into another conversion (same as wrapping)
can be labeled to be reused further in the conversions chain
has a method gen_converter returning a function compiled at runtime
is debuggable with pdb despite being compiled at runtime, because the generated source is added to linecache.
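The linecache point can be sketched with the standard library alone. This is an assumption about the general technique, not a copy of what convtools does internally; `compile_with_source` and the pseudo-filename `<generated_converter_1>` are hypothetical. Registering the generated text in `linecache.cache` under the same filename passed to `compile` lets tracebacks, and debuggers that read source through linecache, display the generated lines.

```python
import linecache
import traceback


def compile_with_source(source, filename):
    """Compile ``source`` and register its text so tools can display it."""
    # Entry layout is (size, mtime, lines, fullname); mtime=None keeps
    # linecache.checkcache() from evicting an entry that has no real file.
    linecache.cache[filename] = (
        len(source),
        None,
        source.splitlines(True),
        filename,
    )
    namespace = {}
    exec(compile(source, filename, "exec"), namespace)
    return namespace


source = "def converter(data_):\n    return data_['missing']\n"
converter = compile_with_source(source, "<generated_converter_1>")["converter"]

try:
    converter({})
except KeyError:
    # The traceback shows the generated source line, not only the filename.
    print(traceback.format_exc())
```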

Installation:

pip install convtools

An example:

import re
from itertools import chain

# the suggested way of importing convtools
from convtools import conversion as c

# Let's say we need to count words across all lines of several text files
input_data = [
    "war-and-peace-1.txt",
    "war-and-peace-2.txt",
    "war-and-peace-3.txt",
    "war-and-peace-4.txt",
]
def read_file(filename):
    with open(filename) as f:
        for line in f:
            yield line

# iterate over the input filenames and call ``read_file`` on each one;
# the result is a generator of generators of lines
extract_strings = c.generator_comp(
    c.call_func(read_file, c.this())
)


# 1. make ``re`` pattern available to the code to be generated
# 2. call ``finditer`` method of the pattern and pass the string
#    as an argument
# 3. pass the result to the next conversion
# 4. iterate results, call ``.group()`` method of each re.Match
#    and call ``.lower()`` on each result
split_words = (
    c.naive(re.compile(r'\w+')).call_method("finditer", c.this())
    .pipe(
        c.generator_comp(
            c.this().call_method("group").call_method("lower")
        )
    )
)

# the flattened output of ``extract_strings`` is an iterable of strings,
# so we iterate it and pass each item to the ``split_words`` conversion
vectorized_split_words = c.generator_comp(
    c.this().pipe(
        split_words
    )
)

# flattening the result of ``vectorized_split_words``, which is
# a generator of generators of strings
flatten = c.call_func(
    chain.from_iterable,
    c.this(),
)

# aggregate the input, the result is a single dict
# words are keys, values are count of words
dict_word_to_count = c.aggregate(
    c.reduce(
        c.ReduceFuncs.DictCount,
        (c.this(), c.this())
    )
)

# take top N words by:
#  - call ``.items()`` method of the dict (the result of the aggregate)
#  - pass the result to ``sorted``
#  - take the slice, using input argument named ``top_n``
#  - cast to a dict
take_top_n = (
    c.this().call_method("items")
    .pipe(sorted, key=lambda t: t[1], reverse=True)
    .pipe(c.this()[:c.input_arg("top_n")])
    .as_type(dict)
)

# the resulting pipeline is pretty self-descriptive, except the ``c.if_``
# part, which checks the condition (first argument),
# and returns the 2nd if True OR the 3rd (input data by default) otherwise
pipeline = (
    extract_strings
    .pipe(flatten)
    .pipe(vectorized_split_words)
    .pipe(flatten)
    .pipe(dict_word_to_count)
    .pipe(
        c.if_(
            c.input_arg("top_n").is_not(None),
            c.this().pipe(take_top_n),
        )
    )
# Define the resulting converter function signature.
# In fact this isn't necessary if you don't need to specify default values
).gen_converter(debug=True, signature="data_, top_n=None")

# check the speed yourself :)
# e.g. take a book in txt format and tune the ``extract_strings``
# conversion as needed
pipeline(input_data, top_n=3)
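For cross-checking the converter's output on small inputs, a hand-written equivalent can be useful. The sketch below uses only the standard library; `count_words` and `sample_lines` are hypothetical names, and it takes an iterable of lines directly instead of filenames so it runs without any files on disk.

```python
import re
from collections import Counter

WORD_RE = re.compile(r"\w+")


def count_words(lines, top_n=None):
    """Count lowercased words across lines; optionally keep the top N."""
    counts = Counter(
        match.group().lower()
        for line in lines
        for match in WORD_RE.finditer(line)
    )
    if top_n is None:
        return dict(counts)
    # most_common() sorts by count, highest first; ties keep first-seen order
    return dict(counts.most_common(top_n))


sample_lines = ["War and peace", "war AND Peace and war"]
print(count_words(sample_lines, top_n=2))  # {'war': 3, 'and': 3}
```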

Documentation

convtools on Read the Docs

Release history

1.17.0

Jan 20, 2026

1.16.0

Jan 20, 2026

1.15.0

Jan 18, 2026

1.14.8

Oct 22, 2025

1.14.7

Sep 21, 2025

1.14.6

Aug 10, 2025

1.14.5

Jul 30, 2025

1.14.4

Mar 18, 2025

1.14.3

Sep 16, 2024

1.14.2

Sep 4, 2024

1.14.1

Sep 1, 2024

1.14.0

Sep 1, 2024

1.13.2

Aug 29, 2024

1.13.1

Aug 27, 2024

1.13.0 yanked

Aug 22, 2024

Reason this release was yanked:

aggregate CSE optimizer is bugged

1.12.2

Aug 22, 2024

1.12.1

Jul 14, 2024

1.11.0

Jun 30, 2024

1.10.1

Jun 11, 2024

1.10.0

Jun 7, 2024

1.9.0

Jun 5, 2024

1.8.0

Mar 6, 2024

1.7.0

Feb 21, 2024

1.6.0

Feb 6, 2024

1.5.1

Oct 8, 2023

1.5.0

Jul 30, 2023

1.4.0

Jun 28, 2023

1.3.0

Apr 11, 2023

1.2.0

Apr 10, 2023

1.1.2

Mar 27, 2023

1.1.1

Mar 27, 2023

1.1.0

Mar 24, 2023

1.0.0

Mar 23, 2023

0.42.4

Mar 22, 2023

0.42.3

Mar 15, 2023

0.42.2

Mar 14, 2023

0.42.1

Feb 20, 2023

0.42.0

Jan 29, 2023

0.41.0

Jan 16, 2023

0.40.2

Dec 25, 2022

0.40.1

Dec 19, 2022

0.40.0

Dec 18, 2022

0.39.0

Dec 6, 2022

0.38.0

Oct 26, 2022

0.37.0

Sep 29, 2022

0.36.0

Sep 20, 2022

0.35.0

Sep 18, 2022

0.34.0

Jul 26, 2022

0.33.2

Jul 22, 2022

0.33.1

Jul 14, 2022

0.33.0

Jul 14, 2022

0.32.0

Jul 12, 2022

0.31.0

Jul 6, 2022

0.30.0

Jul 6, 2022

0.29.0

Jul 5, 2022

0.28.0

Jul 2, 2022

0.27.0

Jul 1, 2022

0.26.0

Jun 30, 2022

0.25.2

Jun 24, 2022

0.25.1

Jun 24, 2022

0.25.0

Jun 22, 2022

0.24.1

May 29, 2022

0.24.0

May 29, 2022

0.23.3

Mar 11, 2022

0.23.2

Mar 10, 2022

0.23.1

Feb 23, 2022

0.23.0

Feb 22, 2022

0.22.0

Jan 2, 2022

0.21.0

Dec 19, 2021

0.20.2

Dec 2, 2021

0.20.1

Nov 29, 2021

0.20.0 yanked

Nov 28, 2021

Reason this release was yanked:

inconsistent API

0.19.0

Oct 28, 2021

0.18.0

Oct 24, 2021

0.17.0

Oct 14, 2021

0.16.0

Oct 12, 2021

0.15.4

Sep 23, 2021

0.15.3

Sep 19, 2021

0.15.2

Sep 17, 2021

0.15.1

Aug 8, 2021

0.15.0

Aug 2, 2021

0.14.1

Jul 12, 2021

0.14.0

Jun 27, 2021

0.13.4

Jun 20, 2021

0.13.3

Jun 14, 2021

0.13.2

May 27, 2021

0.13.1

May 23, 2021

0.13.0

May 16, 2021

0.12.1

May 13, 2021

0.12.0

May 10, 2021

0.11.2

May 8, 2021

0.11.1

May 7, 2021

0.11.0

May 6, 2021

0.10.0

Apr 28, 2021

0.9.4

Apr 27, 2021

0.9.3

Apr 11, 2021

0.9.2

Mar 28, 2021

0.9.1

Mar 28, 2021

0.9.0

Mar 24, 2021

0.8.0

Jan 3, 2021

0.7.2

Nov 12, 2020

0.7.1

Jul 12, 2020

0.7.0

Jun 14, 2020

0.6.1

May 18, 2020

0.6.0

May 17, 2020

0.5.3

Mar 30, 2020

0.5.2

Mar 29, 2020

0.5.1

Mar 26, 2020

0.5.0

Mar 23, 2020

0.4.0

Mar 19, 2020

0.3.3

Mar 6, 2020

0.3.2

Mar 5, 2020

This version

0.3.1

Mar 5, 2020

0.3.0

Mar 1, 2020

0.2.3

Feb 27, 2020

0.2.2

Feb 25, 2020

0.2.1

Feb 24, 2020

0.2.0

Feb 23, 2020

0.1.1

Feb 18, 2020

0.1.0

Feb 18, 2020

Download files


Source Distribution

convtools-0.3.1.tar.gz (62.6 kB)

Uploaded Mar 5, 2020 Source

File details

Details for the file convtools-0.3.1.tar.gz.

File metadata

Download URL: convtools-0.3.1.tar.gz
Upload date: Mar 5, 2020
Size: 62.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.42.1 CPython/3.7.6

File hashes

Hashes for convtools-0.3.1.tar.gz
Algorithm	Hash digest
SHA256	`d7ba3438463ae9c4cb8c42aab715c780cae77aa703b66f44ae44b430d75a1e26`
MD5	`b01475c98545dd9823ae283ad649aba4`
BLAKE2b-256	`7066aedf6432579d31f5df5bcd524c0365e3e3617a635a5340047f011f23322a`


convtools 0.3.1
