convtools

convtools lets you define and reuse conversions for processing collections and CSV tables, including complex aggregations and joins.

convtools is a Python library to declaratively define data transforms:

convtools.conversion - pipelines for processing collections, doing complex aggregations and joins.
convtools.contrib.tables - stream processing of table-like data (e.g. CSV)

Why would you need this?

you prefer a declarative approach
you love functional programming
you believe that Python is high-level enough not to make you write aggregations and joins by hand
you need to serialize/validate objects
you need to dynamically define transforms (including at runtime)
you like the idea of having something write ad hoc code for you :)

Installation:

pip install convtools

Conversions - data transforms, aggregations, joins

# pip install convtools

from convtools import conversion as c

input_data = [{"StoreID": " 123", "Quantity": "123"}]

# define a conversion (sometimes you may want to do this dynamically)
#  takes an iterable, turns each item into a dict, stops before the first
#  one with quantity >= 1000, chunks the rest by id (at most 1000 items per
#  chunk) and returns a list of chunks
conversion = (
    c.iter(
        {
            "id": c.item("StoreID").call_method("strip"),
            "quantity": c.item("Quantity").as_type(int),
        }
    )
    .take_while(c.item("quantity") < 1000)
    .pipe(
        c.chunk_by(c.item("id"), size=1000)
    )
    .as_type(list)
)

# compile the conversion into an ad hoc function and run it
# (debug=True prints the generated code)
converter = conversion.gen_converter(debug=True)
converter(input_data)

# OR in case of a one-shot use
conversion.execute(input_data)
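For comparison, here is a rough hand-written counterpart of the pipeline above, using only the standard library. It is a sketch, not the code convtools generates. The function name `convert_by_hand` and its parameters are made up for illustration, and it assumes that `chunk_by` starts a new chunk whenever the id changes or the chunk reaches the size limit.

```python
from itertools import groupby, islice, takewhile


def convert_by_hand(rows, max_quantity=1000, chunk_size=1000):
    """Hand-written counterpart of the pipeline above (illustrative only)."""
    # strip the id and cast the quantity, lazily, one row at a time
    cleaned = (
        {"id": row["StoreID"].strip(), "quantity": int(row["Quantity"])}
        for row in rows
    )
    # stop before the first row whose quantity reaches the limit
    limited = takewhile(lambda row: row["quantity"] < max_quantity, cleaned)

    chunks = []
    # consecutive rows with the same id form a group ...
    for _, group in groupby(limited, key=lambda row: row["id"]):
        # ... and each group is cut into pieces of at most chunk_size rows
        while True:
            chunk = list(islice(group, chunk_size))
            if not chunk:
                break
            chunks.append(chunk)
    return chunks


result = convert_by_hand([{"StoreID": " 123", "Quantity": "123"}])
print(result)  # [[{'id': '123', 'quantity': 123}]]
```

Every step that the conversion declares in one expression (item access, casting, `take_while`, chunking) has to be spelled out and kept in sync by hand here.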

from convtools import conversion as c


def test_doc__index_intro():

    # ======== #
    # GROUP BY #
    # ======== #
    input_data = [
        {"a": 5, "b": "foo"},
        {"a": 10, "b": "foo"},
        {"a": 10, "b": "bar"},
        {"a": 10, "b": "bar"},
        {"a": 20, "b": "bar"},
    ]

    conv = (
        c.group_by(c.item("b"))
        .aggregate(
            {
                "b": c.item("b"),
                "a_first": c.ReduceFuncs.First(c.item("a")),
                "a_max": c.ReduceFuncs.Max(c.item("a")),
            }
        )
        .gen_converter(debug=True)
    )

    assert conv(input_data) == [
        {"b": "foo", "a_first": 5, "a_max": 10},
        {"b": "bar", "a_first": 10, "a_max": 20},
    ]

    # ========= #
    # AGGREGATE #
    # ========= #
    conv = c.aggregate(
        {
            # list of "a" values where "b" equals "bar"
            "a": c.ReduceFuncs.Array(c.item("a"), where=c.item("b") == "bar"),
            # "b" value of a row where "a" has Max value
            "b": c.ReduceFuncs.MaxRow(
                c.item("a"),
            ).item("b", default=None),
        }
    ).gen_converter(debug=True)

    assert conv(input_data) == {"a": [10, 10, 20], "b": "bar"}

    # ==== #
    # JOIN #
    # ==== #
    collection_1 = [
        {"id": 1, "name": "Nick"},
        {"id": 2, "name": "Joash"},
        {"id": 3, "name": "Bob"},
    ]
    collection_2 = [
        {"ID": "3", "age": 17, "country": "GB"},
        {"ID": "2", "age": 21, "country": "US"},
        {"ID": "1", "age": 18, "country": "CA"},
    ]
    input_data = (collection_1, collection_2)

    conv = (
        c.join(
            c.item(0),
            c.item(1),
            c.and_(
                c.LEFT.item("id") == c.RIGHT.item("ID").as_type(int),
                c.RIGHT.item("age") >= 18,
            ),
            how="left",
        )
        .pipe(
            c.list_comp(
                {
                    "id": c.item(0, "id"),
                    "name": c.item(0, "name"),
                    "age": c.item(1, "age", default=None),
                    "country": c.item(1, "country", default=None),
                }
            )
        )
        .gen_converter(debug=True)
    )

    assert conv(input_data) == [
        {"id": 1, "name": "Nick", "age": 18, "country": "CA"},
        {"id": 2, "name": "Joash", "age": 21, "country": "US"},
        {"id": 3, "name": "Bob", "age": None, "country": None},
    ]
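To make clear what the GROUP BY and JOIN conversions above compute, the sketch below writes the same logic by hand in plain Python. The helper names `group_by_b` and `left_join` are hypothetical, and this is not the code convtools generates. It only reproduces the results asserted in the test above.

```python
def group_by_b(rows):
    """Group rows by "b", keeping the first and the max "a" per group."""
    groups = {}  # dicts keep insertion order, so groups stay in first-seen order
    for row in rows:
        agg = groups.get(row["b"])
        if agg is None:
            groups[row["b"]] = {
                "b": row["b"],
                "a_first": row["a"],
                "a_max": row["a"],
            }
        elif row["a"] > agg["a_max"]:
            agg["a_max"] = row["a"]
    return list(groups.values())


def left_join(left_rows, right_rows):
    """Left join on id == int(ID), matching only right rows with age >= 18."""
    right_by_id = {}
    for right in right_rows:
        if right["age"] >= 18:
            right_by_id.setdefault(int(right["ID"]), []).append(right)

    joined = []
    for left in left_rows:
        # an unmatched left row is still emitted, padded with None values
        for right in right_by_id.get(left["id"], [None]):
            joined.append(
                {
                    "id": left["id"],
                    "name": left["name"],
                    "age": right["age"] if right else None,
                    "country": right["country"] if right else None,
                }
            )
    return joined


people = [{"id": 1, "name": "Nick"}, {"id": 3, "name": "Bob"}]
details = [
    {"ID": "3", "age": 17, "country": "GB"},
    {"ID": "1", "age": 18, "country": "CA"},
]
print(left_join(people, details))
```

Bob has a matching ID, but the age condition is part of the join condition, so his row comes out with `None` for age and country rather than being dropped.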

What reducers are supported by aggregations?

Built-in ones, exposed like c.ReduceFuncs.Sum:

Sum
SumOrNone
Max
MaxRow
Min
MinRow
Count
CountDistinct
First
Last
Average
Median
Percentile - c.ReduceFuncs.Percentile(95.0, c.item("x"))
Mode
TopK - c.ReduceFuncs.TopK(3, c.item("x"))
Array
ArrayDistinct
ArraySorted - c.ReduceFuncs.ArraySorted(c.item("x"), key=lambda v: v, reverse=True)
Dict - c.ReduceFuncs.Dict(c.item("key"), c.item("x"))
DictArray
DictSum
DictSumOrNone
DictMax
DictMin
DictCount
DictCountDistinct
DictFirst
DictLast

and any reduce function of two arguments you pass in c.reduce.
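As a reminder of what "a reduce function of two arguments" means, here is the standard-library analog using functools.reduce. It deliberately does not call c.reduce, whose exact parameters are not covered in this description. The `multiply` function and the sample rows are made up for illustration.

```python
from functools import reduce


def multiply(accumulated, value):
    """Two-argument reducer: combines the running result with the next value."""
    return accumulated * value


rows = [{"x": 2}, {"x": 3}, {"x": 4}]

# fold the "x" values into a single product, starting from 1
product = reduce(multiply, (row["x"] for row in rows), 1)
print(product)  # 24
```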

Contrib / Table - stream processing of table-like data

The Table helper lets you massage CSVs and table-like data:

join / zip / chain tables
take / drop / rename columns
filter rows
update / update_all values

from convtools.contrib.tables import Table
from convtools import conversion as c

# reads Iterable of rows
(
    Table.from_rows([(0, -1), (1, 2)], header=["a", "b"]).join(
        Table
        # reads tab-separated CSV file
        .from_csv(
            "tests/csvs/ac.csv",
            header=True,
            dialect=Table.csv_dialect(delimiter="\t"),
        )
        # transform column values
        .update(
            a=c.col("a").as_type(float),
            c=c.col("c").as_type(int),
        )
        # filter rows by condition
        .filter(c.col("c") >= 0),
        # joins on column "a" values
        on=["a"],
        how="inner",
    )
    # rearrange columns
    .take(..., "a")
    # this is a generator to consume (tuple, list are supported too)
    .into_iter_rows(dict)
)
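The following standard-library sketch mirrors the steps of the Table example above: read tab-separated rows, cast values, filter, inner join on "a", and emit dicts with "a" moved last. The contents of tests/csvs/ac.csv are not shown here, so the `TSV` text is hypothetical, as are the helper names. Putting "a" last is one reading of `take(..., "a")`, not confirmed behavior.

```python
import csv
import io

# hypothetical tab-separated content standing in for a file on disk
TSV = "a\tc\n1\t10\n2\t-5\n1\t30\n"


def read_right_rows(text):
    """Stream rows from tab-separated text, cast values, keep rows with c >= 0."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    for raw in reader:
        row = {"a": float(raw["a"]), "c": int(raw["c"])}
        if row["c"] >= 0:
            yield row


def inner_join_on_a(left_rows, right_rows):
    """Inner join on column "a"; yields dicts with "a" as the last key."""
    right_by_a = {}
    for right in right_rows:
        right_by_a.setdefault(right["a"], []).append(right)
    for left in left_rows:
        # left rows without a match are dropped (inner join)
        for right in right_by_a.get(left["a"], []):
            yield {"b": left["b"], "c": right["c"], "a": left["a"]}


left = [{"a": 0, "b": -1}, {"a": 1, "b": 2}]
joined = list(inner_join_on_a(left, read_right_rows(TSV)))
print(joined)  # [{'b': 2, 'c': 10, 'a': 1}, {'b': 2, 'c': 30, 'a': 1}]
```

Only the right-hand side is indexed in memory. The left rows are streamed through the generator one at a time.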

Is it any different from tools like Pandas / Polars?

convtools doesn’t wrap data in any container; it just writes and runs the code which performs the conversion you defined
convtools is a lightweight library with no dependencies (the optional black package is highly recommended, though, for pretty-printing generated code when debugging)
convtools is about defining and reusing conversions, which is a declarative approach, whereas wrapping data in high-performance containers is closer to an imperative one
convtools supports nested aggregations
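The idea of "writing and running the code" can be illustrated with a toy generator. This is a minimal sketch of runtime code generation in general, not convtools' actual implementation. `gen_item_getter` is a hypothetical helper that builds Python source for a function, compiles it once, and returns the resulting callable.

```python
def gen_item_getter(*keys):
    """Build, compile and return a function that extracts the given keys."""
    body = ", ".join(f"{key!r}: data[{key!r}]" for key in keys)
    source = f"def converter(data):\n    return {{{body}}}\n"
    namespace = {}
    # compile the generated source once; the result is an ordinary function
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["converter"], source


converter, source = gen_item_getter("id", "name")
print(source)
print(converter({"id": 1, "name": "Nick", "age": 18}))  # {'id': 1, 'name': 'Nick'}
```

The data passes through as plain dicts. The only thing produced at definition time is a function specialized for the requested keys.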

Is this thing debuggable?

Despite being compiled at runtime, it is (by both pdb and pydevd).
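One general way to keep runtime-generated code inspectable is to register its source with the standard linecache module, which pdb and tracebacks use to look up source lines. The sketch below shows that technique only. Whether convtools does it this way is an assumption, and `compile_debuggable` is a hypothetical helper.

```python
import linecache


def compile_debuggable(source, func_name, filename):
    """Compile generated source and register it so debuggers can list it."""
    # linecache entry layout: (size, mtime, lines, fullname); mtime=None keeps
    # linecache.checkcache from discarding the entry
    linecache.cache[filename] = (
        len(source),
        None,
        source.splitlines(keepends=True),
        filename,
    )
    namespace = {}
    exec(compile(source, filename, "exec"), namespace)
    return namespace[func_name]


SOURCE = "def add_one(value):\n    return value + 1\n"
add_one = compile_debuggable(SOURCE, "add_one", "<generated_add_one>")

print(add_one(41))  # 42
# the source line is now available to anything that reads through linecache
print(linecache.getline("<generated_add_one>", 2), end="")
```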



