dynamic, declarative data transformations with automatic code generation

convtools

convtools is a Python library that simplifies data transformations by letting you define them declaratively. It then generates the necessary Python code behind the scenes, saving you time and effort.


Installation

pip install convtools

Documentation

convtools.readthedocs.io

Group by example

from convtools import conversion as c

input_data = [
    {"a": 5, "b": "foo"},
    {"a": 10, "b": "foo"},
    {"a": 10, "b": "bar"},
    {"a": 10, "b": "bar"},
    {"a": 20, "b": "bar"},
]

conv = (
    c.group_by(c.item("b"))
    .aggregate(
        {
            "b": c.item("b"),
            "a_first": c.ReduceFuncs.First(c.item("a")),
            "a_max": c.ReduceFuncs.Max(c.item("a")),
        }
    )
    .pipe(
        c.aggregate({
            "b_values": c.ReduceFuncs.Array(c.item("b")),
            "mode_a_first": c.ReduceFuncs.Mode(c.item("a_first")),
            "median_a_max": c.ReduceFuncs.Median(c.item("a_max")),
        })
    )
    .gen_converter()
)

assert conv(input_data) == {
    'b_values': ['foo', 'bar'],
    'mode_a_first': 10,
    'median_a_max': 15.0
}

Built-in reducers like c.ReduceFuncs.First

* Sum
* SumOrNone
* Max
* MaxRow
* Min
* MinRow
* Count
* CountDistinct
* First
* Last
* Average
* Median
* Percentile
* Mode
* TopK
* Array
* ArrayDistinct
* ArraySorted

Dict reducers are in fact aggregations themselves, because values get reduced.
* Dict
* DictArray
* DictSum
* DictSumOrNone
* DictMax
* DictMin
* DictCount
* DictCountDistinct
* DictFirst
* DictLast

And lastly, you can define your own reducer by passing any reduce function of two arguments to ``c.reduce``.

What's the point if there are tools like Pandas / Polars?

  • convtools doesn't need to wrap data in a container to provide functionality; it simply runs the Python code it generates on any input
  • convtools is lightweight (black is an optional dependency, though highly recommended if you are curious to pretty-print the generated code)
  • convtools fosters building pipelines on top of iterators, allowing for stream processing
  • convtools supports nested aggregations
  • convtools is a set of primitives for code generation, so it's simply a different kind of tool

Reporting a Security Vulnerability

See the security policy.
