dynamic, declarative data transformations with automatic code generation

convtools

convtools is a specialized Python library designed for defining data transformations dynamically using a declarative approach. It automatically generates custom Python code for the user in the background.

Installation

pip install convtools

Documentation

convtools.readthedocs.io

Group by example

from convtools import conversion as c

input_data = [
    {"a": 5, "b": "foo"},
    {"a": 10, "b": "foo"},
    {"a": 10, "b": "bar"},
    {"a": 10, "b": "bar"},
    {"a": 20, "b": "bar"},
]

conv = (
    c.group_by(c.item("b"))
    .aggregate(
        {
            "b": c.item("b"),
            "a_first": c.ReduceFuncs.First(c.item("a")),
            "a_max": c.ReduceFuncs.Max(c.item("a")),
        }
    )
    .pipe(
        c.aggregate({
            "b_values": c.ReduceFuncs.Array(c.item("b")),
            "mode_a_first": c.ReduceFuncs.Mode(c.item("a_first")),
            "median_a_max": c.ReduceFuncs.Median(c.item("a_max")),
        })
    )
    .gen_converter()
)

assert conv(input_data) == {
    'b_values': ['foo', 'bar'],
    'mode_a_first': 10,
    'median_a_max': 15.0
}

Built-in reducers like c.ReduceFuncs.First

* Sum
* SumOrNone
* Max
* MaxRow
* Min
* MinRow
* Count
* CountDistinct
* First
* Last
* Average
* Median
* Percentile
* Mode
* TopK
* Array
* ArrayDistinct
* ArraySorted

Dict reducers are in fact aggregations themselves, because values get reduced:
* Dict
* DictArray
* DictSum
* DictSumOrNone
* DictMax
* DictMin
* DictCount
* DictCountDistinct
* DictFirst
* DictLast

And lastly, you can define your own reducer by passing any reduce function of two arguments to c.reduce.

What's the point if there are tools like Pandas / Polars?

  • convtools doesn't need to wrap data in a container to provide functionality; it simply runs the Python code it generates on any input
  • convtools is lightweight (black is optional, but highly recommended if you are curious to pretty-print the generated code)
  • convtools fosters building pipelines on top of iterators, allowing for stream processing
  • convtools supports nested aggregations
  • convtools is a set of primitives for code generation, so it's just different.

Reporting a Security Vulnerability

See the security policy.
