Transform columns of pandas DataFrames with element-wise, column-aggregate, and column-threaded operations

transform-tabular

Apply a single function element-wise across an entire DataFrame, or a selection of its columns.

Two special markers extend this to column-aware operations:

  • ColumnwiseValue(func) — computes a scalar aggregate per column (e.g. mean, max). The scalar is then used in the element-wise expression, so every row in that column sees the same aggregate.
  • ColumnwiseThread(func) — applies a list-to-list transformation per column (e.g. sort, cumulative sum). Each row receives the corresponding element from the transformed list.

This package is a Python port of the Wolfram Language resource function TransformTabular.

Usage

from transform_tabular import transform_tabular, ColumnwiseValue, ColumnwiseThread
import pandas as pd

Syntax

transform_tabular(df, func)                # apply func element-wise to all columns
transform_tabular(df, func, columns)       # apply func only to selected columns
transform_tabular(func)                    # operator form: returns a reusable transformer
transform_tabular(func, columns)           # operator form with column selection

Parameters

Parameter   Type        Description
df          DataFrame   Input DataFrame
func        callable    Function applied element-wise to each cell. May reference ColumnwiseValue / ColumnwiseThread markers.
columns     optional    Column selection: a name (str), index (int), list of names/indices, or slice. Defaults to all columns.

func is applied to each cell of the selected columns; columns outside the selection are returned unchanged. The optional columns argument accepts a single column name or index, a list of names or indices, or a slice.
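To make the accepted selection forms concrete, here is a minimal plain-Python sketch of how such a selector could be normalised to a list of column names. The helper `resolve_columns` is hypothetical and is not part of the package. The rule that integers are treated as positions is an assumption, and it would be ambiguous for DataFrames whose column labels are themselves integers.

```python
def resolve_columns(all_columns, selection=None):
    """Hypothetical helper: turn a column selection into a list of names.

    all_columns is a list of column names in DataFrame order.
    """
    if selection is None:
        # No selection means every column.
        return list(all_columns)
    if isinstance(selection, slice):
        # A slice selects columns by position.
        return list(all_columns[selection])
    if isinstance(selection, (str, int)):
        # A single name or index is treated as a one-element list.
        selection = [selection]
    # Assumption: integers are positions, anything else is a column name.
    return [all_columns[s] if isinstance(s, int) else s for s in selection]


cols = ["a", "b", "c"]
print(resolve_columns(cols))               # ['a', 'b', 'c']
print(resolve_columns(cols, "b"))          # ['b']
print(resolve_columns(cols, 2))            # ['c']
print(resolve_columns(cols, ["a", 2]))     # ['a', 'c']
print(resolve_columns(cols, slice(0, 2)))  # ['a', 'b']
```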

Basic transformation

Increment all numeric columns by 1:

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
transform_tabular(df, lambda x: x + 1)
#    a  b
# 0  2  5
# 1  3  6
# 2  4  7

Column selection

Transform only specific columns:

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30], "c": [100, 200, 300]})
transform_tabular(df, lambda x: x * 2, ["a", "c"])
#    a   b    c
# 0  2  10  200
# 1  4  20  400
# 2  6  30  600

ColumnwiseValue (column-level aggregation)

ColumnwiseValue(func) wraps a function func(column_as_list) -> scalar. The scalar is pre-computed per column and then participates in the element-wise arithmetic — every row in a given column sees that column's aggregate.

Subtract the mean from each element (centering):

df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [10, 20, 30, 40, 50]})
cv_mean = ColumnwiseValue(lambda col: sum(col) / len(col))
transform_tabular(df, lambda x: x - cv_mean)
#      x     y
# 0 -2.0 -20.0
# 1 -1.0 -10.0
# 2  0.0   0.0
# 3  1.0  10.0
# 4  2.0  20.0
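The semantics described above can be sketched without the package, using a dict of lists in place of a DataFrame. This is only an illustration of the "one scalar per column" idea and not the package's implementation. The helper name `apply_with_column_value` and its `combine` parameter are hypothetical.

```python
def apply_with_column_value(table, aggregate, combine):
    """Hypothetical helper illustrating ColumnwiseValue-style semantics.

    table     : dict mapping column name -> list of values
    aggregate : function(list) -> scalar, evaluated once per column
    combine   : function(cell, scalar) -> new cell value
    """
    result = {}
    for name, values in table.items():
        scalar = aggregate(values)  # same scalar for every row of this column
        result[name] = [combine(v, scalar) for v in values]
    return result


table = {"x": [1, 2, 3, 4, 5], "y": [10, 20, 30, 40, 50]}
centered = apply_with_column_value(
    table,
    aggregate=lambda col: sum(col) / len(col),
    combine=lambda cell, mean: cell - mean,
)
print(centered)
# {'x': [-2.0, -1.0, 0.0, 1.0, 2.0], 'y': [-20.0, -10.0, 0.0, 10.0, 20.0]}
```

For this particular case, plain pandas gives the same numbers with `df - df.mean()`; the marker form is more useful when the aggregate is an arbitrary Python function.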

ColumnwiseThread (column-level transformation)

ColumnwiseThread(func) wraps a function func(column_as_list) -> list_of_same_length. The transformation is pre-computed per column and each row receives its corresponding element from the resulting list.

Compute a cumulative sum for each column independently:

from itertools import accumulate

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
ct_acc = ColumnwiseThread(lambda col: list(accumulate(col)))
transform_tabular(df, lambda x: ct_acc)
#    a   b
# 0  1   4
# 1  3   9
# 2  6  15
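As a rough illustration of the list-to-list contract, the following plain-Python sketch applies a transformation to each column of a dict of lists and hands each row its element of the result. The helper `apply_with_column_thread` is hypothetical, and the length check is an assumption about what a sensible implementation would enforce, not documented behaviour of the package.

```python
from itertools import accumulate


def apply_with_column_thread(table, transform):
    """Hypothetical helper illustrating ColumnwiseThread-style semantics.

    table     : dict mapping column name -> list of values
    transform : function(list) -> list of the same length
    """
    result = {}
    for name, values in table.items():
        threaded = list(transform(values))
        # Assumption: the transformed list must line up row-for-row.
        if len(threaded) != len(values):
            raise ValueError(f"transform changed the length of column {name!r}")
        result[name] = threaded
    return result


table = {"a": [1, 2, 3], "b": [4, 5, 6]}
print(apply_with_column_thread(table, accumulate))
# {'a': [1, 3, 6], 'b': [4, 9, 15]}
print(apply_with_column_thread(table, lambda col: sorted(col, reverse=True)))
# {'a': [3, 2, 1], 'b': [6, 5, 4]}
```

For cumulative sums specifically, pandas already provides `df.cumsum()`; the marker is aimed at transformations that have no built-in equivalent.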

Combined ColumnwiseValue and ColumnwiseThread

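Both markers can be used together. For example, compute the cumulative sum of each column and then subtract the mean of the original column (the aggregate is taken over the input values, not over the cumulative sums):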

from itertools import accumulate

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
ct_acc = ColumnwiseThread(lambda col: list(accumulate(col)))
cv_mean = ColumnwiseValue(lambda col: sum(col) / len(col))
transform_tabular(df, lambda x: ct_acc - cv_mean)
#      a     b
# 0 -1.0  -1.0
# 1  1.0   4.0
# 2  4.0  10.0
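The output above can be checked by hand with a short plain-Python sketch. It assumes, as the printed numbers suggest, that both markers are evaluated on the original column values; this reading is inferred from the example output and is not taken from the package source.

```python
from itertools import accumulate

table = {"a": [1, 2, 3], "b": [4, 5, 6]}

result = {}
for name, values in table.items():
    running = list(accumulate(values))  # ColumnwiseThread-style: one value per row
    mean = sum(values) / len(values)    # ColumnwiseValue-style: mean of the ORIGINAL values
    result[name] = [r - mean for r in running]

print(result)
# {'a': [-1.0, 1.0, 4.0], 'b': [-1.0, 4.0, 10.0]}
```

Had the mean been taken over the cumulative sums instead, column a would have been centred on 10/3 rather than 2, which does not match the documented output.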

Operator form

transform_tabular can be curried to produce a reusable transformer:

double_all = transform_tabular(lambda x: x * 2)
double_all(pd.DataFrame({"a": [1, 2], "b": [3, 4]}))
#    a  b
# 0  2  6
# 1  4  8
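One common way to support both call styles is to inspect the first argument: if it is callable, return a closure that waits for the data. The sketch below shows that pattern on a dict of lists. The function `elementwise` is a hypothetical stand-in, and whether transform_tabular dispatches this way internally is an assumption.

```python
def elementwise(first, *rest):
    """Hypothetical stand-in showing data-first and operator-form calls.

    elementwise(table, func) -> transformed table
    elementwise(func)        -> reusable transformer awaiting a table
    """
    if callable(first):
        # Operator form: remember the function, wait for the data.
        func = first
        return lambda table: elementwise(table, func, *rest)
    # Data-first form: apply the function to every cell.
    table, func = first, rest[0]
    return {name: [func(v) for v in values] for name, values in table.items()}


double_all = elementwise(lambda x: x * 2)
print(double_all({"a": [1, 2], "b": [3, 4]}))
# {'a': [2, 4], 'b': [6, 8]}
print(elementwise({"a": [1, 2]}, lambda x: x + 1))
# {'a': [2, 3]}
```

The check works here because a dict is not callable; the same holds for a pandas DataFrame, so a first-argument test of this kind can tell the two forms apart.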

See also

For further examples and details, see the documentation for the original Wolfram Language resource function: TransformTabular.

Author

Daniele Gregori

License

MIT
