Polars Expression Transformer
Transform string-based expressions into Polars DataFrame operations. Write simple, SQL-like expressions and let the library convert them to optimized Polars code.
Quick Start
import polars as pl
from polars_expr_transformer import simple_function_to_expr
df = pl.DataFrame({
    'first_name': ['John', 'Jane', 'Bob'],
    'last_name': ['Doe', 'Smith', 'Johnson'],
    'age': [30, 25, 45],
    'salary': [50000, 60000, 75000]
})
# Concatenate columns
df.select(simple_function_to_expr('concat([first_name], " ", [last_name])').alias('full_name'))
# Conditional logic
df.select(simple_function_to_expr('if [age] > 30 then "Senior" else "Junior" endif').alias('level'))
# Math operations
df.select(simple_function_to_expr('[salary] * 1.1').alias('new_salary'))
# Combine multiple operations
df.select(simple_function_to_expr('uppercase(left([last_name], 3))').alias('code'))
Installation
pip install polars-expr-transformer
Why Use This Library?
| Use Case | Recommendation |
|---|---|
| Building applications with user-defined transformations | ✅ Yes - Users can write expressions without Python knowledge |
| SQL/Tableau users transitioning to Polars | ✅ Yes - Familiar syntax |
| Need a simple expression language for configs | ✅ Yes - Easy to serialize and store |
| Writing performance-critical Polars code | ❌ No - Use Polars directly |
| Need all Polars features | ❌ No - This covers common operations only |
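Because expressions are plain strings, a set of transformations can be kept in a JSON config and loaded back unchanged. The sketch below uses only the standard library. The config layout (output column name mapped to expression string) and the names `TRANSFORMS` and `load_transforms` are assumptions made for illustration; they are not part of the library.

```python
import json

# Hypothetical config: output column name -> expression string.
TRANSFORMS = {
    "full_name": 'concat([first_name], " ", [last_name])',
    "level": 'if [age] > 30 then "Senior" else "Junior" endif',
    "new_salary": "[salary] * 1.1",
}


def load_transforms(raw: str) -> dict[str, str]:
    """Parse a JSON document and check that every value is an expression string."""
    config = json.loads(raw)
    for name, expression in config.items():
        if not isinstance(expression, str):
            raise TypeError(f"expression for {name!r} must be a string")
    return config


# Round-trip: the expressions survive serialization unchanged.
serialized = json.dumps(TRANSFORMS, indent=2)
restored = load_transforms(serialized)
```

Each restored expression could then be passed to `simple_function_to_expr` and aliased to its key, as in the Quick Start.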
Expression Syntax
Column References
Reference DataFrame columns using square brackets:
'[column_name]' # Reference a column
'[Column With Spaces]' # Columns with spaces work too
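When expressions come from users, it can help to check the referenced columns against the DataFrame before building the expression. The following is a rough sketch, not the library's parser: the helper names are hypothetical, and the regular expression does not distinguish brackets that appear inside string literals.

```python
import re

# Matches text between square brackets, e.g. [age] or [Column With Spaces].
COLUMN_REF = re.compile(r"\[([^\[\]]+)\]")


def referenced_columns(expression: str) -> list[str]:
    """Return the bracketed names in order of first appearance, without duplicates."""
    seen: dict[str, None] = {}
    for name in COLUMN_REF.findall(expression):
        seen.setdefault(name, None)
    return list(seen)


def missing_columns(expression: str, available: list[str]) -> list[str]:
    """Return referenced names that are not in the available column list."""
    return [name for name in referenced_columns(expression) if name not in available]
```

In practice `available` would be `df.columns`, and a non-empty result could be reported to the user before any expression is built.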
Operators
| Operator | Description | Example |
|---|---|---|
| + | Addition | [a] + [b] |
| - | Subtraction | [a] - 10 |
| * | Multiplication | [price] * [quantity] |
| / | Division | [total] / [count] |
| % | Modulo | [value] % 2 |
| = or == | Equals | [status] = "active" |
| != | Not equals | [type] != "deleted" |
| >, >=, <, <= | Comparisons | [age] >= 18 |
| and | Logical AND | [a] > 0 and [b] > 0 |
| or | Logical OR | [x] = 1 or [y] = 1 |
Conditional Expressions
# Simple if-then-else
'if [age] >= 18 then "Adult" else "Minor" endif'
# Multiple conditions with elseif
'if [score] >= 90 then "A" elseif [score] >= 80 then "B" elseif [score] >= 70 then "C" else "F" endif'
# Nested conditions
'if [type] = "A" then (if [value] > 100 then "High A" else "Low A" endif) else "Other" endif'
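Long `elseif` chains are tedious to write by hand, and since expressions are ordinary strings they can be generated. The helper below is a hypothetical sketch (the name `build_threshold_expr` is not part of the library); it only assembles the string using the syntax shown above, with thresholds tested in the order given.

```python
def build_threshold_expr(column: str, thresholds: list[tuple[int, str]], default: str) -> str:
    """Build an if/elseif/else expression; thresholds are checked in the given order."""
    if not thresholds:
        raise ValueError("at least one threshold is required")
    parts = []
    for index, (limit, label) in enumerate(thresholds):
        keyword = "if" if index == 0 else "elseif"
        parts.append(f'{keyword} [{column}] >= {limit} then "{label}"')
    parts.append(f'else "{default}" endif')
    return " ".join(parts)


# Reproduces the grading expression from the elseif example.
grade_expr = build_threshold_expr("score", [(90, "A"), (80, "B"), (70, "C")], "F")
```

The resulting string can be passed to `simple_function_to_expr` like any hand-written expression.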
Comments
# Single-line comments with //
'[column] + 1 // This adds one to the column'
# Multi-line expressions with comments
'''
[price] * [quantity] // Calculate subtotal
- [discount] // Apply discount
'''
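The library handles comments itself, but an application may still want a comment-free form of an expression, for example for logging. The function below is an illustrative sketch only, not the library's implementation: it assumes `//` starts a comment unless it sits inside a double-quoted string, and it does not handle escaped quotes.

```python
def strip_comments(expression: str) -> str:
    """Remove // comments that are outside double-quoted strings and join the lines."""
    cleaned_lines = []
    for line in expression.splitlines():
        in_string = False
        kept = line
        for index, char in enumerate(line):
            if char == '"':
                # Toggle on every double quote; escaped quotes are not handled.
                in_string = not in_string
            elif not in_string and line.startswith("//", index):
                kept = line[:index]
                break
        kept = kept.strip()
        if kept:
            cleaned_lines.append(kept)
    return " ".join(cleaned_lines)
```

Tracking the quote state keeps a literal such as `"http://"` intact while still removing a trailing comment on the same line.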
Available Functions
String Functions
| Function | Description | Example |
|---|---|---|
| concat(a, b, ...) | Concatenate strings | concat([first], " ", [last]) |
| length(text) | String length | length([name]) |
| uppercase(text) | Convert to uppercase | uppercase([code]) |
| lowercase(text) | Convert to lowercase | lowercase([email]) |
| titlecase(text) | Convert to title case | titlecase([name]) |
| left(text, n) | First n characters | left([phone], 3) |
| right(text, n) | Last n characters | right([id], 4) |
| mid(text, start, len) | Substring from position | mid([code], 2, 3) |
| substring(text, start, len) | Alias for mid | substring([text], 0, 10) |
| trim(text) | Remove leading/trailing spaces | trim([input]) |
| left_trim(text) | Remove leading spaces | left_trim([text]) |
| right_trim(text) | Remove trailing spaces | right_trim([text]) |
| replace(text, find, replace) | Replace text | replace([name], ".", "") |
| find_position(text, search) | Find substring position | find_position([text], "@") |
| pad_left(text, len, char) | Pad string on left | pad_left([id], 5, "0") |
| pad_right(text, len, char) | Pad string on right | pad_right([code], 10, " ") |
| starts_with(text, prefix) | Check prefix | starts_with([url], "https") |
| ends_with(text, suffix) | Check suffix | ends_with([file], ".csv") |
| reverse(text) | Reverse string | reverse([text]) |
| repeat(text, n) | Repeat string n times | repeat("*", 5) |
| split(text, delimiter) | Split into list | split([tags], ",") |
| count_match(text, pattern) | Count occurrences | count_match([text], "a") |
| string_similarity(a, b, method) | Similarity score (0-1) | string_similarity([a], [b], "levenshtein") |
Math Functions
| Function | Description | Example |
|---|---|---|
| abs(n) | Absolute value | abs([difference]) |
| round(n, decimals) | Round to decimals | round([price], 2) |
| ceil(n) | Round up | ceil([value]) |
| floor(n) | Round down | floor([value]) |
| power(base, exp) | Exponentiation | power([x], 2) |
| pow(base, exp) | Alias for power | pow(2, [n]) |
| sqrt(n) | Square root | sqrt([area]) |
| log(n) | Natural logarithm | log([value]) |
| log10(n) | Base-10 logarithm | log10([value]) |
| log2(n) | Base-2 logarithm | log2([value]) |
| exp(n) | e^n | exp([rate]) |
| mod(a, b) | Modulo | mod([value], 10) |
| sign(n) | Sign (-1, 0, 1) | sign([change]) |
| negation(n) | Negate value | negation([amount]) |
| sin(n), cos(n), tan(n) | Trigonometric | sin([angle]) |
| asin(n), acos(n), atan(n) | Inverse trig | asin([ratio]) |
| tanh(n) | Hyperbolic tangent | tanh([x]) |
| random_int(min, max) | Random integer | random_int(1, 100) |
Date Functions
| Function | Description | Example |
|---|---|---|
| now() | Current datetime | now() |
| today() | Current date | today() |
| year(date) | Extract year | year([created_at]) |
| month(date) | Extract month (1-12) | month([date]) |
| day(date) | Extract day (1-31) | day([date]) |
| hour(datetime) | Extract hour (0-23) | hour([timestamp]) |
| minute(datetime) | Extract minute | minute([time]) |
| second(datetime) | Extract second | second([time]) |
| week(date) | ISO week number (1-53) | week([date]) |
| weekday(date) | Day of week (1=Mon, 7=Sun) | weekday([date]) |
| dayofweek(date) | Alias for weekday | dayofweek([date]) |
| quarter(date) | Quarter (1-4) | quarter([date]) |
| dayofyear(date) | Day of year (1-366) | dayofyear([date]) |
| add_days(date, n) | Add days | add_days([start], 30) |
| add_weeks(date, n) | Add weeks | add_weeks([date], 2) |
| add_months(date, n) | Add months | add_months([date], 6) |
| add_years(date, n) | Add years | add_years([birth], 18) |
| add_hours(dt, n) | Add hours | add_hours([time], 3) |
| add_minutes(dt, n) | Add minutes | add_minutes([time], 30) |
| add_seconds(dt, n) | Add seconds | add_seconds([time], 60) |
| date_diff_days(a, b) | Days between dates | date_diff_days([end], [start]) |
| datetime_diff_seconds(a, b) | Seconds between | datetime_diff_seconds([a], [b]) |
| format_date(date, fmt) | Format as string | format_date([date], "%Y-%m-%d") |
| start_of_month(date) | First of month | start_of_month([date]) |
| end_of_month(date) | Last of month | end_of_month([date]) |
| date_truncate(date, unit) | Truncate to unit | date_truncate([dt], "1day") |
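The week and weekday conventions in the table follow the ISO calendar, which can be surprising around year boundaries. The sketch below reproduces those conventions with Python's standard `datetime` module so expected values can be checked by hand. It does not call the library, and `date_parts` is a hypothetical helper name.

```python
from datetime import date


def date_parts(value: date) -> dict[str, int]:
    """Compute calendar fields using the ISO conventions listed in the table."""
    _, iso_week, iso_weekday = value.isocalendar()
    return {
        "week": iso_week,                        # ISO week number, 1-53
        "weekday": iso_weekday,                  # 1 = Monday ... 7 = Sunday
        "quarter": (value.month - 1) // 3 + 1,   # 1-4
        "dayofyear": value.timetuple().tm_yday,  # 1-366
    }


new_year_2023 = date_parts(date(2023, 1, 1))
```

January 1, 2023 is a Sunday that belongs to ISO week 52 of the previous year, so `new_year_2023["week"]` is 52 and `new_year_2023["weekday"]` is 7.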
Logic & Null Handling
| Function | Description | Example |
|---|---|---|
| equals(a, b) | Check equality | equals([status], "active") |
| does_not_equal(a, b) | Check inequality | does_not_equal([type], "deleted") |
| is_empty(value) | Check if null | is_empty([email]) |
| is_not_empty(value) | Check if not null | is_not_empty([phone]) |
| coalesce(a, b, ...) | First non-null | coalesce([nickname], [name], "Unknown") |
| ifnull(value, default) | Replace null | ifnull([count], 0) |
| nvl(value, default) | Alias for ifnull | nvl([value], 0) |
| nullif(a, b) | Null if equal | nullif([value], 0) |
| between(val, min, max) | Range check (inclusive) | between([age], 18, 65) |
| greatest(a, b, ...) | Maximum value | greatest([a], [b], [c]) |
| least(a, b, ...) | Minimum value | least([price1], [price2]) |
| contains(text, search) | Contains substring | contains([desc], "sale") |
| _in(value, text) | Value in text | _in("admin", [roles]) |
| _not(value) | Logical NOT | _not([is_deleted]) |
| is_string(value) | Type check | is_string([field]) |
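For readers less familiar with SQL-style null handling, the row-level meaning of `coalesce`, `nullif`, and `between` can be sketched in plain Python, with `None` standing in for null. These definitions are an analogy based on the descriptions in the table, not the library's implementation, which operates on whole columns.

```python
from typing import Any, Optional


def coalesce(*values: Any) -> Optional[Any]:
    """Return the first value that is not None, or None if all are None."""
    for value in values:
        if value is not None:
            return value
    return None


def nullif(a: Any, b: Any) -> Optional[Any]:
    """Return None when a equals b, otherwise a."""
    return None if a == b else a


def between(value: float, low: float, high: float) -> bool:
    """Inclusive range check: both bounds count as inside the range."""
    return low <= value <= high
```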
Type Conversions
| Function | Description | Example |
|---|---|---|
| to_string(value) | Convert to string | to_string([id]) |
| to_integer(value) | Convert to integer | to_integer([count]) |
| to_float(value) | Convert to float | to_float([price]) |
| to_number(value) | Alias for to_float | to_number([value]) |
| to_boolean(value) | Convert to boolean | to_boolean([flag]) |
| to_date(text, format) | Parse date | to_date([date_str], "%Y-%m-%d") |
| to_datetime(text, format) | Parse datetime | to_datetime([ts], "%Y-%m-%d %H:%M:%S") |
| to_decimal(value, precision) | Convert with precision | to_decimal([amount], 2) |
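The format strings passed to `to_date`, `to_datetime`, and `format_date` use strftime-style codes. As a quick reference, the sketch below shows what the two formats from the table mean using Python's standard `datetime` module. It assumes these basic codes carry the same meaning in the library, and it does not call the library itself.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"               # four-digit year, month, day
DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # adds 24-hour time with seconds

# Parsing: text -> date / datetime
parsed_date = datetime.strptime("2024-03-15", DATE_FORMAT).date()
parsed_datetime = datetime.strptime("2024-03-15 08:30:00", DATETIME_FORMAT)

# Formatting: datetime -> text
formatted = parsed_datetime.strftime(DATE_FORMAT)
```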
API Reference
simple_function_to_expr(expression: str) -> pl.Expr
Converts a string expression to a Polars expression.
from polars_expr_transformer import simple_function_to_expr
expr = simple_function_to_expr('[price] * [quantity]')
df.select(expr.alias('total'))
build_func(expression: str) -> Func
Returns the intermediate function object for inspection/debugging.
from polars_expr_transformer import build_func
func = build_func('concat([a], [b])')
print(func.get_readable_pl_function()) # See the Polars translation
get_all_expressions() -> List[str]
Returns a list of all available function names.
from polars_expr_transformer import get_all_expressions
functions = get_all_expressions()
print(functions) # ['concat', 'length', 'uppercase', ...]
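One possible use of this list is suggesting corrections when a user mistypes a function name. The sketch below uses the standard library's `difflib`; the `suggest_function` helper and the hard-coded sample list are assumptions for illustration, and in practice the list would come from `get_all_expressions()`.

```python
import difflib

# Sample of function names from this document; a stand-in for get_all_expressions().
KNOWN_FUNCTIONS = ["concat", "length", "uppercase", "lowercase", "titlecase", "coalesce"]


def suggest_function(name: str, known: list[str]) -> list[str]:
    """Return up to three known function names that closely resemble the input."""
    return difflib.get_close_matches(name, known, n=3, cutoff=0.6)


suggestions = suggest_function("uppercse", KNOWN_FUNCTIONS)
```

The best match is listed first, so `suggestions[0]` is `"uppercase"` here; an input with no close match returns an empty list.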
get_expression_overview() -> List[ExpressionsOverview]
Returns functions grouped by category with descriptions.
from polars_expr_transformer import get_expression_overview
for category in get_expression_overview():
    print(f"\n{category.category}:")
    for expr in category.expressions:
        print(f" {expr.name}: {expr.description}")
Error Handling
The library validates expressions and provides helpful error messages:
# Unbalanced parentheses
simple_function_to_expr('((1)')
# ValueError: Unbalanced parentheses: 1 unclosed '(' found
# Unknown function
simple_function_to_expr('unknown_func([col])')
# Raises error with available functions
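An application that accepts expressions from users may want a cheap check before calling the library, for instance to validate input in a form. The function below is a minimal sketch of such a pre-check and is not the library's validator. The message for unclosed parentheses mirrors the one shown above, while the message for a stray `)` and the handling of double-quoted strings are assumptions.

```python
def check_parentheses(expression: str) -> None:
    """Raise ValueError if parentheses outside double-quoted strings do not balance."""
    depth = 0
    in_string = False
    for char in expression:
        if char == '"':
            in_string = not in_string
        elif in_string:
            continue  # parentheses inside string literals are ignored
        elif char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                # Hypothetical wording; the library's message may differ.
                raise ValueError("Unbalanced parentheses: unexpected ')' found")
    if depth > 0:
        raise ValueError(f"Unbalanced parentheses: {depth} unclosed '(' found")
```

Calling `check_parentheses('((1)')` raises a `ValueError`, while an expression such as `concat([a], "(", [b])` passes because the parenthesis sits inside a string literal.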
Built on Polars
This library is built on top of Polars, a fast DataFrame library written in Rust. All expressions are converted to native Polars operations.
Contributing
Contributions are welcome! Please feel free to submit issues and pull requests on GitHub.
License
MIT License - see LICENSE file for details.
Acknowledgements
Thanks to the Polars team for creating such an amazing library.