
A Python package for using Polars from configuration

Polars as Config

This library allows you to define Polars operations using a configuration format (JSON or Python dict), making it easy to serialize, store, and share data processing pipelines.

Quick Start

from polars_as_config.config import run_config

# Define your operations in a config
config = {
    "steps": [
        # Read a CSV file
        {"operation": "scan_csv", "kwargs": {"source": "data.csv"}},

        # Add a new column by joining two string columns
        {
            "operation": "with_columns",
            "kwargs": {
                "full_name": {
                    "expr": "str.concat",
                    "on": {"expr": "col", "kwargs": {"name": "first_name"}},
                    "kwargs": {
                        "delimiter": " ",
                        "other": {"expr": "col", "kwargs": {"name": "last_name"}}
                    }
                }
            }
        }
    ]
}

# Run the config
result = run_config(config)
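
Since the config is plain data, it can be round-tripped through JSON for storage or sharing. The sketch below uses only the standard library and a one-step config trimmed from the Quick Start; it does not call `run_config`, so it runs without Polars or a CSV file.

```python
import json

# Same shape as the Quick Start config, trimmed to a single step.
config = {
    "steps": [
        {"operation": "scan_csv", "kwargs": {"source": "data.csv"}},
    ]
}

# Serialize to a JSON string (this could equally be written to a file).
serialized = json.dumps(config, indent=2)

# Load it back; the result is a dict equal to the original.
restored = json.loads(serialized)
print(restored == config)  # True
```

Note that JSON has no comment syntax, so the `#` comments used in the Python dict examples do not survive serialization.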

Config Format

The config describes a pipeline as a list of steps under the "steps" key, each naming the exact function to execute. Each step in the config:

  1. Defines its operation in the "operation" key
  2. Provides arguments in the "kwargs" key
  3. Can use expressions to define complex operations
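
The step shape described above can be checked before a config is run. The function below is a hypothetical helper, not part of polars-as-config; it assumes that "operation" must be a string and that "kwargs", when present, must be a dict.

```python
def check_steps(config: dict) -> list[str]:
    """Return a list of problems found; an empty list means the shape looks right.

    Hypothetical helper: the rules here are assumptions drawn from the
    examples in this README, not the library's own validation.
    """
    steps = config.get("steps")
    if not isinstance(steps, list):
        return ['"steps" must be a list']

    problems = []
    for i, step in enumerate(steps):
        if not isinstance(step, dict):
            problems.append(f"step {i}: must be a dict")
            continue
        if not isinstance(step.get("operation"), str):
            problems.append(f'step {i}: missing "operation" string')
        if not isinstance(step.get("kwargs", {}), dict):
            problems.append(f'step {i}: "kwargs" must be a dict')
    return problems


good = {"steps": [{"operation": "scan_csv", "kwargs": {"source": "data.csv"}}]}
bad = {"steps": [{"kwargs": {}}]}
print(check_steps(good))  # []
print(check_steps(bad))   # ['step 0: missing "operation" string']
```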

Basic Operations

# Reading a CSV file
config = {
    "steps": [
        {"operation": "scan_csv", "kwargs": {"source": "data.csv"}}
    ]
}

# Filtering rows
config = {
    "steps": [
        {
            "operation": "filter",
            "kwargs": {
                "predicate": {
                    "expr": "gt",
                    "on": {"expr": "col", "kwargs": {"name": "age"}},
                    "kwargs": {"other": 18}
                }
            }
        }
    ]
}

String Operations

# String concatenation
config = {
    "steps": [
        {
            "operation": "with_columns",
            "kwargs": {
                "full_name": {
                    "expr": "str.concat",
                    "on": {"expr": "col", "kwargs": {"name": "first"}},
                    "kwargs": {
                        "delimiter": "-",
                        "other": {"expr": "col", "kwargs": {"name": "last"}}
                    }
                }
            }
        }
    ]
}

# String slicing
config = {
    "steps": [
        {
            "operation": "with_columns",
            "kwargs": {
                "sliced": {
                    "expr": "str.slice",
                    "on": {"expr": "col", "kwargs": {"name": "text"}},
                    "kwargs": {
                        "offset": 1,
                        "length": 2
                    }
                }
            }
        }
    ]
}

Date Operations

# Converting strings to datetime
config = {
    "steps": [
        {
            "operation": "with_columns",
            "kwargs": {
                "parsed_date": {
                    "expr": "str.to_datetime",
                    "on": {"expr": "col", "kwargs": {"name": "date_str"}},
                    "kwargs": {
                        "format": "%Y-%m-%d %H:%M%#z"
                    }
                }
            }
        }
    ]
}
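
The nested dicts in these examples repeat the same pattern, so they can be generated with small helper functions. The `expr` and `col` helpers below are hypothetical conveniences, not part of polars-as-config; they only build plain dicts in the shape used throughout this README.

```python
from typing import Any, Optional


def expr(func: str, /, *, on: Optional[dict] = None, **kwargs: Any) -> dict:
    """Build an expression dict. `func` is positional-only so that
    keyword arguments such as `name=` land in "kwargs"."""
    node: dict = {"expr": func}
    if on is not None:
        node["on"] = on
    if kwargs:
        node["kwargs"] = kwargs
    return node


def col(name: str) -> dict:
    """Shorthand for a column reference."""
    return expr("col", name=name)


# Rebuild the datetime-parsing step from the example above.
config = {
    "steps": [
        {
            "operation": "with_columns",
            "kwargs": {
                "parsed_date": expr(
                    "str.to_datetime",
                    on=col("date_str"),
                    format="%Y-%m-%d %H:%M%#z",
                )
            },
        }
    ]
}
```

The result is an ordinary dict, so it can be serialized or passed along like any hand-written config.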

Expression Format

Expressions are defined using three keys:

  1. expr: The name of the expression function (e.g., "str.concat", "eq", "gt")
  2. on: The expression to apply the operation to (like "self" in Python)
  3. kwargs: Arguments for the expression

# Example: x > 5 in polars: pl.col("x").gt(5)
{
    "expr": "gt",
    "on": {"expr": "col", "kwargs": {"name": "x"}},
    "kwargs": {"other": 5}
}

# Example: str1 + "-" + str2 in polars: pl.col("str1").str.concat("-", pl.col("str2"))
{
    "expr": "str.concat",
    "on": {"expr": "col", "kwargs": {"name": "str1"}},
    "kwargs": {
        "delimiter": "-",
        "other": {"expr": "col", "kwargs": {"name": "str2"}}
    }
}
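
To make the roles of the three keys concrete, here is a minimal sketch of how such a dict could be turned into a chain of calls. It is an illustration only, not the library's actual implementation. `resolve` and `Recorder` are hypothetical names: `Recorder` stands in for the `polars` module and records the call chain as text, so the example runs without Polars installed.

```python
from typing import Any


def resolve(node: Any, root: Any) -> Any:
    """Recursively evaluate an expression dict against `root`."""
    if not (isinstance(node, dict) and "expr" in node):
        return node  # plain literal such as 5 or "-"
    # Nested expressions inside kwargs are resolved first.
    kwargs = {k: resolve(v, root) for k, v in node.get("kwargs", {}).items()}
    # "on" plays the role of self; without it, start from the root namespace.
    target = resolve(node["on"], root) if "on" in node else root
    # Dotted names such as "str.concat" are walked one attribute at a time.
    for part in node["expr"].split("."):
        target = getattr(target, part)
    return target(**kwargs)


class Recorder:
    """Stand-in for the polars module that records calls as text."""

    def __init__(self, text: str = "pl") -> None:
        self.text = text

    def __getattr__(self, name: str) -> "Recorder":
        return Recorder(f"{self.text}.{name}")

    def __call__(self, **kwargs: Any) -> "Recorder":
        args = ", ".join(f"{k}={v!r}" for k, v in kwargs.items())
        return Recorder(f"{self.text}({args})")

    def __repr__(self) -> str:
        return self.text


gt_node = {
    "expr": "gt",
    "on": {"expr": "col", "kwargs": {"name": "x"}},
    "kwargs": {"other": 5},
}
print(resolve(gt_node, Recorder()))  # pl.col(name='x').gt(other=5)
```

Every argument is passed by keyword in this sketch, which is why the recorded call reads `gt(other=5)` rather than `gt(5)`.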

Installation

pip install polars-as-config

Requirements

  • Polars

License

See LICENSE for details.
