Enhanced utilities for Polars DataFrames

EnhancedPolars

Enhanced utilities for Polars DataFrames, providing pandas-like convenience while keeping Polars' performance.

Features

  • Unified Namespace (epl) - Access all extensions through df.epl.*
  • Pandas-like Indexing - loc, iloc, at, iat accessors
  • Enhanced Merging - Automatic dtype resolution and asof joins
  • ML Pipeline Tools - Standardization, encoding, and preprocessing
  • SQL Integration - Direct DataFrame upload to databases
  • Time Series Utilities - Interpolation, boundaries, and cohort analysis
  • Statistical Functions - Hypothesis tests and descriptive statistics
  • Full LazyFrame Support - All operations work with lazy evaluation

Installation

pip install EnhancedPolars

Or with uv:

uv add EnhancedPolars

Quick Start

import polars as pl
from enhancedpolars import epl
from enhancedpolars.register import *  # Register the 'epl' namespace

# Read data with automatic type optimization
df = epl.read_csv("data.csv", cleanup=True)

# Pandas-like indexing
subset = df.epl.loc[0:10, ["col1", "col2"]]
value = df.epl.at[0, "col1"]

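# Enhanced merging with automatic dtype resolution
# (other_df is assumed to be a second DataFrame that shares the "key" column)
result = df.epl.merge(other_df, on="key", how="left")

# Time-aware asof join
# (trades_df and prices_df are assumed to be existing DataFrames with a "timestamp" column)
trades = trades_df.epl.merge_asof(prices_df, on="timestamp", strategy="backward")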

# GroupBy with pandas-style syntax
summary = df.epl.groupby(["category"]).agg({
    "value": ["mean", "sum", "max"],
    "count": "sum"
})

# ML preprocessing
df_ready, metadata = df.epl.make_ml_ready(
    target_col="label",
    default_numeric_scaler="StandardScaler"
)

# SQL upload (connection is assumed to be an already-open database connection)
df.epl.to_sql(connection, "table_name", if_exists="replace")

Documentation

Document          Description
Getting Started   Installation and quick start guide
API Reference     Complete API documentation
Examples          Practical usage examples
Contributing      Contributing guidelines and style guide

Key Modules

Indexing

df.epl.loc[0]                      # First row
df.epl.loc[0:5, ["a", "b"]]        # Slice with columns
df.epl.iloc[0, 0]                  # Position-based access
df.epl.at[0, "column"]             # Single value access
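
One detail worth checking when moving between loc and iloc: in pandas, label-based slices include the end label, while position-based slices exclude the end position. This README does not say which convention df.epl.loc[0:5] follows, so the plain-Python sketch below only illustrates the two conventions. The helpers loc_slice and iloc_slice are hypothetical and are not part of EnhancedPolars.

```python
# Illustration only: two slicing conventions on a plain list of rows.
rows = [{"a": i, "b": i * 10} for i in range(8)]


def iloc_slice(data, start, stop):
    """Position-based slice: `stop` is excluded, like ordinary Python slicing."""
    return data[start:stop]


def loc_slice(data, start, stop):
    """Label-based slice in the pandas sense: `stop` is included."""
    return data[start:stop + 1]


by_position = iloc_slice(rows, 0, 5)
by_label = loc_slice(rows, 0, 5)
print(len(by_position), len(by_label))
```

If the row count matters, a quick check of the result's height after slicing settles which convention applies.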

Merging

df.epl.merge(right, on="key", how="left")
df.epl.merge_asof(right, on="timestamp", by="group")
df.epl.concat(df2, df3, how="vertical")
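
As a rough illustration of what a backward asof match means, the sketch below pairs each left-hand timestamp with the most recent right-hand row at or before it. It is plain Python, not the library's implementation; the function asof_backward and the sample data are made up for this example, and grouping with by is left out.

```python
from bisect import bisect_right


def asof_backward(left_times, right_times, right_values):
    """For each left time, return the value of the latest right row whose
    time is <= the left time, or None when no such row exists.

    `right_times` must be sorted in ascending order.
    """
    matched = []
    for t in left_times:
        idx = bisect_right(right_times, t) - 1  # last index with time <= t
        matched.append(right_values[idx] if idx >= 0 else None)
    return matched


price_times = [1, 5, 10]
price_values = [100.0, 101.5, 99.0]
trade_times = [0, 1, 7, 12]

matched = asof_backward(trade_times, price_times, price_values)
print(matched)
```

The trade at time 0 has no earlier price, so it gets no match; both sides generally need to be sorted on the join key for this kind of join to be meaningful.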

GroupBy

df.epl.groupby(["col"]).agg({"value": ["mean", "sum"]})
df.epl.groupby(["col"]).apply(custom_function)
df.epl.groupby(["col"]).ffill(columns=["value"])
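
The dictionary passed to agg maps each column to one or more aggregation names. A minimal plain-Python sketch of that idea follows; group_agg, AGG_FUNCS, and the "column_function" naming of the outputs are assumptions for illustration, and the names EnhancedPolars actually gives its result columns may differ.

```python
from collections import defaultdict
from statistics import mean

# Aggregation names supported by this sketch (an illustrative subset).
AGG_FUNCS = {"mean": mean, "sum": sum, "max": max, "min": min}


def group_agg(rows, key, spec):
    """Group `rows` by `key` and apply the aggregations listed in `spec`.

    `spec` maps a column name to one aggregation name or a list of names.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    result = {}
    for group_key, members in groups.items():
        aggregated = {}
        for column, funcs in spec.items():
            names = [funcs] if isinstance(funcs, str) else funcs
            values = [member[column] for member in members]
            for name in names:
                aggregated[f"{column}_{name}"] = AGG_FUNCS[name](values)
        result[group_key] = aggregated
    return result


rows = [
    {"category": "a", "value": 1.0, "count": 2},
    {"category": "a", "value": 3.0, "count": 1},
    {"category": "b", "value": 10.0, "count": 5},
]
summary = group_agg(rows, "category", {"value": ["mean", "sum"], "count": "sum"})
print(summary)
```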

Interpolation

df.epl.ffill(columns=["value"])
df.epl.bfill(columns=["value"])
df.epl.interpolate(columns=["value"], method="linear")
df.epl.fillna(columns=["value"], value=0)
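
To make the difference between these fill strategies concrete, here is a small plain-Python sketch on a list with gaps. The helper functions are hypothetical stand-ins, not the library's code. The sketch assumes evenly spaced samples and leaves leading and trailing gaps untouched during interpolation, which may not match how EnhancedPolars treats the edges.

```python
def ffill(values):
    """Carry the last seen value forward into gaps."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def bfill(values):
    """Carry the next seen value backward into gaps."""
    return ffill(values[::-1])[::-1]


def interpolate_linear(values):
    """Fill interior gaps on a straight line between the neighbouring known
    values, assuming evenly spaced samples. Edge gaps stay as None."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


series = [None, 1.0, None, None, 4.0, None]
forward = ffill(series)
backward = bfill(series)
linear = interpolate_linear(series)
print(forward, backward, linear, sep="\n")
```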

ML Pipeline

df.epl.standardize(columns=["numeric_col"])
df.epl.clip_and_impute(columns=["value"], impute_strategy="median")
df.epl.make_ml_ready(target_col="label")

# Series-level operations
series.epl.scale_encode(path="scaler.joblib", scaler_type="StandardScaler")
series.epl.isnull()  # Handles both null and NaN
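
Two ideas sit behind these calls: treating null and NaN alike as missing, and fitting scaling parameters once so they can be reused later (which is presumably why scale_encode takes a path). The sketch below shows both in plain Python under those assumptions. The names is_missing, fit_standardizer, and apply_standardizer are invented for the example, and the population standard deviation is used because that is what scikit-learn's StandardScaler computes.

```python
import math
from statistics import fmean, pstdev


def is_missing(value):
    """True for None and for float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fit_standardizer(values):
    """Compute mean and population standard deviation, ignoring missing values."""
    present = [v for v in values if not is_missing(v)]
    return {"mean": fmean(present), "std": pstdev(present)}


def apply_standardizer(values, params):
    """Apply z-score scaling with previously fitted parameters.

    Missing values are returned as None; a zero standard deviation is
    replaced by 1.0 to avoid division by zero on constant columns.
    """
    scale = params["std"] or 1.0
    return [
        None if is_missing(v) else (v - params["mean"]) / scale
        for v in values
    ]


train = [2.0, 4.0, None, 6.0, float("nan")]
params = fit_standardizer(train)
scaled = apply_standardizer(train, params)
print(params, scaled)
```

Keeping params separate from the data is what allows the same transformation to be applied to new data at prediction time.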

SQL

df.epl.to_sql(connection, "table_name", if_exists="replace", batch_size=10000)
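
The if_exists and batch_size arguments suggest a drop-and-recreate step followed by chunked inserts. The sketch below shows that pattern with the standard-library sqlite3 module. It is an assumption about the general technique, not a description of how to_sql is implemented; upload_in_batches is a hypothetical helper, and table and column names are interpolated directly, so they must come from trusted input.

```python
import sqlite3
from itertools import islice


def upload_in_batches(conn, table, columns, rows, batch_size=10000, if_exists="replace"):
    """Insert `rows` into `table` in chunks of `batch_size`.

    With if_exists="replace" the table is dropped and recreated first;
    any other value appends to an existing table.
    """
    col_list = ", ".join(f'"{c}"' for c in columns)
    if if_exists == "replace":
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_list})')

    placeholders = ", ".join("?" for _ in columns)
    insert_sql = f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})'

    row_iter = iter(rows)
    while True:
        batch = list(islice(row_iter, batch_size))
        if not batch:
            break
        conn.executemany(insert_sql, batch)
        conn.commit()  # commit per batch so a failure loses at most one chunk


conn = sqlite3.connect(":memory:")
rows = [(i, f"name_{i}") for i in range(25)]

upload_in_batches(conn, "table_name", ["id", "name"], rows, batch_size=10)
first_count = conn.execute('SELECT COUNT(*) FROM "table_name"').fetchone()[0]

# A second upload with if_exists="replace" discards the earlier rows.
upload_in_batches(conn, "table_name", ["id", "name"], rows[:5], batch_size=10)
replaced_count = conn.execute('SELECT COUNT(*) FROM "table_name"').fetchone()[0]
print(first_count, replaced_count)
```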

Requirements

Core Dependencies

  • Python 3.13.2+
  • polars >= 1.33.0
  • numpy >= 2.3.0
  • pandas >= 2.3.0
  • pyarrow >= 21.0.0
  • Plus: tqdm, python-dateutil, joblib

Optional Dependencies

Install with extras for additional functionality:

# For scientific computing (interpolation, hypothesis tests)
pip install "EnhancedPolars[sci]"

# For ML preprocessing (scalers, encoders)
pip install "EnhancedPolars[ml]"

# Install all optional dependencies
pip install "EnhancedPolars[all]"

  • [sci] - Scientific computing: scipy >= 1.16.0
  • [ml] - Machine learning: scikit-learn >= 1.7.0
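
To see which extras are usable in the current environment before calling features that depend on them, a standard-library check along these lines can help. It only tests whether the underlying packages (scipy, and scikit-learn under its import name sklearn) can be found; the EXTRAS mapping and available_extras function are illustrative and not part of EnhancedPolars.

```python
from importlib.util import find_spec

# Extra name -> import name of the package it pulls in.
EXTRAS = {"sci": "scipy", "ml": "sklearn"}


def available_extras():
    """Report, per extra, whether its package can be imported here."""
    return {extra: find_spec(module) is not None for extra, module in EXTRAS.items()}


status = available_extras()
for extra, installed in status.items():
    print(f"[{extra}] {'available' if installed else 'missing'}")
```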

License

This project is licensed under the MIT License - see the LICENSE file for details.


Author

@Ruppert20


AI Authorship Disclaimer

This package was developed with the assistance of LLM-based coding tools (Claude Code by Anthropic). AI tools were used for the following activities:

  • Code authorship - Implementation of utilities, functions, and classes
  • Test development - Creation of comprehensive unit tests
  • Documentation - Generation of NumPy-style docstrings and README content
  • Code review - Identification of bugs, edge cases, and improvements

Users should evaluate the code for their specific use cases and report any issues through the GitHub issue tracker.


Contributing

See the Contributing Guide for guidelines on:

  • Development setup
  • Code style
  • Testing requirements
  • Pull request process
