EnhancedPolars

Enhanced utilities for Polars DataFrames, providing pandas-like convenience while maintaining Polars' blazing-fast performance.

Features

  • Unified Namespace (epl) - Access all extensions through df.epl.*
  • Pandas-like Indexing - loc, iloc, at, iat accessors
  • Enhanced Merging - Automatic dtype resolution and asof joins
  • ML Pipeline Tools - Standardization, encoding, and preprocessing
  • SQL Integration - Direct DataFrame upload to databases
  • Time Series Utilities - Interpolation, boundaries, and cohort analysis
  • Statistical Functions - Hypothesis tests and descriptive statistics
  • Full LazyFrame Support - All operations work with lazy evaluation

Installation

pip install EnhancedPolars

Or with uv:

uv add EnhancedPolars

Quick Start

import polars as pl
from enhancedpolars import epl
from enhancedpolars.register import *  # Register the 'epl' namespace

# Read data with automatic type optimization
df = epl.read_csv("data.csv", cleanup=True)

# Pandas-like indexing
subset = df.epl.loc[0:10, ["col1", "col2"]]
value = df.epl.at[0, "col1"]

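# Enhanced merging with automatic dtype resolution
# (other_df is a placeholder for a second DataFrame that shares the "key" column)
result = df.epl.merge(other_df, on="key", how="left")

# Time-aware asof join
# (trades_df and prices_df are placeholder DataFrames, each with a "timestamp" column)
trades = trades_df.epl.merge_asof(prices_df, on="timestamp", strategy="backward")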

# GroupBy with pandas-style syntax
summary = df.epl.groupby(["category"]).agg({
    "value": ["mean", "sum", "max"],
    "count": "sum"
})

# ML preprocessing
df_ready, metadata = df.epl.make_ml_ready(
    target_col="label",
    default_numeric_scaler="StandardScaler"
)

# SQL upload (connection is a placeholder for an already-open database connection)
df.epl.to_sql(connection, "table_name", if_exists="replace")

Documentation

  • Getting Started - Installation and quick start guide
  • API Reference - Complete API documentation
  • Examples - Practical usage examples
  • Contributing - Contributing guidelines and style guide

Key Modules

Indexing

df.epl.loc[0]                      # First row
df.epl.loc[0:5, ["a", "b"]]        # Slice with columns
df.epl.iloc[0, 0]                  # Position-based access
df.epl.at[0, "column"]             # Single value access

Merging

df.epl.merge(right, on="key", how="left")
df.epl.merge_asof(right, on="timestamp", by="group")
df.epl.concat(df2, df3, how="vertical")
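
To make the "backward" asof strategy concrete, here is a minimal plain-Python sketch of the matching rule: each left timestamp is paired with the most recent right row at or before it. This is not EnhancedPolars' implementation; the function name asof_backward and the sample data are hypothetical, and both time lists are assumed to be sorted in ascending order.

```python
from bisect import bisect_right


def asof_backward(left_times, right_times, right_values):
    """For each left time, return the value of the latest right row
    whose time is <= the left time, or None if no such row exists.

    Hypothetical helper for illustration; right_times must be sorted ascending.
    """
    matched = []
    for t in left_times:
        # Number of right rows with time <= t.
        i = bisect_right(right_times, t)
        matched.append(right_values[i - 1] if i else None)
    return matched


# Illustrative data only.
trade_times = [1, 5, 10]
price_times = [0, 4, 6, 11]
prices = [100.0, 101.0, 102.0, 103.0]

matched = asof_backward(trade_times, price_times, prices)
print(matched)
```

A `by` argument, as in the call above, would apply the same rule separately within each group.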

GroupBy

df.epl.groupby(["col"]).agg({"value": ["mean", "sum"]})
df.epl.groupby(["col"]).apply(custom_function)
df.epl.groupby(["col"]).ffill(columns=["value"])
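
The dictionary passed to agg maps each column to one or more aggregation names. The following plain-Python sketch shows one way such a spec could be expanded into one output per (column, function) pair. It is an illustration only: the helper group_agg, the AGG_FUNCS table, and the "column_function" output naming are assumptions, not the library's documented behavior.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical lookup from aggregation name to a Python callable.
AGG_FUNCS = {"mean": mean, "sum": sum, "max": max}


def group_agg(rows, key, spec):
    """Group a list of dict rows by `key` and apply a pandas-style agg spec."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    result = {}
    for group_key, members in groups.items():
        out = {}
        for column, funcs in spec.items():
            # Accept either a single name ("sum") or a list of names.
            names = [funcs] if isinstance(funcs, str) else funcs
            values = [m[column] for m in members]
            for name in names:
                # Output naming is an assumption made for this sketch.
                out[f"{column}_{name}"] = AGG_FUNCS[name](values)
        result[group_key] = out
    return result


rows = [
    {"category": "a", "value": 1.0, "count": 2},
    {"category": "a", "value": 3.0, "count": 1},
    {"category": "b", "value": 5.0, "count": 4},
]
summary = group_agg(rows, "category", {"value": ["mean", "sum"], "count": "sum"})
print(summary)
```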

Interpolation

df.epl.ffill(columns=["value"])
df.epl.bfill(columns=["value"])
df.epl.interpolate(columns=["value"], method="linear")
df.epl.fillna(columns=["value"], value=0)
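
As a rough guide to how these fill strategies differ, the sketch below applies forward fill, backward fill, and linear interpolation to a plain Python list with None gaps. The helper functions are hypothetical stand-ins written for this illustration, and the edge-case behavior (leading and trailing gaps) of the epl methods may differ.

```python
def ffill(values):
    """Carry the last seen non-missing value forward."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out


def bfill(values):
    """Carry the next non-missing value backward."""
    return ffill(values[::-1])[::-1]


def interpolate_linear(values):
    """Fill gaps between two known points on a straight line.

    Gaps before the first or after the last known value are left as None.
    """
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for lo, hi in zip(known, known[1:]):
        step = (values[hi] - values[lo]) / (hi - lo)
        for i in range(lo + 1, hi):
            out[i] = values[lo] + step * (i - lo)
    return out


series = [None, 1.0, None, None, 4.0, None]
print(ffill(series))
print(bfill(series))
print(interpolate_linear(series))
```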

ML Pipeline

df.epl.standardize(columns=["numeric_col"])
df.epl.clip_and_impute(columns=["value"], impute_strategy="median")
df.epl.make_ml_ready(target_col="label")

# Series-level operations
series.epl.scale_encode(path="scaler.joblib", scaler_type="StandardScaler")
series.epl.isnull()  # Handles both null and NaN
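
Two ideas from this section can be sketched in plain Python: z-score standardization, and a missing-value check that treats both null (None) and NaN as missing. The helpers standardize and is_missing are hypothetical. The sketch uses the population standard deviation, which is what scikit-learn's StandardScaler uses; whether EnhancedPolars does the same is an assumption.

```python
import math
from statistics import fmean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def is_missing(value):
    """True for None (null) and for float NaN, False otherwise."""
    return value is None or (isinstance(value, float) and math.isnan(value))


z_scores = standardize([1.0, 2.0, 3.0])
print(z_scores)
print([is_missing(v) for v in [None, float("nan"), 0.0]])
```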

SQL

df.epl.to_sql(connection, "table_name", if_exists="replace", batch_size=10000)
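
The batch_size and if_exists="replace" arguments suggest a drop-and-recreate step followed by chunked inserts. The sketch below shows that general pattern with the standard-library sqlite3 module. It is an illustration under assumptions, not the library's implementation: upload_rows is a hypothetical helper, and table and column names are interpolated directly into the SQL, so it is only suitable for trusted identifiers.

```python
import sqlite3
from itertools import islice


def upload_rows(conn, table, columns, rows, batch_size=10000, replace=True):
    """Insert rows in batches; optionally drop the table first.

    Hypothetical helper. Identifiers are interpolated directly into the SQL,
    so pass trusted table and column names only.
    """
    cols = ", ".join(columns)
    if replace:
        conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({cols})")

    placeholders = ", ".join("?" for _ in columns)
    insert_sql = f"INSERT INTO {table} ({cols}) VALUES ({placeholders})"

    iterator = iter(rows)
    batches = 0
    # Pull at most batch_size rows at a time until the iterator is exhausted.
    while batch := list(islice(iterator, batch_size)):
        conn.executemany(insert_sql, batch)
        batches += 1
    conn.commit()
    return batches


conn = sqlite3.connect(":memory:")
rows = [(i, f"name{i}") for i in range(25)]
n_batches = upload_rows(conn, "table_name", ["id", "name"], rows, batch_size=10)
row_count = conn.execute("SELECT COUNT(*) FROM table_name").fetchone()[0]
print(n_batches, row_count)
```

Batching keeps memory use bounded for large frames, at the cost of more round trips to the database.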

Requirements

Core Dependencies

  • Python 3.12+
  • polars >= 1.33.0
  • numpy >= 2.3.0
  • pandas >= 2.3.0
  • pyarrow >= 21.0.0
  • Plus: tqdm, python-dateutil, joblib

Optional Dependencies

Install with extras for additional functionality:

# For scientific computing (interpolation, hypothesis tests)
pip install "EnhancedPolars[sci]"

# For ML preprocessing (scalers, encoders)
pip install "EnhancedPolars[ml]"

# Install all optional dependencies
pip install "EnhancedPolars[all]"

  • [sci] - Scientific computing: scipy >= 1.16.0
  • [ml] - Machine learning: scikit-learn >= 1.7.0

License

This project is licensed under the MIT License - see the LICENSE file for details.


Author

@Ruppert20


AI Authorship Disclaimer

This package was developed with the assistance of LLM-based coding tools (Claude Code by Anthropic). AI tools were used for the following activities:

  • Code authorship - Implementation of utilities, functions, and classes
  • Test development - Creation of comprehensive unit tests
  • Documentation - Generation of NumPy-style docstrings and README content
  • Code review - Identification of bugs, edge cases, and improvements

Users should evaluate the code for their specific use cases and report any issues through the GitHub issue tracker.


Contributing

See the Contributing Guide for guidelines on:

  • Development setup
  • Code style
  • Testing requirements
  • Pull request process
