Enhanced utilities for Polars DataFrames

EnhancedPolars

Enhanced utilities for Polars DataFrames, providing pandas-like convenience while maintaining Polars' blazing-fast performance.

Features

  • Unified Namespace (epl) - Access all extensions through df.epl.*
  • Pandas-like Indexing - loc, iloc, at, iat accessors
  • Enhanced Merging - Automatic dtype resolution and asof joins
  • ML Pipeline Tools - Standardization, encoding, and preprocessing
  • SQL Integration - Direct DataFrame upload to databases
  • Time Series Utilities - Interpolation, boundaries, and cohort analysis
  • Statistical Functions - Hypothesis tests and descriptive statistics
  • Full LazyFrame Support - All operations work with lazy evaluation

Installation

pip install EnhancedPolars

Or with uv:

uv add EnhancedPolars

Quick Start

import polars as pl
from enhancedpolars import epl
from enhancedpolars.register import *  # Register the 'epl' namespace

# Read data with automatic type optimization
df = epl.read_csv("data.csv", cleanup=True)

# Pandas-like indexing
subset = df.epl.loc[0:10, ["col1", "col2"]]
value = df.epl.at[0, "col1"]

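# Enhanced merging with automatic dtype resolution
# (other_df is a second DataFrame that shares the "key" column)
result = df.epl.merge(other_df, on="key", how="left")

# Time-aware asof join
# (trades_df and prices_df are DataFrames that both have a "timestamp" column)
trades = trades_df.epl.merge_asof(prices_df, on="timestamp", strategy="backward")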

# GroupBy with pandas-style syntax
summary = df.epl.groupby(["category"]).agg({
    "value": ["mean", "sum", "max"],
    "count": "sum"
})

# ML preprocessing
df_ready, metadata = df.epl.make_ml_ready(
    target_col="label",
    default_numeric_scaler="StandardScaler"
)

# SQL upload
df.epl.to_sql(connection, "table_name", if_exists="replace")

Documentation

  • Getting Started - Installation and quick start guide
  • API Reference - Complete API documentation
  • Examples - Practical usage examples
  • Contributing - Contributing guidelines and style guide

Key Modules

Indexing

df.epl.loc[0]                      # First row
df.epl.loc[0:5, ["a", "b"]]        # Slice with columns
df.epl.iloc[0, 0]                  # Position-based access
df.epl.at[0, "column"]             # Single value access

Merging

df.epl.merge(right, on="key", how="left")
df.epl.merge_asof(right, on="timestamp", by="group")
df.epl.concat(df2, df3, how="vertical")
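
As a rough illustration of what a backward as-of match means, the sketch below pairs each left row with the most recent right row whose key is less than or equal to its own. It uses only the Python standard library and shows the general semantics of as-of joins, not EnhancedPolars' implementation; the function name `asof_backward` and the sample data are made up for this example.

```python
from bisect import bisect_right


def asof_backward(left_keys, right_keys, right_values):
    """For each left key, return the value of the last right row with key <= left key.

    right_keys must be sorted ascending. None is returned when no right row qualifies.
    """
    matched = []
    for key in left_keys:
        # bisect_right returns how many right keys are <= key
        pos = bisect_right(right_keys, key)
        matched.append(right_values[pos - 1] if pos > 0 else None)
    return matched


# Hypothetical trade and price timestamps
trade_times = [1, 5, 10]
price_times = [2, 4, 9]
prices = [100.0, 101.5, 99.0]
print(asof_backward(trade_times, price_times, prices))  # [None, 101.5, 99.0]
```

The first trade has no earlier price, so it stays unmatched. A `by="group"` argument would, in the usual as-of join semantics, run the same search separately within each group.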

GroupBy

df.epl.groupby(["col"]).agg({"value": ["mean", "sum"]})
df.epl.groupby(["col"]).apply(custom_function)
df.epl.groupby(["col"]).ffill(columns=["value"])
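
The dictionary form of `agg` maps each column to one or more aggregation names. A minimal pure-Python sketch of how such a spec could be expanded is shown below. It is an illustration only: the helper `groupby_agg`, the `REDUCERS` table, and the `column_aggregation` output naming are assumptions, and the names EnhancedPolars actually produces may differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mapping from aggregation names to plain-Python reducers
REDUCERS = {"mean": mean, "sum": sum, "max": max, "min": min}


def groupby_agg(rows, by, spec):
    """Group a list of dict rows by `by` and apply each named reducer in `spec`.

    spec maps a column name to one reducer name or a list of them,
    e.g. {"value": ["mean", "sum"], "count": "sum"}.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row)

    result = []
    for key, members in groups.items():
        out = {by: key}
        for column, names in spec.items():
            if isinstance(names, str):
                names = [names]  # allow a single name as shorthand
            values = [m[column] for m in members]
            for name in names:
                out[f"{column}_{name}"] = REDUCERS[name](values)
        result.append(out)
    return result


rows = [
    {"category": "a", "value": 1.0, "count": 2},
    {"category": "a", "value": 3.0, "count": 1},
    {"category": "b", "value": 5.0, "count": 4},
]
print(groupby_agg(rows, "category", {"value": ["mean", "sum"], "count": "sum"}))
```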

Interpolation

df.epl.ffill(columns=["value"])
df.epl.bfill(columns=["value"])
df.epl.interpolate(columns=["value"], method="linear")
df.epl.fillna(columns=["value"], value=0)
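
For readers unfamiliar with the difference between forward fill and linear interpolation, here is a small standard-library sketch of both on a plain list. It illustrates the general techniques under the assumption that leading and trailing gaps are left unfilled; it is not the package's code, and the function names are invented for the example.

```python
def ffill(values):
    """Replace each None with the most recent non-None value (leading Nones stay)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def interpolate_linear(values):
    """Fill interior runs of None on a straight line between their known neighbours.

    Leading and trailing Nones are left untouched because they have only one neighbour.
    """
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for lo, hi in zip(known, known[1:]):
        step = (out[hi] - out[lo]) / (hi - lo)
        for i in range(lo + 1, hi):
            out[i] = out[lo] + step * (i - lo)
    return out


series = [None, 1.0, None, None, 4.0, None]
print(ffill(series))               # [None, 1.0, 1.0, 1.0, 4.0, 4.0]
print(interpolate_linear(series))  # [None, 1.0, 2.0, 3.0, 4.0, None]
```

Backward fill is the same idea as `ffill` applied to the reversed sequence.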

ML Pipeline

df.epl.standardize(columns=["numeric_col"])
df.epl.clip_and_impute(columns=["value"], impute_strategy="median")
df.epl.make_ml_ready(target_col="label")

# Series-level operations
series.epl.scale_encode(path="scaler.joblib", scaler_type="StandardScaler")
series.epl.isnull()  # Handles both null and NaN
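
The sketch below shows, in plain Python, the arithmetic behind standardization and a clip-then-impute step. It is a hedged illustration rather than the package's implementation: the function names are invented, the population standard deviation matches what scikit-learn's `StandardScaler` uses, and the order "clip first, then fill nulls with the median" is an assumption.

```python
from statistics import mean, median, pstdev


def fit_standardizer(values):
    """Return the mean and population standard deviation of the non-null values."""
    observed = [v for v in values if v is not None]
    return {"mean": mean(observed), "std": pstdev(observed)}


def standardize(values, params):
    """Apply z = (x - mean) / std, leaving nulls as None."""
    scale = params["std"] or 1.0  # avoid dividing by zero for constant columns
    return [None if v is None else (v - params["mean"]) / scale for v in values]


def clip_and_impute_median(values, lower, upper):
    """Clip non-null values to [lower, upper], then fill nulls with the clipped median."""
    clipped = [None if v is None else min(max(v, lower), upper) for v in values]
    fill = median(v for v in clipped if v is not None)
    return [fill if v is None else v for v in clipped]


params = fit_standardizer([2.0, 4.0, None, 6.0])
print(standardize([2.0, 4.0, None, 6.0], params))
print(clip_and_impute_median([1.0, None, 100.0, 3.0], lower=0.0, upper=10.0))
```

Keeping the fitted parameters separate from the transform is what allows the same scaling to be reapplied to new data, which is presumably why `scale_encode` accepts a `path` for a saved scaler.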

SQL

df.epl.to_sql(connection, "table_name", if_exists="replace", batch_size=10000)
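
To show what `if_exists="replace"` and `batch_size` typically mean for an upload, here is a minimal sketch using the standard library's `sqlite3`. It is not how EnhancedPolars talks to a database; the helper `upload_rows` is hypothetical, and the `"append"` mode is borrowed from the pandas convention as an assumption.

```python
import sqlite3


def upload_rows(conn, table, columns, rows, if_exists="replace", batch_size=10_000):
    """Write rows to `table` in batches; if_exists="replace" drops the table first."""
    # Table and column names are interpolated into SQL, so they must be trusted identifiers.
    if if_exists == "replace":
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    column_list = ", ".join(f'"{c}"' for c in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_list})')
    placeholders = ", ".join("?" for _ in columns)
    insert_sql = f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})'
    # Send the rows in fixed-size chunks instead of one large statement
    for start in range(0, len(rows), batch_size):
        conn.executemany(insert_sql, rows[start:start + batch_size])
    conn.commit()


conn = sqlite3.connect(":memory:")
rows = [(i, f"name_{i}") for i in range(25)]
upload_rows(conn, "table_name", ["id", "name"], rows, batch_size=10)
upload_rows(conn, "table_name", ["id", "name"], rows[:5], if_exists="append")
print(conn.execute('SELECT COUNT(*) FROM "table_name"').fetchone()[0])  # 30
```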

Requirements

Core Dependencies

  • Python 3.12+
  • polars >= 1.33.0
  • numpy >= 2.3.0
  • pandas >= 2.3.0
  • pyarrow >= 21.0.0
  • Plus: tqdm, python-dateutil, joblib

Optional Dependencies

Install with extras for additional functionality:

# For scientific computing (interpolation, hypothesis tests)
pip install "EnhancedPolars[sci]"

# For ML preprocessing (scalers, encoders)
pip install "EnhancedPolars[ml]"

# Install all optional dependencies
pip install "EnhancedPolars[all]"

  • [sci] - Scientific computing: scipy >= 1.16.0
  • [ml] - Machine learning: scikit-learn >= 1.7.0
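
One way to check which optional packages are present in the current environment is sketched below with the standard library only. The `OPTIONAL_MODULES` table and `available_extras` helper are illustrative names, not part of EnhancedPolars; `sklearn` is the import name of scikit-learn.

```python
from importlib.util import find_spec

# Import names of the optional packages listed above
OPTIONAL_MODULES = {"sci": "scipy", "ml": "sklearn"}


def available_extras():
    """Map each extra name to True if its package can be imported in this environment."""
    return {extra: find_spec(module) is not None for extra, module in OPTIONAL_MODULES.items()}


print(available_extras())
```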

License

This project is licensed under the MIT License - see the LICENSE file for details.


Author

@Ruppert20


AI Authorship Disclaimer

This package was developed with the assistance of LLM-based coding tools (Claude Code by Anthropic). AI tools were used for the following activities:

  • Code authorship - Implementation of utilities, functions, and classes
  • Test development - Creation of comprehensive unit tests
  • Documentation - Generation of NumPy-style docstrings and README content
  • Code review - Identification of bugs, edge cases, and improvements

Users should evaluate the code for their specific use cases and report any issues through the GitHub issue tracker.


Contributing

See the Contributing Guide for guidelines on:

  • Development setup
  • Code style
  • Testing requirements
  • Pull request process

Download files

Source Distribution

enhancedpolars-0.0.4.tar.gz (285.2 kB)

Uploaded Source

Built Distribution

enhancedpolars-0.0.4-py3-none-any.whl (171.1 kB)

Uploaded Python 3

File details

Details for the file enhancedpolars-0.0.4.tar.gz.

File metadata

  • Download URL: enhancedpolars-0.0.4.tar.gz
  • Upload date:
  • Size: 285.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for enhancedpolars-0.0.4.tar.gz
Algorithm Hash digest
SHA256 9ff30a697fc87bd3afa87584b214b6f59cb377cf480d33083202afc8bd430cca
MD5 8a4472721a8526c350fb54f32eed41af
BLAKE2b-256 7bfcd648ffca2942cbef163fef464051697536f3be14652def58b1dc8f21e9b6

Provenance

The following attestation bundles were made for enhancedpolars-0.0.4.tar.gz:

Publisher: release.yml on ruppert20/PolarsUtils

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file enhancedpolars-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: enhancedpolars-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 171.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for enhancedpolars-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 6937accfe240a63ae2d53d0cbd5ab3d7e08c615cfcc0bb3a4ae0f0407946675a
MD5 f1993dacfe52407f2cc919da1c12a680
BLAKE2b-256 023ae534ee2c596e35cf2fc3d62fdd1bdb256aca1140c4e5f13c9a7dae10e2c2

Provenance

The following attestation bundles were made for enhancedpolars-0.0.4-py3-none-any.whl:

Publisher: release.yml on ruppert20/PolarsUtils

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
