Enhanced utilities for Polars DataFrames

EnhancedPolars

Enhanced utilities for Polars DataFrames that add pandas-like convenience on top of Polars' performance.

Features

  • Unified Namespace (epl) - Access all extensions through df.epl.*
  • Pandas-like Indexing - loc, iloc, at, iat accessors
  • Enhanced Merging - Automatic dtype resolution and asof joins
  • ML Pipeline Tools - Standardization, encoding, and preprocessing
  • SQL Integration - Direct DataFrame upload to databases
  • Time Series Utilities - Interpolation, boundaries, and cohort analysis
  • Statistical Functions - Hypothesis tests and descriptive statistics
  • Full LazyFrame Support - All operations work with lazy evaluation

Installation

pip install EnhancedPolars

Or with uv:

uv add EnhancedPolars

Quick Start

import polars as pl
from enhancedpolars import epl
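from enhancedpolars.register import *  # Register the 'epl' namespace

# Placeholders used below: other_df, trades_df, and prices_df stand for
# DataFrames you have already loaded, and connection for an open database connection.
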
# Read data with automatic type optimization
df = epl.read_csv("data.csv", cleanup=True)

# Pandas-like indexing
subset = df.epl.loc[0:10, ["col1", "col2"]]
value = df.epl.at[0, "col1"]

# Enhanced merging with automatic dtype resolution
result = df.epl.merge(other_df, on="key", how="left")

# Time-aware asof join
trades = trades_df.epl.merge_asof(prices_df, on="timestamp", strategy="backward")

# GroupBy with pandas-style syntax
summary = df.epl.groupby(["category"]).agg({
    "value": ["mean", "sum", "max"],
    "count": "sum"
})

# ML preprocessing
df_ready, metadata = df.epl.make_ml_ready(
    target_col="label",
    default_numeric_scaler="StandardScaler"
)

# SQL upload
df.epl.to_sql(connection, "table_name", if_exists="replace")

Documentation

  • Getting Started - Installation and quick start guide
  • API Reference - Complete API documentation
  • Examples - Practical usage examples
  • Contributing - Contributing guidelines and style guide

Key Modules

Indexing

df.epl.loc[0]                      # First row
df.epl.loc[0:5, ["a", "b"]]        # Slice with columns
df.epl.iloc[0, 0]                  # Position-based access
df.epl.at[0, "column"]             # Single value access

Merging

df.epl.merge(right, on="key", how="left")
df.epl.merge_asof(right, on="timestamp", by="group")
df.epl.concat(df2, df3, how="vertical")
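
A backward asof join, as requested by strategy="backward" in the Quick Start, conventionally pairs each left row with the most recent right row whose key is less than or equal to the left key. The pure-Python sketch below illustrates that matching rule only. The function name asof_backward and the sample data are hypothetical, and nothing here describes how EnhancedPolars or Polars implement the join.

```python
from bisect import bisect_right


def asof_backward(left_keys, right_keys, right_values):
    """For each left key, return the value of the last right row with key <= left key.

    right_keys must be sorted ascending. None is returned where no such row exists.
    """
    matched = []
    for key in left_keys:
        # bisect_right returns how many right keys are <= key.
        pos = bisect_right(right_keys, key)
        matched.append(right_values[pos - 1] if pos > 0 else None)
    return matched


# Hypothetical data: trade timestamps matched against earlier price quotes.
trade_times = [1, 5, 10]
quote_times = [2, 4, 9]
quote_prices = [100.0, 101.5, 99.0]

prices_at_trade = asof_backward(trade_times, quote_times, quote_prices)
print(prices_at_trade)
```

The first trade has no earlier quote, so it stays unmatched; a grouped variant (the by= argument) would apply the same rule separately within each group.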

GroupBy

df.epl.groupby(["col"]).agg({"value": ["mean", "sum"]})
df.epl.groupby(["col"]).apply(custom_function)
df.epl.groupby(["col"]).ffill(columns=["value"])
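
The dictionary passed to agg maps a column name to one aggregation name or a list of names. As a rough illustration of what such a spec means, the sketch below expands it over plain Python rows. The helper group_agg, the AGG_FUNCS table, and the "<column>_<agg>" output naming are all assumptions made for this example, not the library's behavior.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mapping from aggregation names to plain Python functions.
AGG_FUNCS = {"mean": mean, "sum": sum, "max": max, "min": min}


def group_agg(rows, by, spec):
    """Group dict rows by the `by` column and apply a pandas-style agg spec.

    spec maps a column name to one aggregation name or a list of names.
    Output columns are named "<column>_<agg>" (a choice made for this sketch).
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row)

    result = []
    for group_key, members in groups.items():
        out = {by: group_key}
        for column, aggs in spec.items():
            # Accept both "sum" and ["mean", "sum"] forms.
            names = [aggs] if isinstance(aggs, str) else aggs
            values = [m[column] for m in members]
            for name in names:
                out[f"{column}_{name}"] = AGG_FUNCS[name](values)
        result.append(out)
    return result


rows = [
    {"category": "a", "value": 1.0},
    {"category": "a", "value": 3.0},
    {"category": "b", "value": 10.0},
]
summary = group_agg(rows, "category", {"value": ["mean", "sum"]})
print(summary)
```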

Interpolation

df.epl.ffill(columns=["value"])
df.epl.bfill(columns=["value"])
df.epl.interpolate(columns=["value"], method="linear")
df.epl.fillna(columns=["value"], value=0)
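
For readers unfamiliar with the terms, forward fill carries the last known value into each gap, while linear interpolation places gap values on a straight line between the neighbouring known points. The pure-Python sketch below shows both on a small list. The function names and the assumption of evenly spaced points are illustrative only and do not reflect the library's implementation.

```python
def ffill(values):
    """Forward fill: replace each None with the most recent non-None value."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def interpolate_linear(values):
    """Fill interior None gaps on a straight line between the known neighbours.

    Points are treated as evenly spaced; leading and trailing gaps stay None.
    """
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out


series = [None, 1.0, None, None, 4.0, None]
print(ffill(series))
print(interpolate_linear(series))
```

A backward fill is the same idea applied to the reversed list.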

ML Pipeline

df.epl.standardize(columns=["numeric_col"])
df.epl.clip_and_impute(columns=["value"], impute_strategy="median")
df.epl.make_ml_ready(target_col="label")

# Series-level operations
series.epl.scale_encode(path="scaler.joblib", scaler_type="StandardScaler")
series.epl.isnull()  # Handles both null and NaN
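
Two ideas in this module can be shown without the library: a missing-value check that treats both null (None) and floating-point NaN as missing, and z-score standardization. The sketch below is pure Python with hypothetical helper names. It uses the sample standard deviation, which is an assumption of this example; scikit-learn's StandardScaler uses the population standard deviation instead.

```python
import math
from statistics import mean, stdev


def is_missing(value):
    """True for None and for float NaN, the two 'missing' markers a float column can hold."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def standardize(values):
    """Z-score the non-missing values: (x - mean) / sample standard deviation.

    Missing entries are passed through as None.
    """
    present = [v for v in values if not is_missing(v)]
    mu, sigma = mean(present), stdev(present)
    return [None if is_missing(v) else (v - mu) / sigma for v in values]


column = [1.0, 2.0, None, float("nan"), 3.0]
missing_mask = [is_missing(v) for v in column]
scaled = standardize(column)
print(missing_mask)
print(scaled)
```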

SQL

df.epl.to_sql(connection, "table_name", if_exists="replace", batch_size=10000)
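
The batch_size and if_exists arguments suggest a chunked upload that can replace an existing table. As a generic illustration of that pattern, the sketch below uses the standard-library sqlite3 module. The function upload_in_batches, the table name, and the untyped columns are hypothetical, and the actual behavior of df.epl.to_sql may differ.

```python
import sqlite3
from itertools import islice


def upload_in_batches(conn, table, columns, rows, batch_size=10000, if_exists="replace"):
    """Insert rows into `table` in chunks of `batch_size` and return the batch count.

    With if_exists="replace" the table is dropped and recreated first.
    Table and column names are interpolated directly, so they must be trusted.
    """
    col_list = ", ".join(columns)
    if if_exists == "replace":
        conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_list})")

    placeholders = ", ".join("?" for _ in columns)
    insert_sql = f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})"

    row_iter = iter(rows)
    batches = 0
    while True:
        # Pull at most batch_size rows so memory use stays bounded.
        batch = list(islice(row_iter, batch_size))
        if not batch:
            break
        conn.executemany(insert_sql, batch)
        batches += 1
    conn.commit()
    return batches


conn = sqlite3.connect(":memory:")
rows = [(i, f"name_{i}") for i in range(25)]
n_batches = upload_in_batches(conn, "people", ["id", "name"], rows, batch_size=10)
row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(n_batches, row_count)
```

Values are bound through placeholders rather than formatted into the SQL string, which keeps row data safe even when it contains quotes.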

Requirements

Core Dependencies

  • Python 3.12+
  • polars >= 1.33.0
  • numpy >= 2.3.0
  • pandas >= 2.3.0
  • pyarrow >= 21.0.0
  • Plus: tqdm, python-dateutil, joblib

Optional Dependencies

Install with extras for additional functionality:

# For scientific computing (interpolation, hypothesis tests)
pip install "EnhancedPolars[sci]"

# For ML preprocessing (scalers, encoders)
pip install "EnhancedPolars[ml]"

# Install all optional dependencies
pip install "EnhancedPolars[all]"

  • [sci] - Scientific computing: scipy >= 1.16.0
  • [ml] - Machine learning: scikit-learn >= 1.7.0

License

This project is licensed under the MIT License - see the LICENSE file for details.


Author

@Ruppert20


AI Authorship Disclaimer

This package was developed with the assistance of LLM-based coding tools (Claude Code by Anthropic). AI tools were used for the following activities:

  • Code authorship - Implementation of utilities, functions, and classes
  • Test development - Creation of comprehensive unit tests
  • Documentation - Generation of NumPy-style docstrings and README content
  • Code review - Identification of bugs, edge cases, and improvements

Users should evaluate the code for their specific use cases and report any issues through the GitHub issue tracker.


Contributing

See Contributing Guide for guidelines on:

  • Development setup
  • Code style
  • Testing requirements
  • Pull request process

Download files

Release 0.0.2 was published by release.yml on ruppert20/PolarsUtils using Trusted Publishing (uploaded via twine/6.1.0 CPython/3.13.7).

Source Distribution

enhancedpolars-0.0.2.tar.gz (279.3 kB)

Algorithm Hash digest
SHA256 bd160dc9b23eb7bc196648bb734b90098d453878dcf276218fe9f54cc1a302ca
MD5 a7998425d8c3b4466df255cffa23c691
BLAKE2b-256 5d36358c248bfc840777b034302d69393a49a39512ea669e214b3b5b9d946fe8

Built Distribution

enhancedpolars-0.0.2-py3-none-any.whl (170.6 kB, Python 3)

Algorithm Hash digest
SHA256 abc432a8990059a2b5f77a5be764ae8de65ddb0a7b20cf32337529451e29153d
MD5 c20a00d75ea489a0bfe26158f50af93d
BLAKE2b-256 e87bc6aa30275090bb5f774b2d891cdbc8fd04ccb4e51bbe76ab06b3fa5adba8