
Convenient DataFrame column operations for Polars.


kra

A set of tools for working with Polars. It provides convenient extensions for DataFrame manipulation, column operations, label encoding, and conversion to and from nested dictionaries.

Installation

Build and install the Rust extension and Python API using maturin:

pip install maturin
maturin develop

Or, for development and testing:

pip install nox
nox

Features

  • DataFrame and Series extensions: Add new methods to Polars DataFrames and Series.
  • Column utilities: Easily rename, check, and transform DataFrame columns.
  • Label encoding: Encode string labels as categorical/integer values.
  • Dict-of-dicts conversion: Convert between DataFrames and nested dictionaries.

Example Use Cases

1. Dict-of-Dicts Conversion

Convert a DataFrame to a dict of dicts using a column as the key:

import polars as pl
import kra

df = pl.DataFrame({
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"]
})

dod = df.to_dod("id")
# {1: {'id': 1, 'name': 'Alice'}, 2: {'id': 2, 'name': 'Bob'}, ...}

# Convert back:
df2 = kra.from_dod(dod, "id")
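
To make the shape of a dict of dicts concrete, here is a minimal standard-library sketch of the same round trip on a list of row dicts. It does not use kra or Polars; the helper names rows_to_dod and dod_to_rows are hypothetical and only illustrate the data layout shown in the comment above.

```python
# Plain-Python illustration of the dict-of-dicts layout (not kra's implementation).
rows = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
    {"id": 3, "name": "Charlie"},
]


def rows_to_dod(rows, key):
    """Index each row by the value stored under `key` (hypothetical helper)."""
    return {row[key]: dict(row) for row in rows}


def dod_to_rows(dod):
    """Recover the rows; the key column is still present in each inner dict."""
    return list(dod.values())


dod = rows_to_dod(rows, "id")
round_trip = dod_to_rows(dod)
print(dod[2])
```

In this sketch, rows that share a key value overwrite one another and the last one wins, so the key column should hold unique values. How kra handles duplicate keys is not stated here.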

2. Column Name Transformations

Transform column names to different cases:

import polars as pl
import kra

df = pl.DataFrame({
    "First Name": [1, 2],
    "Last Name": [3, 4]
})

df_lower = df.cols.to_lowercase()
df_camel = df.cols.to_camelcase()
df_snake = df.cols.to_snakecase()
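
The exact output of each method is not shown above, so the following is only a sketch of what such case conversions typically produce for the column names in the example. The helpers split_words, to_snake, and to_camel are hypothetical and assume that words are separated by whitespace, underscores, or hyphens; kra's own rules may differ.

```python
import re


def split_words(name):
    """Split a column name into words on whitespace, underscores, and hyphens (assumed rule)."""
    return [w for w in re.split(r"[\s_\-]+", name.strip()) if w]


def to_snake(name):
    """Lowercase every word and join with underscores."""
    return "_".join(w.lower() for w in split_words(name))


def to_camel(name):
    """Lowercase the first word and capitalize the rest, with no separator."""
    words = split_words(name)
    if not words:
        return ""
    return words[0].lower() + "".join(w.capitalize() for w in words[1:])


columns = ["First Name", "Last Name"]
snake = [to_snake(c) for c in columns]
camel = [to_camel(c) for c in columns]
print(snake, camel)
```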

3. Label Encoding

Encode string labels as integers:

import polars as pl
import kra

df = pl.DataFrame({
    "label": ["cat", "dog", "cat", "bird"]
})

# Series API
encoded = df["label"].label.encode()

# Expression API (for use in with_columns, etc.)
df2 = df.with_columns(
    pl.col("label").label.encode().alias("encoded_label")
)
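
As a rough illustration of what label encoding does, the sketch below maps each distinct label to an integer code in plain Python. The function encode_labels is hypothetical, and assigning codes in sorted label order is an assumption: the order in which kra's Rust extension assigns codes is not documented here.

```python
def encode_labels(values):
    """Map each distinct label to an integer code (sorted order is an assumption)."""
    classes = sorted(set(values))
    mapping = {label: code for code, label in enumerate(classes)}
    return [mapping[v] for v in values], mapping


labels = ["cat", "dog", "cat", "bird"]
codes, mapping = encode_labels(labels)
print(codes)    # [1, 2, 1, 0]
print(mapping)  # {'bird': 0, 'cat': 1, 'dog': 2}
```

Keeping the mapping alongside the codes lets you decode the integers back to labels later.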

4. DataFrame Utilities

Drop columns of type Null:

import polars as pl
import kra

df = pl.DataFrame({
    "a": [1, 2, 3],
    "b": [None, None, None]
})

df_clean = df.drop_null_cols()
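
The idea can be sketched without Polars on a plain dict of columns. This is an approximation: the hypothetical helper drop_all_none_columns drops any column whose values are all None, whereas the method above is described as dropping columns whose type is Null, which may not be the same thing for a typed column that happens to contain only nulls.

```python
def drop_all_none_columns(columns):
    """Keep only columns with at least one non-None value (approximation of a Null-typed column)."""
    return {
        name: values
        for name, values in columns.items()
        if any(v is not None for v in values)
    }


columns = {
    "a": [1, 2, 3],
    "b": [None, None, None],
}
clean = drop_all_none_columns(columns)
print(list(clean))  # ['a']
```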

5. From Array-like

Create a DataFrame from a numpy array:

import kra
import numpy as np

data = np.array([[1, 2], [3, 4]])
df = kra.from_arraylike(data, schema=["x", "y"], orient="col")
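
The orient argument decides whether each inner sequence is read as a column or as a row. The sketch below shows the difference on nested lists. The helper to_columns is hypothetical, and it assumes kra follows the usual Polars meaning of "col" and "row"; the source does not confirm this.

```python
def to_columns(data, schema, orient):
    """Arrange a 2-D nested sequence into named columns (assumed orient semantics)."""
    if orient == "col":
        # Each inner sequence is already one column.
        cols = [list(c) for c in data]
    elif orient == "row":
        # Each inner sequence is one row, so transpose.
        cols = [list(c) for c in zip(*data)]
    else:
        raise ValueError(f"unknown orient: {orient!r}")
    if len(cols) != len(schema):
        raise ValueError("schema length does not match the number of columns")
    return dict(zip(schema, cols))


data = [[1, 2], [3, 4]]
by_col = to_columns(data, ["x", "y"], orient="col")
by_row = to_columns(data, ["x", "y"], orient="row")
print(by_col)  # {'x': [1, 2], 'y': [3, 4]}
print(by_row)  # {'x': [1, 3], 'y': [2, 4]}
```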

API Reference

  • kra.from_dod: Create DataFrame from dict of dicts.
  • kra.to_dod: Convert DataFrame to dict of dicts.
  • kra.Cols: DataFrame column utilities (access via df.cols).
  • kra.LabelSeries: Series label encoding (access via series.label).
  • kra.LabelExpr: Expression label encoding (access via pl.col(...).label).
  • kra.drop_null_cols: Remove columns of type Null.
  • kra.from_arraylike: Create DataFrame from array-like objects.

For more, see the intro.ipynb notebook.


Rust Extension

kra includes a Rust extension for fast label encoding, accessible via the Python API.


License

MIT License. See LICENSE for details.


