Skip to main content

Rust-accelerated array algorithms for Polars DataFrames.

Project description

Polars Array Algorithms

High-performance array algorithms for Polars, implemented in Rust with Python bindings.

Features

  • Sweep-line algorithm for interval scheduling and resource assignment
  • Optimized Rust implementation with PyO3 bindings
  • Full type hints and comprehensive tests
  • O(n log n) time complexity

Installation

pip install -e .

For optimized release build:

make install-release

Quick Start

import polars as pl
import polars_array_algorithms as pl_alg

df = pl.DataFrame({
    "start": [10, 20, 15, 30],
    "end": [25, 35, 28, 40],
})

result = df.select(
    "start", "end",
    room_id=pl_alg.sweep_line_assignment("start", "end", overlapping=False)
)
print(result)

Output:

shape: (4, 3)
┌───────┬─────┬─────────┐
│ start ┆ end ┆ room_id │
│ ---   ┆ --- ┆ ---     │
│ i64   ┆ i64 ┆ u32     │
╞═══════╪═════╪═════════╡
│ 10    ┆ 25  ┆ 1       │
│ 20    ┆ 35  ┆ 2       │
│ 15    ┆ 28  ┆ 2       │
│ 30    ┆ 40  ┆ 1       │
└───────┴─────┴─────────┘

API Reference

sweep_line_assignment(start, end, overlapping=False) -> Expr

Assigns the minimum number of resources to intervals using a sweep-line algorithm.

Parameters:

  • start: Start times (expression, column name, or Series)
  • end: End times (same type as start)
  • overlapping: If False (default), intervals [start, end) can share resources if endpoints touch. If True, intervals [start, end] need separate resources if endpoints touch.

Returns: UInt32 Series with resource IDs (1-indexed)

Supported types: Any type which can be physically represented as a signed or unsigned integer. (int8-int64, uint8-uint64, Date, Datetime)

Complexity: O(n log n) time, O(n) space

Example:

import polars as pl
import polars_array_algorithms as pl_alg

# Non-overlapping intervals
df = pl.DataFrame({
    "start": [10, 20],
    "end": [20, 30],
})
df.select(room=pl_alg.sweep_line_assignment("start", "end", overlapping=False))
# Returns room=[1, 1] - both intervals can share the same room

# Overlapping intervals
df.select(room=pl_alg.sweep_line_assignment("start", "end", overlapping=True))
# Returns room=[1, 2] - intervals at same endpoint need different rooms

Development

Setup

make venv
make install

Testing

make test

Code Quality

make pre-commit

Building

make install          # Debug build
make install-release  # Optimized build

Project Structure

src/
├── lib.rs              # PyO3 module definition
└── sweep_line.rs       # Algorithm implementation

polars_array_algorithms/
├── __init__.py         # Python API
├── typing.py           # Type definitions
└── _internal.pyi       # Type stubs

tests/                  # Pytest tests
Makefile                # Build commands

Algorithm Details

Sweep-Line Algorithm

Solves the interval scheduling problem by finding the minimum number of resources needed so no two overlapping intervals share the same resource.

How it works:

  1. Create events for interval starts and ends
  2. Sort events by time (departures before arrivals in non-overlapping mode)
  3. Process events, assigning lowest available resource ID or allocating new one
  4. Return resource to pool when interval ends

Tie-breaking:

  • Non-overlapping: departures at time T < arrivals at time T (allows reuse)
  • Overlapping: arrivals at time T < departures at time T (requires separate resources)

Performance

Metric Value
Time Complexity O(n log n)
Space Complexity O(n)
Optimality Always minimum resources

Requirements

  • Python >= 3.8
  • Polars >= 0.52.0
  • Rust (for building)

License

MIT License - see LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

polars_array_algorithms-0.1.1.tar.gz (33.8 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

polars_array_algorithms-0.1.1-cp38-abi3-win_amd64.whl (4.5 MB view details)

Uploaded CPython 3.8+Windows x86-64

polars_array_algorithms-0.1.1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.2 MB view details)

Uploaded CPython 3.8+manylinux: glibc 2.17+ x86-64

polars_array_algorithms-0.1.1-cp38-abi3-macosx_11_0_arm64.whl (4.3 MB view details)

Uploaded CPython 3.8+macOS 11.0+ ARM64

polars_array_algorithms-0.1.1-cp38-abi3-macosx_10_12_x86_64.whl (4.5 MB view details)

Uploaded CPython 3.8+macOS 10.12+ x86-64

File details

Details for the file polars_array_algorithms-0.1.1.tar.gz.

File metadata

File hashes

Hashes for polars_array_algorithms-0.1.1.tar.gz
Algorithm Hash digest
SHA256 4e1d301cddd0708bd3103ce3c0d58a0d1dbb024fb24cafbf203460c54a0ab810
MD5 dea7f9443cbccb40d0d0a5bbb606b45e
BLAKE2b-256 0d45d5c96ccad208a8fe7cd12bd24693250956589b16fb3d69a7d8b8d1a79ec4

See more details on using hashes here.

File details

Details for the file polars_array_algorithms-0.1.1-cp38-abi3-win_amd64.whl.

File metadata

File hashes

Hashes for polars_array_algorithms-0.1.1-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 66426f075501cdc8c02eb946e98ec090faaada0d554d7a42992e0b9924fdf0df
MD5 dbb9fcfaefc649a3691ca4646067959c
BLAKE2b-256 a27c4feda6f6947e4b8c140dea26d807cd0b468f5f6d0de0196cbcb15280031d

See more details on using hashes here.

File details

Details for the file polars_array_algorithms-0.1.1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for polars_array_algorithms-0.1.1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 721841d55c73d2c233a616c9def07516536f7894eeab2a4f16f9b8e25e97442b
MD5 1ce92d5875918103052ae073340b9d2c
BLAKE2b-256 3e01537e9289c7a5ac4b11e2b5e26e4b621ca9eb24ff4b367b62e64f84e5ee2c

See more details on using hashes here.

File details

Details for the file polars_array_algorithms-0.1.1-cp38-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for polars_array_algorithms-0.1.1-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 53a78e236df6a8c2bfc7fb042ba85add3e665f44b15e23b1df249514667b5c1b
MD5 9bee06c9ee9add853873946bfdb2565c
BLAKE2b-256 250290152ed16821c0cded1de955e86d2cf629a4eb35dea0b1bc382a570c4e82

See more details on using hashes here.

File details

Details for the file polars_array_algorithms-0.1.1-cp38-abi3-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for polars_array_algorithms-0.1.1-cp38-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 c0c2f27d45d0aa03b4315cc1f9086d313e0b3fddef411ff56b506f712b1f09be
MD5 731a6c98016d93ab157901b0b398ec79
BLAKE2b-256 c995175f8e753c58fff55d032d78a6adac9c071e5c51ecdce825c138a599450e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page