Skip to main content

Rust-accelerated array algorithms for Polars DataFrames.

Project description

Polars Array Algorithms

High-performance array algorithms for Polars, implemented in Rust with Python bindings.

Features

  • Sweep-line algorithm for interval scheduling and resource assignment
  • Optimized Rust implementation with PyO3 bindings
  • Full type hints and comprehensive tests
  • O(n log n) time complexity

Installation

pip install -e .

For optimized release build:

make install-release

Quick Start

import polars as pl
import polars_array_algorithms as pl_alg

df = pl.DataFrame({
    "start": [10, 20, 15, 30],
    "end": [25, 35, 28, 40],
})

result = df.select(
    "start", "end",
    room_id=pl_alg.sweep_line_assignment("start", "end", overlapping=False)
)
print(result)

Output:

shape: (4, 3)
┌───────┬─────┬─────────┐
│ start ┆ end ┆ room_id │
│ ---   ┆ --- ┆ ---     │
│ i64   ┆ i64 ┆ u32     │
╞═══════╪═════╪═════════╡
│ 10    ┆ 25  ┆ 1       │
│ 20    ┆ 35  ┆ 2       │
│ 15    ┆ 28  ┆ 2       │
│ 30    ┆ 40  ┆ 1       │
└───────┴─────┴─────────┘

API Reference

sweep_line_assignment(start, end, overlapping=False) -> Expr

Assigns the minimum number of resources to intervals using a sweep-line algorithm.

Parameters:

  • start: Start times (expression, column name, or Series)
  • end: End times (same type as start)
  • overlapping: If False (default), intervals [start, end) can share resources if endpoints touch. If True, intervals [start, end] need separate resources if endpoints touch.

Returns: UInt32 Series with resource IDs (1-indexed)

Supported types: Any type which can be physically represented as a signed or unsigned integer. (int8-int64, uint8-uint64, Date, Datetime)

Complexity: O(n log n) time, O(n) space

Example:

import polars as pl
import polars_array_algorithms as pl_alg

# Non-overlapping intervals
df = pl.DataFrame({
    "start": [10, 20],
    "end": [20, 30],
})
df.select(room=pl_alg.sweep_line_assignment("start", "end", overlapping=False))
# Returns room=[1, 1] - both intervals can share the same room

# Overlapping intervals
df.select(room=pl_alg.sweep_line_assignment("start", "end", overlapping=True))
# Returns room=[1, 2] - intervals at same endpoint need different rooms

Development

Setup

make venv
make install

Testing

Python

make test

Rust

cargo test --no-default-features

Note: these are the pure rust unit tests which do not require python bindings.

Code Quality

make pre-commit

Building

make install          # Debug build
make install-release  # Optimized build

Project Structure

src/
├── lib.rs              # PyO3 module definition
└── sweep_line.rs       # Algorithm implementation

polars_array_algorithms/
├── __init__.py         # Python API
├── typing.py           # Type definitions
└── _internal.pyi       # Type stubs

tests/                  # Pytest tests
Makefile                # Build commands

Algorithm Details

Sweep-Line Algorithm

Solves the interval scheduling problem by finding the minimum number of resources needed so no two overlapping intervals share the same resource.

How it works:

  1. Create events for interval starts and ends
  2. Sort events by time (departures before arrivals in non-overlapping mode)
  3. Process events, assigning lowest available resource ID or allocating new one
  4. Return resource to pool when interval ends

Tie-breaking:

  • Non-overlapping: departures at time T < arrivals at time T (allows reuse)
  • Overlapping: arrivals at time T < departures at time T (requires separate resources)

Performance

Metric Value
Time Complexity O(n log n)
Space Complexity O(n)
Optimality Always minimum resources

Requirements

  • Python >= 3.8
  • Polars >= 0.52.0
  • Rust (for building)

License

MIT License - see LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

polars_array_algorithms-0.1.2.tar.gz (39.2 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

polars_array_algorithms-0.1.2-cp38-abi3-win_amd64.whl (4.6 MB view details)

Uploaded CPython 3.8+Windows x86-64

polars_array_algorithms-0.1.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.3 MB view details)

Uploaded CPython 3.8+manylinux: glibc 2.17+ x86-64

polars_array_algorithms-0.1.2-cp38-abi3-macosx_11_0_arm64.whl (4.4 MB view details)

Uploaded CPython 3.8+macOS 11.0+ ARM64

polars_array_algorithms-0.1.2-cp38-abi3-macosx_10_12_x86_64.whl (4.6 MB view details)

Uploaded CPython 3.8+macOS 10.12+ x86-64

File details

Details for the file polars_array_algorithms-0.1.2.tar.gz.

File metadata

File hashes

Hashes for polars_array_algorithms-0.1.2.tar.gz
Algorithm Hash digest
SHA256 a4815815d8d718e790ad4a58eb8e1569549aa233d2d5827f3c5dfa73dc69d66a
MD5 7eab10355c9ef08e1e5048f66ce0d94b
BLAKE2b-256 515cc84aa3bcb815c405c8d467ab8ab4d6c80a7c5eedc8dd3f93ec3263b3a67e

See more details on using hashes here.

File details

Details for the file polars_array_algorithms-0.1.2-cp38-abi3-win_amd64.whl.

File metadata

File hashes

Hashes for polars_array_algorithms-0.1.2-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 62f3a134153088e1c46601ca90abbb610cb63c141d971fb9f746fc438ecea22d
MD5 9c7dd3fb36e7392c45cc7d86bc8c4d0c
BLAKE2b-256 94a67270a8c5e566502a3b4a42c8024dd487941bb9b75a4eb93bcbf93b759c23

See more details on using hashes here.

File details

Details for the file polars_array_algorithms-0.1.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for polars_array_algorithms-0.1.2-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 524cc7ef0d3291178a584bef076d87f329219b214622ebb4634d4ce3c0172ad8
MD5 ee890e6b6469a8cbb57b1118d870531b
BLAKE2b-256 e5cd15f64fadb2973b1807e6be27e9251115bbfcc7a1cd6a4d1794cc38b5245e

See more details on using hashes here.

File details

Details for the file polars_array_algorithms-0.1.2-cp38-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for polars_array_algorithms-0.1.2-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 f13a4bda520530408918b64c84459eb469587ddad641a152156e092f364a59bb
MD5 b933b3840b7b23df1a529e267100a3fa
BLAKE2b-256 71318e2b6102e4f76d598eed1b66ed65ee1134070006e4cb340b89d6c1cdee86

See more details on using hashes here.

File details

Details for the file polars_array_algorithms-0.1.2-cp38-abi3-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for polars_array_algorithms-0.1.2-cp38-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 dfbeb3f06e92942cfb5be9cd76c66c2f8472642899d535d093335b7ee38485db
MD5 e2dc6dbc58fd845501267cd86d3ad3d0
BLAKE2b-256 6f6950ef47bada6bd5b82200ff36c8eb852ce81a33d1424629a5b28122bcce4e

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page