High-performance graph analysis and pattern mining extension for Polars

PolarsGrouper

PolarsGrouper is a Rust-based extension for Polars that provides efficient graph analysis capabilities, with a focus on component grouping and network analysis.

Core Features

Component Grouping

  • super_merger: Easy-to-use wrapper for grouping connected components
  • super_merger_weighted: Component grouping with weight thresholds
  • Efficient implementation using Rust and Polars
  • Works with both eager and lazy Polars DataFrames

Additional Graph Analytics

  • Shortest Path Analysis: Find shortest paths between nodes
  • PageRank: Calculate node importance scores
  • Betweenness Centrality: Identify key bridge nodes
  • Association Rules: Discover item relationships and patterns

Installation

pip install polars-grouper

# For development (from a clone of the repository):
python -m venv .venv
source .venv/bin/activate
pip install maturin
maturin develop

Usage Examples

Basic Component Grouping

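The core function, super_merger, takes a DataFrame plus the names of the two edge columns and identifies connected components. In the example below, the edges link A, B, and C into one component and D, E, and F into another: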

import polars as pl
from polars_grouper import super_merger

df = pl.DataFrame({
    "from": ["A", "B", "C", "D", "E", "F"],
    "to": ["B", "C", "A", "E", "F", "D"],
    "value": [1, 2, 3, 4, 5, 6]
})

result = super_merger(df, "from", "to")
print(result)
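
To make "connected components" concrete, here is a minimal pure-Python sketch of the idea using a union-find structure on the same edge list. It is an illustration only: the helper name connected_components is hypothetical, and this is not PolarsGrouper's Rust implementation or its output format.

```python
from collections import defaultdict


def connected_components(edges):
    """Group nodes into connected components with a union-find structure."""
    parent = {}

    def find(node):
        # Register unseen nodes as their own root, then walk up to the root.
        parent.setdefault(node, node)
        root = node
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every visited node straight at the root.
        while parent[node] != root:
            parent[node], node = root, parent[node]
        return root

    for src, dst in edges:
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            parent[root_dst] = root_src  # merge the two groups

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


# Same edges as the DataFrame above ("from" and "to" columns).
edges = list(zip(["A", "B", "C", "D", "E", "F"], ["B", "C", "A", "E", "F", "D"]))
components = connected_components(edges)
print(components)  # [['A', 'B', 'C'], ['D', 'E', 'F']]
```

Every row whose endpoints fall in the same group belongs to the same component, which is the kind of grouping super_merger is described as producing.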

Weighted Component Grouping

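When edge weights matter, super_merger_weighted also takes a weight column and a weight_threshold that controls which edges count toward a component: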

import polars as pl
from polars_grouper import super_merger_weighted

df = pl.DataFrame({
    "from": ["A", "B", "C", "D", "E"],
    "to": ["B", "C", "D", "E", "A"],
    "weight": [0.9, 0.2, 0.05, 0.8, 0.3]
})

result = super_merger_weighted(
    df, 
    "from", 
    "to", 
    "weight",
    weight_threshold=0.3
)
print(result)
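
The sketch below shows one plausible reading of threshold-based grouping in plain Python: drop weak edges, then find the components of what remains. It assumes that an edge is kept when its weight is greater than or equal to the threshold. The README does not state the comparison the library uses, so treat both that rule and the helper name weighted_components as assumptions.

```python
def weighted_components(edges, weight_threshold):
    """Connected components after discarding edges below the threshold.

    ASSUMPTION: an edge is kept when weight >= weight_threshold.
    The library's actual comparison may differ.
    """
    adjacency = {}
    for src, dst, weight in edges:
        # Register both endpoints so isolated nodes still form a component.
        adjacency.setdefault(src, set())
        adjacency.setdefault(dst, set())
        if weight >= weight_threshold:
            adjacency[src].add(dst)
            adjacency[dst].add(src)

    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first traversal from an unvisited node collects one component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return sorted(components)


# Same edges and weights as the DataFrame above.
weighted_edges = [
    ("A", "B", 0.9),
    ("B", "C", 0.2),
    ("C", "D", 0.05),
    ("D", "E", 0.8),
    ("E", "A", 0.3),
]
groups = weighted_components(weighted_edges, weight_threshold=0.3)
print(groups)  # [['A', 'B', 'D', 'E'], ['C']]
```

Under this assumption the B-C and C-D edges are dropped, so C ends up alone. With a strict greater-than rule the E-A edge would also be dropped and the result would split further.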

Additional Graph Analytics

Shortest Path Analysis

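Find shortest paths between nodes. calculate_shortest_path is used as an expression inside select; the result is a struct column, which unnest expands into regular columns: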

import polars as pl
from polars_grouper import calculate_shortest_path

df = pl.DataFrame({
    "from": ["A", "A", "B", "C"],
    "to": ["B", "C", "C", "D"],
    "weight": [1.0, 2.0, 1.0, 1.5]
})

paths = df.select(
    calculate_shortest_path(
        pl.col("from"),
        pl.col("to"),
        pl.col("weight"),
        directed=False
    ).alias("paths")
).unnest("paths")
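
For readers who want to check results by hand, the following is a minimal sketch of weighted shortest-path distances using Dijkstra's algorithm on the same edges. The README does not say which algorithm the library uses or what columns it returns; the function shortest_distances and its single-source interface are illustrative assumptions.

```python
import heapq


def shortest_distances(edges, source, directed=False):
    """Dijkstra's algorithm: shortest distance from source to each reachable node."""
    graph = {}
    for src, dst, weight in edges:
        graph.setdefault(src, []).append((dst, weight))
        graph.setdefault(dst, [])
        if not directed:
            # Undirected graphs can be traversed in both directions.
            graph[dst].append((src, weight))

    distances = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > distances.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, weight in graph.get(node, []):
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distances


# Same edges and weights as the DataFrame above.
path_edges = [("A", "B", 1.0), ("A", "C", 2.0), ("B", "C", 1.0), ("C", "D", 1.5)]
from_a = shortest_distances(path_edges, "A", directed=False)
print(from_a)  # {'A': 0.0, 'B': 1.0, 'C': 2.0, 'D': 3.5}
```

Dijkstra's algorithm requires non-negative weights, which holds for the example data.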

PageRank Calculation

Calculate node importance:

import polars as pl
from polars_grouper import page_rank

df = pl.DataFrame({
    "from": ["A", "A", "B", "C", "D"],
    "to": ["B", "C", "C", "A", "B"]
})

rankings = df.select(
    page_rank(
        pl.col("from"),
        pl.col("to"),
        damping_factor=0.85
    ).alias("pagerank")
).unnest("pagerank")
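
As a reference for what damping_factor means, here is a short power-iteration sketch of PageRank in plain Python on the same edges. It is a textbook formulation, not the library's code; the name pagerank_sketch, the iteration cap, and the tolerance are assumptions chosen for illustration.

```python
def pagerank_sketch(edges, damping_factor=0.85, max_iter=200, tol=1e-12):
    """PageRank by power iteration on a directed edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping_factor) / n + damping_factor * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                # Each node splits its damped rank equally among its out-links.
                new_rank[dst] += damping_factor * rank[src] / len(targets)
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Same edges as the DataFrame above.
rank_edges = list(zip(["A", "A", "B", "C", "D"], ["B", "C", "C", "A", "B"]))
ranks = pagerank_sketch(rank_edges, damping_factor=0.85)
for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(node, round(score, 4))
```

In this graph nothing links to D, so its score is only the teleport share (1 - 0.85) / 4, while C collects rank from both A and B and comes out on top.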

Association Rule Mining

Discover item relationships:

import polars as pl
from polars_grouper import graph_association_rules

transactions = pl.DataFrame({
    "transaction_id": [1, 1, 1, 2, 2, 3],
    "item_id": ["A", "B", "C", "B", "D", "A"],
    "frequency": [1, 2, 1, 1, 1, 1]
})

rules = transactions.select(
    graph_association_rules(
        pl.col("transaction_id"),
        pl.col("item_id"),
        pl.col("frequency"),
        min_support=0.1
    ).alias("rules")
).unnest("rules")
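
The README does not define how graph_association_rules scores a rule or how it uses the frequency column. As a point of comparison, the sketch below computes the classic support and confidence measures for item pairs, ignoring frequency. The function pairwise_rules and its output fields are hypothetical and may not match the library's columns.

```python
from itertools import permutations


def pairwise_rules(rows, min_support=0.1):
    """Classic support/confidence for single-item rules "a -> b"."""
    # Collect the set of items seen in each transaction.
    baskets = {}
    for transaction_id, item in rows:
        baskets.setdefault(transaction_id, set()).add(item)

    total = len(baskets)
    item_count, pair_count = {}, {}
    for items in baskets.values():
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for a, b in permutations(sorted(items), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), count in sorted(pair_count.items()):
        support = count / total  # share of transactions containing both items
        if support >= min_support:
            rules.append({
                "antecedent": a,
                "consequent": b,
                "support": support,
                "confidence": count / item_count[a],  # P(b | a)
            })
    return rules


# Same transaction_id and item_id values as the DataFrame above.
rows = list(zip([1, 1, 1, 2, 2, 3], ["A", "B", "C", "B", "D", "A"]))
rules = pairwise_rules(rows, min_support=0.1)
for rule in rules:
    print(rule)
```

For example, C appears only in transaction 1 together with B, so the rule C -> B has confidence 1.0, while A -> B has confidence 0.5 because A also appears alone in transaction 3.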

Betweenness Centrality

Identify bridge nodes:

import polars as pl
from polars_grouper import betweenness_centrality

df = pl.DataFrame({
    "from": ["A", "A", "B", "C", "D", "E"],
    "to": ["B", "C", "C", "D", "E", "A"]
})

centrality = df.select(
    betweenness_centrality(
        pl.col("from"),
        pl.col("to"),
        normalized=True
    ).alias("centrality")
).unnest("centrality")
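
Betweenness centrality counts how often a node lies on shortest paths between other nodes. The sketch below implements Brandes' algorithm for an unweighted graph, treating the edges as undirected, which is an assumption since the README does not say how the library interprets direction. The name betweenness_sketch is hypothetical, and the normalization shown is the common (n - 1)(n - 2) / 2 convention, which may differ from the library's.

```python
from collections import deque


def betweenness_sketch(edges, normalized=True):
    """Brandes' algorithm on an unweighted, undirected graph."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set()).add(src)

    centrality = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        predecessors = {node: [] for node in graph}
        path_count = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        path_count[source], distance[source] = 1, 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in graph[node]:
                if distance[neighbor] < 0:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[node] + 1:
                    path_count[neighbor] += path_count[node]
                    predecessors[neighbor].append(node)

        # Walk back from the farthest nodes, accumulating path dependencies.
        dependency = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1.0 + dependency[node])
            if node != source:
                centrality[node] += dependency[node]

    # Each undirected pair was counted from both endpoints, so halve the totals.
    scale = 0.5
    n = len(graph)
    if normalized and n > 2:
        scale /= (n - 1) * (n - 2) / 2
    return {node: value * scale for node, value in centrality.items()}


# Same edges as the DataFrame above.
bridge_edges = list(zip(["A", "A", "B", "C", "D", "E"], ["B", "C", "C", "D", "E", "A"]))
raw_scores = betweenness_sketch(bridge_edges, normalized=False)
print(raw_scores)
```

In this graph B scores zero because its two neighbors, A and C, are directly connected, so no shortest path needs to pass through B. A and C score highest because they sit between B and the rest of the ring.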

Performance

The library is implemented in Rust for high performance:

  • Efficient memory usage
  • Fast computation for large graphs
  • Seamless integration with Polars' lazy evaluation

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
