High-performance graph analysis and pattern mining extension for Polars
PolarsGrouper
PolarsGrouper is a Rust-based extension for Polars that provides efficient graph analysis capabilities, with a focus on component grouping and network analysis.
Core Features
Component Grouping
- super_merger: Easy-to-use wrapper for grouping connected components
- super_merger_weighted: Component grouping with weight thresholds
- Efficient implementation using Rust and Polars
- Works with both eager and lazy Polars DataFrames
Additional Graph Analytics
- Shortest Path Analysis: Find shortest paths between nodes
- PageRank: Calculate node importance scores
- Betweenness Centrality: Identify key bridge nodes
- Association Rules: Discover item relationships and patterns
Installation
pip install polars-grouper
# For development:
python -m venv .venv
source .venv/bin/activate
maturin develop
Usage Examples
Basic Component Grouping
The core functionality uses super_merger to identify connected components:
import polars as pl
from polars_grouper import super_merger

# Edge list: each row links the node in "from" to the node in "to"
df = pl.DataFrame({
    "from": ["A", "B", "C", "D", "E", "F"],
    "to": ["B", "C", "A", "E", "F", "D"],
    "value": [1, 2, 3, 4, 5, 6],
})

result = super_merger(df, "from", "to")
print(result)
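To make the idea of connected components concrete, here is a minimal pure-Python sketch using union-find on the same edge list. It is an illustration only, not the library's Rust implementation; the function name `connected_components` and the integer group labels are assumptions made for this example, and the output format of super_merger may differ.

```python
def connected_components(edges):
    """Label each node with a component id using union-find."""
    parent = {}

    def find(node):
        # Register unseen nodes as their own root
        parent.setdefault(node, node)
        # Walk up to the root, halving the path along the way
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for src, dst in edges:
        root_src, root_dst = find(src), find(dst)
        if root_src != root_dst:
            parent[root_dst] = root_src

    # Number components in order of first appearance
    labels = {}
    return {node: labels.setdefault(find(node), len(labels)) for node in parent}


edges = list(zip(["A", "B", "C", "D", "E", "F"],
                 ["B", "C", "A", "E", "F", "D"]))
groups = connected_components(edges)
print(groups)
```

In this sketch the nodes A, B, C end up in one group and D, E, F in another, because no edge connects the two triangles.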
Weighted Component Grouping
For cases where edge weights matter:
import polars as pl
from polars_grouper import super_merger_weighted

df = pl.DataFrame({
    "from": ["A", "B", "C", "D", "E"],
    "to": ["B", "C", "D", "E", "A"],
    "weight": [0.9, 0.2, 0.05, 0.8, 0.3],
})

result = super_merger_weighted(
    df,
    "from",
    "to",
    "weight",
    weight_threshold=0.3,
)
print(result)
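One plausible reading of a weight threshold is that weak edges are ignored before components are formed. The pure-Python sketch below follows that reading and assumes an edge is kept when its weight is greater than or equal to the threshold; whether super_merger_weighted treats the boundary the same way is not stated here, so check the library before relying on it. The function name `weighted_components` is hypothetical.

```python
def weighted_components(edges, weight_threshold):
    """Group nodes using only edges with weight >= weight_threshold (assumed rule)."""
    adjacency = {}
    for src, dst, weight in edges:
        # Register both endpoints so isolated nodes still get a group
        adjacency.setdefault(src, set())
        adjacency.setdefault(dst, set())
        if weight >= weight_threshold:
            adjacency[src].add(dst)
            adjacency[dst].add(src)

    group_of = {}
    next_group = 0
    for start in adjacency:
        if start in group_of:
            continue
        # Depth-first flood fill from every node not yet labelled
        group_of[start] = next_group
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in group_of:
                    group_of[neighbour] = next_group
                    stack.append(neighbour)
        next_group += 1
    return group_of


edges = list(zip(["A", "B", "C", "D", "E"],
                 ["B", "C", "D", "E", "A"],
                 [0.9, 0.2, 0.05, 0.8, 0.3]))
groups = weighted_components(edges, weight_threshold=0.3)
print(groups)
```

Under this assumption the edges B-C (0.2) and C-D (0.05) are dropped, so C is left in a group of its own while A, B, D, and E stay connected.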
Additional Graph Analytics
Shortest Path Analysis
Find shortest paths between nodes:
import polars as pl
from polars_grouper import calculate_shortest_path

df = pl.DataFrame({
    "from": ["A", "A", "B", "C"],
    "to": ["B", "C", "C", "D"],
    "weight": [1.0, 2.0, 1.0, 1.5],
})

# The expression returns a struct column; unnest expands it into columns
paths = df.select(
    calculate_shortest_path(
        pl.col("from"),
        pl.col("to"),
        pl.col("weight"),
        directed=False,
    ).alias("paths")
).unnest("paths")
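For readers who want to sanity-check results on a small graph, the following pure-Python sketch computes single-source shortest distances with Dijkstra's algorithm on the same edges. It is not the library's implementation, and the algorithm used by calculate_shortest_path is not stated here; `shortest_distances` is a hypothetical helper name.

```python
import heapq
from collections import defaultdict


def shortest_distances(edges, source, directed=False):
    """Dijkstra's algorithm over a weighted edge list (non-negative weights)."""
    graph = defaultdict(list)
    for src, dst, weight in edges:
        graph[src].append((dst, weight))
        if not directed:
            graph[dst].append((src, weight))

    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return best


edges = list(zip(["A", "A", "B", "C"],
                 ["B", "C", "C", "D"],
                 [1.0, 2.0, 1.0, 1.5]))
distances = shortest_distances(edges, "A")
print(distances)
```

From A, the sketch finds distance 1.0 to B, 2.0 to C, and 3.5 to D (via C).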
PageRank Calculation
Calculate node importance:
import polars as pl
from polars_grouper import page_rank

df = pl.DataFrame({
    "from": ["A", "A", "B", "C", "D"],
    "to": ["B", "C", "C", "A", "B"],
})

rankings = df.select(
    page_rank(
        pl.col("from"),
        pl.col("to"),
        damping_factor=0.85,
    ).alias("pagerank")
).unnest("pagerank")
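As a reference point for what a damping factor does, here is a minimal pure-Python power-iteration sketch of PageRank on the same edges. It is illustrative only: the fixed iteration count, the handling of dangling nodes, and the name `page_rank_scores` are assumptions of this example, and the library's scores may be scaled or computed differently.

```python
def page_rank_scores(edges, damping_factor=0.85, iterations=100):
    """Power-iteration PageRank; scores sum to 1."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping_factor + damping_factor * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for src, targets in out_links.items():
            for dst in targets:
                # Each node splits its damped rank equally among its targets
                new_rank[dst] += damping_factor * rank[src] / len(targets)
        rank = new_rank
    return rank


edges = list(zip(["A", "A", "B", "C", "D"],
                 ["B", "C", "C", "A", "B"]))
ranks = page_rank_scores(edges)
print(ranks)
```

In this sketch D has no incoming links, so it keeps only the teleport share (1 - 0.85) / 4, while C collects rank from both A and B and comes out highest.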
Association Rule Mining
Discover item relationships:
import polars as pl
from polars_grouper import graph_association_rules

# One row per (transaction, item) pair
transactions = pl.DataFrame({
    "transaction_id": [1, 1, 1, 2, 2, 3],
    "item_id": ["A", "B", "C", "B", "D", "A"],
    "frequency": [1, 2, 1, 1, 1, 1],
})

rules = transactions.select(
    graph_association_rules(
        pl.col("transaction_id"),
        pl.col("item_id"),
        pl.col("frequency"),
        min_support=0.1,
    ).alias("rules")
).unnest("rules")
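To show what support and confidence mean for a rule, the sketch below computes the textbook definitions for item pairs in pure Python. It ignores the frequency column and is not a description of how graph_association_rules works internally; the function name `pair_rules` and the output field names are assumptions for illustration.

```python
from collections import defaultdict
from itertools import permutations


def pair_rules(rows, min_support=0.1):
    """Textbook support and confidence for rules of the form X -> Y."""
    baskets = defaultdict(set)
    for transaction_id, item_id in rows:
        baskets[transaction_id].add(item_id)

    total = len(baskets)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in baskets.values():
        for item in items:
            item_count[item] += 1
        # Ordered pairs, so A -> B and B -> A are separate rules
        for antecedent, consequent in permutations(sorted(items), 2):
            pair_count[(antecedent, consequent)] += 1

    rules = []
    for (antecedent, consequent), count in sorted(pair_count.items()):
        support = count / total  # share of baskets containing both items
        if support >= min_support:
            rules.append({
                "antecedent": antecedent,
                "consequent": consequent,
                "support": support,
                "confidence": count / item_count[antecedent],
            })
    return rules


rows = list(zip([1, 1, 1, 2, 2, 3], ["A", "B", "C", "B", "D", "A"]))
rules = pair_rules(rows)
for rule in rules:
    print(rule)
```

Here the rule D -> B has confidence 1.0 because every basket containing D also contains B, while A -> B has confidence 0.5 because only one of the two baskets with A also has B.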
Betweenness Centrality
Identify bridge nodes:
import polars as pl
from polars_grouper import betweenness_centrality

df = pl.DataFrame({
    "from": ["A", "A", "B", "C", "D", "E"],
    "to": ["B", "C", "C", "D", "E", "A"],
})

centrality = df.select(
    betweenness_centrality(
        pl.col("from"),
        pl.col("to"),
        normalized=True,
    ).alias("centrality")
).unnest("centrality")
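For intuition about what makes a node a bridge, this pure-Python sketch applies Brandes' algorithm to the same edges. It treats the graph as undirected and unweighted and normalizes by the number of node pairs that exclude the node itself; those choices, and the name `betweenness`, are assumptions of this example rather than documented behaviour of betweenness_centrality.

```python
from collections import defaultdict, deque


def betweenness(edges, normalized=True):
    """Brandes' algorithm for an undirected, unweighted graph."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)
    nodes = sorted(adjacency)

    score = {node: 0.0 for node in nodes}
    for source in nodes:
        # Breadth-first search that counts shortest paths (sigma)
        dist = {source: 0}
        sigma = defaultdict(float)
        sigma[source] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes inward
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]

    n = len(nodes)
    # Every undirected pair was visited from both endpoints
    scale = 0.5
    if normalized and n > 2:
        scale = 1.0 / ((n - 1) * (n - 2))
    return {node: value * scale for node, value in score.items()}


edges = list(zip(["A", "A", "B", "C", "D", "E"],
                 ["B", "C", "C", "D", "E", "A"]))
centrality = betweenness(edges)
print(centrality)
```

In this sketch B scores 0 because its two neighbours, A and C, are directly connected, so no shortest path needs to pass through B; A and C score highest at 0.25.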
Performance
The library is implemented in Rust for high performance:
- Efficient memory usage
- Fast computation for large graphs
- Seamless integration with Polars' lazy evaluation
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
This project is licensed under the MIT License - see the LICENSE file for details.