Mining graphs with Subgroup Discovery
GraphSD
GraphSD (Graph-based Subgroup Discovery) is a Python package for detecting exceptional interaction patterns in graphs. It builds spatio-temporal graphs from position and attribute data, then applies rule-based subgroup discovery and outlier detection techniques to uncover meaningful and rare behaviors.
✨ Features
- Directed and multi-directed interaction graph construction
- Subgroup discovery using interpretable rule-based conditions
- Outlier detection and quality-based ranking
- Spatio-temporal interaction filtering using distance and velocity
- Binning and discretization utilities
- Built-in graph visualizations with pattern overlays
- Pure Python: no dependency on Orange3 or external mining engines
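To make the distance and velocity filtering concrete, here is a minimal standard-library sketch of one way directed interactions could be counted from position tracks. It is not GraphSD's implementation: the function `directed_interactions`, the `tracks` layout, the `max_distance` and `min_speed` parameters, and the rule "a interacts with b when a is moving and ends up closer to, and within range of, b" are all assumptions made for illustration.

```python
import math
from collections import Counter


def directed_interactions(tracks, max_distance=1.0, min_speed=0.1):
    """Count directed interactions a -> b (hypothetical rule, see lead-in).

    tracks maps an entity id to a list of (t, x, y) samples; all entities
    are assumed to be sampled at the same time steps.
    """
    counts = Counter()
    ids = sorted(tracks)
    n_steps = min(len(tracks[i]) for i in ids)
    for k in range(1, n_steps):
        for a in ids:
            t0, ax0, ay0 = tracks[a][k - 1]
            t1, ax1, ay1 = tracks[a][k]
            speed = math.hypot(ax1 - ax0, ay1 - ay0) / (t1 - t0)
            if speed < min_speed:
                continue  # a barely moved, so it initiates nothing
            for b in ids:
                if b == a:
                    continue
                _, bx0, by0 = tracks[b][k - 1]
                before = math.hypot(ax0 - bx0, ay0 - by0)
                after = math.hypot(ax1 - bx0, ay1 - by0)
                # a approached b and is now within the distance threshold
                if after <= max_distance and after < before:
                    counts[(a, b)] += 1
    return counts


# Toy data: "ann" walks toward a stationary "bob".
tracks = {
    "ann": [(0, 0.0, 0.0), (1, 1.0, 0.0), (2, 2.0, 0.0)],
    "bob": [(0, 2.5, 0.0), (1, 2.5, 0.0), (2, 2.5, 0.0)],
}
edges = directed_interactions(tracks, max_distance=1.0, min_speed=0.5)
print(dict(edges))
```

The resulting counts can serve as edge weights of a directed graph, with one edge per ordered pair of entities.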
📦 Installation
Install via PyPI:
pip install graph-sd
🚀 Example Usage
from graphsd.mining import DigraphSDMining
from graphsd.utils import make_bins
from graphsd._base import load_data
from graphsd.viz import graph_viz
import networkx as nx
# Load sample position and social data
position_df, social_df = load_data("playground_a")
# Discretize social attributes
social_df = make_bins(social_df)
# Initialize the subgroup discovery engine
dig = DigraphSDMining(random_state=42)
# Build the interaction graph using position and attribute data
dig.read_data(position_df, social_df, time_step=10)
# Discover subgroups with quality constraints
subgroups = dig.subgroup_discovery(
    mode="to",
    min_support=0.2,
    metric="mean",
    quality_measure="global_proportion"
)
# Convert to a DataFrame and print
df = dig.to_dataframe(subgroups)
print(df)
# Visualize the graph and highlighted subgroups
graph_viz(dig.graph, layout=nx.spring_layout)
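For readers new to subgroup discovery, the sketch below shows the general idea behind a support threshold and a mean-based score, using only the standard library. It is an illustration under stated assumptions, not GraphSD's algorithm: the function `discover_subgroups`, the edge attribute names (`gender_to`, `age_to`, `weight`), and the quality score (subgroup mean minus global mean) are hypothetical, and the score is not the package's `"global_proportion"` measure.

```python
from statistics import mean


def discover_subgroups(edges, target, min_support=0.2):
    """Enumerate single-condition rules over edge attributes.

    Each rule has the form "attribute = value". A rule is kept when it
    covers at least min_support of all edges, and it is scored by how far
    the mean target value of the covered edges lies from the global mean.
    """
    attributes = [key for key in edges[0] if key != target]
    global_mean = mean(edge[target] for edge in edges)
    results = []
    for attr in attributes:
        for value in sorted({edge[attr] for edge in edges}):
            covered = [edge[target] for edge in edges if edge[attr] == value]
            support = len(covered) / len(edges)
            if support < min_support:
                continue  # too few edges to count as a subgroup
            quality = mean(covered) - global_mean
            results.append((f"{attr} = {value}", support, quality))
    # Most exceptional subgroups (largest absolute deviation) first
    return sorted(results, key=lambda r: abs(r[2]), reverse=True)


# Toy edge list: attributes describe the receiving node of each interaction.
edges = [
    {"gender_to": "F", "age_to": "young", "weight": 4},
    {"gender_to": "F", "age_to": "old", "weight": 6},
    {"gender_to": "M", "age_to": "young", "weight": 1},
    {"gender_to": "M", "age_to": "young", "weight": 1},
]
results = discover_subgroups(edges, target="weight", min_support=0.3)
for rule, support, quality in results:
    print(rule, support, quality)
```

With `min_support=0.3`, the rule `age_to = old` is dropped because it covers only one of the four edges, even though its mean weight deviates most from the global mean.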
🧠 Code Structure
| Module | Purpose |
|---|---|
| mining.py | Main API for graph construction and subgroup discovery |
| patterns.py | Logic for rule quality, coverage, and pattern filters |
| outlier.py | Tools for subgroup scoring and ranking |
| utils.py | Preprocessing, binning, and distance computations |
| viz.py | Graph and subgroup visualizations |
| _base.py | Sample data loader (e.g. load_data("playground_a")) |
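Binning turns numeric attributes into categorical labels so that rule conditions of the form "attribute = value" can be built over them. The sketch below shows plain equal-width binning with the standard library. It is only an illustration of the concept: `equal_width_bins` and the `bin_<i>` labels are hypothetical, and the strategy actually used by `make_bins` in utils.py may differ.

```python
def equal_width_bins(values, n_bins=3):
    """Map each numeric value to one of n_bins equal-width interval labels."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values identical: a single bin is enough
        return ["bin_0"] * len(values)
    width = (hi - lo) / n_bins
    labels = []
    for v in values:
        # Clamp so the maximum value falls into the last bin
        index = min(int((v - lo) / width), n_bins - 1)
        labels.append(f"bin_{index}")
    return labels


ages = [4, 5, 6, 7, 8, 10]
labels = equal_width_bins(ages, n_bins=3)
print(labels)
```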
📄 License
This project is licensed under the BSD 3-Clause License.
👥 Authors
- Carolina Centeio Jorge – TU Delft
- Cláudio Rebelo de Sá – Leiden University
📚 Citation
If you use GraphSD in your research, please cite:
📝 Journal Article (Expert Systems, 2023)
Jorge, C.C., Atzmueller, M., Heravi, B.M., Gibson, J.L., Rossetti, R.J.F., & Rebelo de Sá, C.
"Want to come play with me?" Outlier subgroup discovery on spatio-temporal interactions.
Expert Systems, 40(5), 2023.
https://doi.org/10.1111/exsy.12686
@article{DBLP:journals/es/JorgeAHGRS23,
  author  = {Carolina Centeio Jorge and Martin Atzmueller and Behzad Momahed Heravi and
             Jenny L. Gibson and Rosaldo J. F. Rossetti and Cl{\'a}udio Rebelo de S{\'a}},
  title   = {"Want to come play with me?" Outlier subgroup discovery on spatio-temporal interactions},
  journal = {Expert Syst. J. Knowl. Eng.},
  volume  = {40},
  number  = {5},
  year    = {2023},
  doi     = {10.1111/EXSY.12686}
}
📘 Conference Paper (EPIA 2019)
Jorge, C.C., Atzmueller, M., Heravi, B.M., Gibson, J.L., Rebelo de Sá, C., & Rossetti, R.J.F.
Mining Exceptional Social Behaviour. In EPIA 2019, LNCS 11805, Springer.
https://doi.org/10.1007/978-3-030-30244-3_38
@inproceedings{DBLP:conf/epia/JorgeAHGSR19,
  author    = {Carolina Centeio Jorge and Martin Atzmueller and Behzad Momahed Heravi and
               Jenny L. Gibson and Cl{\'a}udio Rebelo de S{\'a} and Rosaldo J. F. Rossetti},
  title     = {Mining Exceptional Social Behaviour},
  booktitle = {Progress in Artificial Intelligence - 19th EPIA 2019},
  series    = {Lecture Notes in Computer Science},
  volume    = {11805},
  pages     = {460--472},
  publisher = {Springer},
  year      = {2019},
  doi       = {10.1007/978-3-030-30244-3_38}
}