Community detection via Louvain/Leiden + Genetic Algorithm


TAU Community Detection


tau-community-detection implements TAU, an evolutionary community detection algorithm that couples genetic search with Leiden refinements. It is designed for scalable graph clustering with configurable hyper-parameters and multiprocessing support.


Highlights

  • Evolutionary search: Maintains a population of candidate partitions and applies crossover/mutation tailored for graph clustering.
  • Leiden optimization: Refines every candidate with Leiden to ensure modularity gains.
  • Multiprocessing aware: Uses worker pools to parallelize population optimization.
  • Deterministic options: Accepts a user-specified random seed for reproducibility.
  • Simple API: Access everything through the TauClustering class.
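
The evolutionary loop described above can be sketched in miniature. This is an illustrative sketch only: the names `fitness`, `crossover`, `mutate`, and `new_individual` are placeholders, not TAU's actual internals.

```python
import random

def evolve(population, fitness, crossover, mutate,
           elite_fraction=0.1, immigrant_fraction=0.1,
           new_individual=None, rng=None):
    """One generation: keep elites, inject random immigrants, breed the rest.

    Assumes len(population) >= 4 so that two parents can be sampled.
    """
    rng = rng or random.Random(0)
    ranked = sorted(population, key=fitness, reverse=True)
    n = len(population)
    n_elite = max(1, int(n * elite_fraction))
    n_immigrants = int(n * immigrant_fraction)
    next_gen = ranked[:n_elite]                                # elitism
    next_gen += [new_individual() for _ in range(n_immigrants)]  # immigrants
    while len(next_gen) < n:                                   # crossover + mutation
        a, b = rng.sample(ranked[: n // 2], 2)                 # pick fit parents
        next_gen.append(mutate(crossover(a, b)))
    return next_gen
```

In TAU the individuals are graph partitions and each candidate is additionally refined with Leiden; the skeleton of selection pressure (elites, immigrants, crossover) is the same.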

Installation

The project targets Python 3.10 or newer.

pip install tau-community-detection

To work from a clone, install the package in editable mode inside a virtual environment:

git clone https://github.com/HillelCharbit/community_TAU.git
cd community_TAU
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install -e .

Quick Start (Python API)

from tau_community_detection import TauClustering
import networkx as nx

graph = nx.read_adjlist("path/to/graph.adjlist")

clustering = TauClustering(
    graph,
    population_size=80,
    max_generations=250,
)
vertex_clustering = clustering.run()

print("community for node 0:", vertex_clustering.membership[0])
print("best modularity:", vertex_clustering.modularity)
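
The modularity value reported above is the standard Newman quantity. For intuition, here is a self-contained toy re-implementation for unweighted, undirected graphs (not the library's code path, which delegates to igraph):

```python
from collections import defaultdict

def modularity(edges, membership):
    """Newman modularity: Q = sum_c (e_c / m - (d_c / 2m)^2).

    edges: iterable of (u, v) pairs; membership: dict node -> community id.
    e_c is the number of intra-community edges of community c, d_c its total
    degree, and m the total number of edges.
    """
    m = 0
    intra = defaultdict(int)   # e_c per community
    degree = defaultdict(int)  # d_c per community
    for u, v in edges:
        m += 1
        degree[membership[u]] += 1
        degree[membership[v]] += 1
        if membership[u] == membership[v]:
            intra[membership[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)
```

For example, two triangles joined by a single bridge edge, with each triangle as one community, give Q = 5/14 ≈ 0.357; putting all nodes in one community gives Q = 0.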

Need detailed per-generation metrics? Call run(track_stats=True) to receive (vertex_clustering, generation_stats) where generation_stats is a list of dictionaries containing runtime and fitness diagnostics (including per-generation modularity).
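
The stats list can be post-processed with plain Python. A minimal sketch follows; the key name "modularity" is an assumption based on the description above, so adjust it to the actual keys emitted by your version:

```python
def best_generation(generation_stats, key="modularity"):
    """Return (best_value, generation_index) for a per-generation metric.

    generation_stats is the list of per-generation dicts returned alongside
    the clustering by run(track_stats=True); `key` names the metric to scan
    (the key name here is an assumption, not a documented contract).
    """
    values = [(gen[key], i) for i, gen in enumerate(generation_stats) if key in gen]
    if not values:
        return None
    best_value, best_index = max(values)
    return best_value, best_index
```

This is handy for checking whether the search converged early or was still improving when max_generations was hit.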


Graph input

For very large graphs, or when running with many worker processes, pass a file path (e.g., to an .adjlist or edge-list file) to TauClustering rather than a pre-loaded graph object; loading from a path lets workers share the graph efficiently instead of each holding a duplicate copy.

Supported input:

  • File path to a graph in common NetworkX or igraph format (auto-detects weighting and structure).
  • Already-loaded networkx.Graph or igraph.Graph objects.

By default, the loader auto-detects whether the graph is weighted based on the file or graph structure. You can override this by setting TauConfig(is_weighted=True/False) when constructing TauClustering; if your override disagrees with the detected type, a warning is issued and auto-detection is used.

Examples:

from tau_community_detection import TauClustering

# Recommended for large graphs:
clustering = TauClustering("mygraph.graph", population_size=40, max_generations=30)

# In-memory NetworkX graph:
import networkx as nx
g = nx.read_adjlist("mygraph.adjlist")
clustering2 = TauClustering(g, population_size=40, max_generations=30)

# Force unweighted input (ignore/strip weights if present) via config:
from tau_community_detection import TauConfig
custom_config = TauConfig(is_weighted=False)
clustering3 = TauClustering("mygraph.graph", population_size=40, max_generations=30, config=custom_config)

For the complete list of accepted file formats, see the project documentation.


Configuration

All algorithm hyper-parameters live on the TauConfig dataclass. You can pass a custom configuration instance to TauClustering or adjust attributes on the default one. Key fields include:

  • population_size: number of partitions maintained per generation.
  • max_generations: upper bound on evolutionary iterations.
  • elite_fraction / immigrant_fraction: govern selection pressure.
  • stopping_generations / stopping_jaccard: convergence checks based on membership stability.
  • random_seed: makes runs reproducible across processes.

See src/tau_community_detection/config.py for the complete list.
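
The stopping_jaccard check is a stability criterion on memberships. A common, label-invariant way to compare two partitions is the Jaccard index over co-clustered node pairs; the sketch below illustrates the idea (it is quadratic in the node count and is not TAU's exact implementation):

```python
from itertools import combinations

def membership_jaccard(a, b):
    """Jaccard index over co-membership node pairs of two partitions.

    a and b are membership lists (community id per node, same node order).
    1.0 means the partitions agree exactly, regardless of how community
    labels are numbered; a convergence check can stop the search once this
    stays above a threshold for several consecutive generations.
    """
    pairs_a = {(i, j) for i, j in combinations(range(len(a)), 2) if a[i] == a[j]}
    pairs_b = {(i, j) for i, j in combinations(range(len(b)), 2) if b[i] == b[j]}
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 1.0
```

Because it compares node pairs rather than labels, relabeling communities (e.g., swapping ids 0 and 1) does not change the score.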


Development

pip install -r requirements-dev.txt
make lint
make test

To build local distributions:

make build

Continuous Integration

  • GitHub Actions run lint, tests, and package builds on pushes and pull requests.
  • Set the CODECOV_TOKEN secret to upload coverage reports.

Publishing

  1. Bump the version in setup.cfg/pyproject.toml and commit.
  2. Tag the release with git tag vX.Y.Z && git push --tags.
  3. Run the Publish Package workflow (defaults to TestPyPI). For PyPI, supply the pypi input and ensure PYPI_API_TOKEN is set. Use TEST_PYPI_API_TOKEN for dry runs.

License

Released under the MIT License. See LICENSE for details.

