Skip to main content

Disjoint-Set implementation with heuristic optimizations (union-by-rank and path-compression)

Project description

Python Version License

A Disjoint-Set implementation which makes use of the the “Union by Rank” and “Path Compression” heuristic optimizations while being as simple and effective as possible.

It support the following operations:

  • Find-Set in O(1) amortized running time

  • Union in O(1) amortized running time

Check end notes for additional information about running times

Why?

Sometimes, specially during algorithm competitions, a Union-Find data structure is needed to solve a certain problem (e.g. when implementing Kruskal’s MST algorithm) In those cases, we usually use whatever off-the-shelf implementation we happen to find in PyPI and, many times, we get the infamous submission fail Time Limit Exceed(TLE) because that specific Disjoint-Set implementation is not fast enough…

Worry not! A fast and simple enough (you can copy paste the code here) disjoint-set implementation is now available:

Installation

$ pip install dset

Usage

>>> import dset
>>> s1 = dset.Set()                 # To create new sets objects (it runs in O(1))
>>> s2 = dset.Set()
>>> s3 = dset.Set()
>>> dset.groups()                   # To check the number of different groups (it runs in O(1))
3
>>> dset.find(s1) == dset.find(s2)  # To check if two sets are in the same group
False
>>> dset.union(s1, s2)              # to group s1 and s2 sets together
>>> dset.groups()                   # Now there are only 2 groups: s1-s2 and s3
2
>>> dset.find(s1) == dset.find(s2)  # They are in the same group s1-s2
True
>>> dset.find(s2) == dset.find(s3)  # They are in distinct groups s1-s2 and s3
False
>>> dset.union(s2, s1)              # do nothing (no need to check if s1 and s2 belong to the same group)
>>> dset.groups()                   # Same as before only 2: s1-s2 and s3
2
>>> dset.union(s1, s3)              # group s1-s2 and s3 together.
>>> dset.groups()                   # Now there is only one group: s1-s2-s3
1
>>> dset.find(s1) == dset.find(s2)  # All the sets are in the same groups s1-s2-s3
... dset.find(s2) == dset.find(s3)
... dset.find(s1) == dset.find(s3)
True
True
True

Future improvements

  • Make official Sphinx Documentation. (minor releases)

  • Make test suite. (minor releases)

  • Change to a list-based implementation (major-minor release)
    • Improve running time (constant terms)

    • Improve memory space (constant terms)

  • Make the package C-based. (major-minor release)
    • Improve running time (constant terms)

Notes on Complexity Running Times

Separately, either “Union by Rank” or “Path Compression” improves the running time of the operations on disjoint-sets. Alone, “Union by Rank” yields a running time of O(m log n), where m is the number of Union and Find-Set operations and n is the number of sets.

The improvement is even greater when we use the two heuristics together; the worst-case running time is O(m f(n)), where f(n) is a very slowly growing function, the Inverse Ackermann function (e.g. for n equal to the number atoms in the universe, f(n) is only 4) So for any conceivable application, we can consider the running time of m Union + Find-Set operations to be O(m) and therefore both Union and Find-Set operations have O(1) amortized running time in practice.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dset-0.0.1.tar.gz (5.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dset-0.0.1-py3-none-any.whl (4.0 kB view details)

Uploaded Python 3

File details

Details for the file dset-0.0.1.tar.gz.

File metadata

  • Download URL: dset-0.0.1.tar.gz
  • Upload date:
  • Size: 5.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.34.0 CPython/3.7.2

File hashes

Hashes for dset-0.0.1.tar.gz
Algorithm Hash digest
SHA256 cd65b3260908a2218dd41252c361021aea7a5da8c72079b2b1672e54faf8e781
MD5 2b47fe2fec195c7b8e85d33cece7b7d3
BLAKE2b-256 5d645aeecb73379149f1a2e53e946d6083c1bed11f489597ca14aaafb3ba5242

See more details on using hashes here.

File details

Details for the file dset-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: dset-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 4.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.34.0 CPython/3.7.2

File hashes

Hashes for dset-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 fdca85c0ba033725006a9617383a26f1023230bb8bbca69f4ab25d824599b838
MD5 849984553bdca4cb095a9b41425c79a2
BLAKE2b-256 60a797c81486d66cd7ec47d413f128c66dd00dbc51bee32e6165e05310561703

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page