Skip to main content

A Generalized Suffix Tree for any iterable, with Lowest Common Ancestor retrieval

Project description

py39 py310 py311 py312 pypy39 coverage

A Generalized Suffix Tree for any Python sequence, with Lowest Common Ancestor retrieval.

pip install suffix-tree
>>> from suffix_tree import Tree

>>> tree = Tree({"A": "xabxac"})
>>> tree.find("abx")
True
>>> tree.find("abc")
False

This suffix tree:

  • works with any Python sequence, not just strings, if the items are hashable,

  • is a generalized suffix tree for sets of sequences,

  • is implemented in pure Python,

  • builds the tree in time proportional to the length of the input,

  • does constant-time Lowest Common Ancestor retrieval.

Being implemented in Python this tree is not very fast nor memory efficient. The building of the tree takes time proportional to the length of the string of symbols. The query time is proportional to the length of the query string.

To get the best performance turn the python optimizer on: python -O.

Documentation: https://cceh.github.io/suffix-tree/

PyPi: https://pypi.org/project/suffix-tree/

Usage examples:

>>> from suffix_tree import Tree
>>> tree = Tree()
>>> tree.add(1, "xabxac")
>>> tree.add(2, "awyawxawxz")
>>> tree.find("abx")
True
>>> tree.find("awx")
True
>>> tree.find("abc")
False
>>> tree = Tree({"A": "xabxac", "B": "awyawxawxz"})
>>> tree.find_id("A", "abx")
True
>>> tree.find_id("B", "abx")
False
>>> tree.find_id("B", "awx")
True
>>> tree = Tree(
...     {
...         "A": "sandollar",
...         "B": "sandlot",
...         "C": "handler",
...         "D": "grand",
...         "E": "pantry",
...     }
... )
>>> for k, length, path in tree.common_substrings():
...     print(k, length, path)
...
2 4 s a n d
3 3 a n d
4 3 a n d
5 2 a n
>>> tree = Tree({"A": "xabxac", "B": "awyawxawxz"})
>>> for C, path in sorted(tree.maximal_repeats()):
...     print(C, path)
...
1 a w
1 a w x
2 a
2 x
2 x a

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

suffix_tree-0.1.2.tar.gz (686.0 kB view details)

Uploaded Source

Built Distribution

suffix_tree-0.1.2-py3-none-any.whl (32.6 kB view details)

Uploaded Python 3

File details

Details for the file suffix_tree-0.1.2.tar.gz.

File metadata

  • Download URL: suffix_tree-0.1.2.tar.gz
  • Upload date:
  • Size: 686.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for suffix_tree-0.1.2.tar.gz
Algorithm Hash digest
SHA256 e5a927d205956c0161ed4fc90959e55326f67bccea1291fa476b408f347d92ef
MD5 c2da967dc4610cf54ad7dcd4fcd6a7b3
BLAKE2b-256 32eb932d92d37e56629d1a9fb5126cd07d3bdc5166a6ea654fb06d4e83961c93

See more details on using hashes here.

File details

Details for the file suffix_tree-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: suffix_tree-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 32.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for suffix_tree-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 c7f01f9bf16f88eed830d6be97b3448648c48ea763cf0092a1f29dccd2b13862
MD5 58ad92487b52e035b6e7b9f41089e940
BLAKE2b-256 a6b08f6adff523678f5cf23da20f6b3a00727c97287dca39746f2ffdb4ef1b06

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page