Skip to main content

A pure Python trie data structure implementation.

Project description

pygtrie is a pure Python implementation of a trie data structure compatible with Python 2.x and Python 3.x.

Trie data structure, also known as radix or prefix tree, is a tree associating keys to values where all the descendants of a node have a common prefix (associated with that node).

The trie module contains Trie, CharTrie and StringTrie classes each implementing a mutable mapping interface, i.e. dict interface. As such, in most circumstances, Trie could be used as a drop-in replacement for a dict, but the prefix nature of the data structure is trie’s real strength.

The module also contains PrefixSet class which uses a trie to store a set of prefixes such that a key is contained in the set if it or its prefix is stored in the set.

Features

  • A full mutable mapping implementation.

  • Supports iterating over as well as deleting a subtrie.

  • Supports prefix checking as well as shortest and longest prefix look-up.

  • Extensible for any kind of user-defined keys.

  • A PrefixSet supports “all keys starting with given prefix” logic.

  • Can store any value including None.

Installation

To install pygtrie, simply run:

pip install pygtrie

or by adding line such as:

pygtrie == 2.*

to project’s requirements file. Alternatively, if installation from source is desired, it can be achieved by executing:

python setup.py install

Version History

2.5: TBD

  • Add pygtrie.Trie.merge method which merges structures of two tries.

  • Add pygtrie.Trie.strictly_equals method which compares two tries with stricter rules than regular equality operator. It’s not sufficient that keys and values are the same but the structure of the tries must be the same as well. For example:

    >>> t0 = StringTrie({'foo/bar.baz': 42}, separator='/')
    >>> t1 = StringTrie({'foo/bar.baz': 42}, separator='.')
    >>> t0 == t1
    True
    >>> t0.strictly_equals(t1)
    False
    
  • Fix pygtrie.Trie.__eq__ implementation such that key values are taken into consideration rather than just looking at trie structure. To see what this means it’s best to look at a few examples. Firstly:

    >>> t0 = StringTrie({'foo/bar': 42}, separator='/')
    >>> t1 = StringTrie({'foo.bar': 42}, separator='.')
    >>> t0 == t1
    False
    

    This used to be true since the two tries have the same node structure. However, as far as Mapping interface is concerned, they use different keys, i.e. `set(t0) != set(t1). Secondly:

    >>> t0 = StringTrie({'foo/bar.baz': 42}, separator='/')
    >>> t1 = StringTrie({'foo/bar.baz': 42}, separator='.')
    >>> t0 == t1
    True
    

    This used to be false since the two tries have different node structures (the first one splits key into ('foo', 'bar.baz') while the second into ('foo/bar', 'baz')). However, their keys are the same, i.e. `set(t0) == set(t1). And lastly:

    >>> t0 = Trie({'foo': 42})
    >>> t1 = CharTrie({'foo': 42})
    >>> t0 == t1
    False
    

    This used to be true since the two tries have the same node structure. However, the two classes return key as different values. pygtrie.Trie returns keys as tuples while pygtrie.CharTrie returns them as strings.

2.4.2: 2021/01/03

  • Remove use of ‘super’ in setup.py to fix compatibility with Python 2.7. This changes build code only; no changes to the library itself.

2.4.1: 2020/11/20

  • Remove dependency on packaging module from setup.py to fix installation on systems without that package. This changes build code only; no changes to the library itself. [Thanks to Eric McLachlan for reporting]

2.4.0: 2020/11/19 [pulled back from PyPi]

  • Change children argument of the node_factory passed to pygtrie.Trie.traverse from a generator to an iterator with a custom bool conversion. This allows checking whether node has children without having to iterate over them (bool(children))

    To test whether this feature is available, one can check whether Trie.traverse.uses_bool_convertible_children property is true, e.g.: getattr(pygtrie.Trie.traverse, 'uses_bool_convertible_children', False).

    [Thanks to Pallab Pain for suggesting the feature]

2.3.3: 2020/04/04

  • Fix to ‘AttributeError: _NoChildren object has no attribute sorted_items’ failure when iterating over a trie with sorting enabled. [Thanks to Pallab Pain for reporting]

  • Add value property setter to step objects returned by pygtrie.Trie.walk_towards et al. This deprecates the set method.

  • The module now exports pygtrie.__version__ making it possible to determine version of the library at run-time.

2.3.2: 2019/07/18

  • Trivial metadata fix

2.3.1: 2019/07/18 [pulled back from PyPi]

  • Fix to pygtrie.PrefixSet initialisation incorrectly storing elements even if their prefixes are also added to the set.

    For example, PrefixSet(('foo', 'foobar')) incorrectly resulted in a two-element set even though the interface dictates that only foo is kept (recall that if foo is member of the set, foobar is as well). [Thanks to Tal Maimon for reporting]

  • Fix to pygtrie.Trie.copy method not preserving enable-sorting flag and, in case of pygtrie.StringTrie, separator property.

  • Add support for the copy module so copy.copy can now be used with trie objects.

  • Leafs and nodes with just one child use more memory-optimised representation which reduces overall memory usage of a trie structure.

  • Minor performance improvement for adding new elements to a pygtrie.PrefixSet.

  • Improvements to string representation of objects which now includes type and, for pygtrie.StringTrie object, value of separator property.

2.3: 2018/08/10

  • New pygtrie.Trie.walk_towards method allows walking a path towards a node with given key accessing each step of the path. Compared to pygtrie.Trie.walk_prefixes method, steps for nodes without assigned values are returned.

  • Fix to pygtrie.PrefixSet.copy not preserving type of backing trie.

  • pygtrie.StringTrie now checks and explicitly rejects empty separators. Previously empty separator would be accepted but lead to confusing errors later on. [Thanks to Waren Long]

  • Various documentation improvements, Python 2/3 compatibility and test coverage (python-coverage reports 100%).

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pygtrie-2.5.0.tar.gz (39.3 kB view details)

Uploaded Source

Built Distribution

pygtrie-2.5.0-py3-none-any.whl (25.1 kB view details)

Uploaded Python 3

File details

Details for the file pygtrie-2.5.0.tar.gz.

File metadata

  • Download URL: pygtrie-2.5.0.tar.gz
  • Upload date:
  • Size: 39.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.5

File hashes

Hashes for pygtrie-2.5.0.tar.gz
Algorithm Hash digest
SHA256 203514ad826eb403dab1d2e2ddd034e0d1534bbe4dbe0213bb0593f66beba4e2
MD5 4cd66ab6cf0f625cd0fb367712a0cccf
BLAKE2b-256 b91355deec25bf09383216fa7f1dfcdbfca40a04aa00b6d15a5cbf25af8fce5f

See more details on using hashes here.

File details

Details for the file pygtrie-2.5.0-py3-none-any.whl.

File metadata

  • Download URL: pygtrie-2.5.0-py3-none-any.whl
  • Upload date:
  • Size: 25.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.6.1 requests/2.26.0 setuptools/58.5.3 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9

File hashes

Hashes for pygtrie-2.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 8795cda8105493d5ae159a5bef313ff13156c5d4d72feddefacaad59f8c8ce16
MD5 603dea6eaf252d45b4b9fbe7fd2a15f7
BLAKE2b-256 eccdbd196b2cf014afb1009de8b0f05ecd54011d881944e62763f3c1b1e8ef37

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page