Generalized trie implementation
Project description
gen-tries
Name
gen-tries
Description
A generalized trie implementation for python 3.10 or later that provides classes and functions to create and manipulate a generalized trie data structure.
Unlike many Trie implementations which only support strings as keys and token match only at the character level, it is agnostic as to the types of tokens used to key it and thus far more general purpose.
It requires only that the indexed tokens be hashable (this means that the have
eq and hash methods). This is verified at runtime using the gentrie.Hashable protocol.
Tokens in a key do NOT have to all be the same type as long as they can be compared for equality.
Note that objects of user-defined classes are Hashable by default, but this may not work as naively expected unless they are immutable.
It can handle Sequences of Hashable conforming objects as keys
for the trie out of the box.
As long as the tokens returned by a sequence are hashable, it largely 'just works'.
You can 'mix and match' types of objects used as token in a key as
long as they all conform to the Hashable protocol.
Usage
Example 1 - trie of words:
from gentrie import GeneralizedTrie, TrieEntry
trie = GeneralizedTrie()
entries: list[list[str]] = [
['ape', 'green', 'apple'],
['ape', 'green'],
['ape', 'green', 'pineapple'],
]
for item in entries:
trie.add(item)
prefixes: set[TrieEntry] = trie.prefixes(['ape', 'green', 'apple'])
print(f'prefixes = {prefixes}')
suffixes: set[TrieEntry] = trie.suffixes(['ape', 'green'])
print(f'suffixes = {suffixes}')
# prefixes = {TrieEntry(ident=1, key=['ape', 'green', 'apple']),
# TrieEntry(ident=2, key=['ape', 'green'])}
# suffixes = {TrieEntry(ident=1, key=['ape', 'green', 'apple']),
# TrieEntry(ident=3, key=['ape', 'green', 'pineapple']),
# TrieEntry(ident=2, key=['ape', 'green'])}
Example 2 - trie of tokens from URLs:
from gentrie import GeneralizedTrie, TrieEntry
# Create a trie to store website URLs
url_trie = GeneralizedTrie()
# Add some URLs with different components (protocol, domain, path)
url_trie.add(["https", "com", "example", "www", "/", "products", "clothing"])
url_trie.add(["http", "org", "example", "blog", "/", "2023", "10", "best-laptops"])
url_trie.add(["ftp", "net", "example", "ftp", "/", "data", "images"])
# Find all https URLs with "example.com" domain
suffixes: set[TrieEntry] = url_trie.suffixes(["https", "com", "example"])
print(suffixes)
# suffixes = {TrieEntry(ident=1, key=['https', 'com', 'example', 'www', '/',
# 'products', 'clothing'])}
Example 3 - trie of characters from strings:
from gentrie import GeneralizedTrie, TrieEntry
trie = GeneralizedTrie()
entries: list[str] = [
'abcdef',
'abc',
'abcd',
'qrf',
]
for item in entries:
trie.add(item)
suffixes: set[TrieEntry] = trie.suffixes('abcd')
print(f'suffixes = {suffixes}')
prefixes: set[TrieEntry] = trie.prefixes('abcdefg')
print(f'prefixes = {prefixes}')
# suffixes = {TrieEntry(ident=1, key='abcdef'),
# TrieEntry(ident=3, key='abcd')}
# prefixes = {TrieEntry(ident=1, key='abcdef'),
# TrieEntry(ident=3, key='abcd'),
# TrieEntry(ident=2, key='abc')}
Example 4 - trie of numeric vectors:
from gentrie import GeneralizedTrie, TrieEntry
trie = GeneralizedTrie()
entries = [
[128, 256, 512],
[128, 256],
[512, 1024],
]
for item in entries:
trie.add(item)
suffixes: set[TrieEntry] = trie.suffixes([128])
print(f'suffixes = {suffixes}')
prefixes: set[TrieEntry] = trie.prefixes([128, 256, 512, 1024])
print(f'prefixes = {prefixes}')
# suffixes = {TrieEntry(ident=1, key=[128, 256, 512]),
# TrieEntry(ident=2, key=[128, 256])}
# prefixes = {TrieEntry(ident=1, key=[128, 256, 512]),
# TrieEntry(ident=2, key=[128, 256])}
Example 5 - trie of tuples:
from gentrie import GeneralizedTrie, TrieEntry
trie = GeneralizedTrie()
entries = [
[(1, 2), (3, 4), (5, 6)],
[(1, 2), (3, 4)],
[(5, 6), (7, 8)],
]
for item in entries:
trie.add(item)
suffixes: set[TrieEntry] = trie.suffixes([(1, 2)])
print(f'suffixes = {suffixes}')
prefixes: set[TrieEntry] = trie.prefixes([(1, 2), (3, 4), (5, 6), (7, 8)])
print(f'prefixes = {prefixes}')
# suffixes = {TrieEntry(ident=1, key=[(1, 2), (3, 4), (5, 6)]),
# TrieEntry(ident=2, key=[(1, 2), (3, 4)])}
# prefixes = {TrieEntry(ident=1, key=[(1, 2), (3, 4), (5, 6)]),
# TrieEntry(ident=2, key=[(1, 2), (3, 4)])}
Authors and acknowledgment
- Jerilyn Franz
Copyright
Copyright 2025 by Jerilyn Franz
License
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file gen_tries-0.2.1.tar.gz.
File metadata
- Download URL: gen_tries-0.2.1.tar.gz
- Upload date:
- Size: 2.8 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a6265e1b62858e6c4bbc76584a57550da3fcc12913c7c1374c296770988ddf89
|
|
| MD5 |
3304c5e132484966e468730ac6d516f8
|
|
| BLAKE2b-256 |
efc97d6017e2a85a08cf411e67f3fe803035b8b87277aeb0c002e7656075c98a
|
File details
Details for the file gen_tries-0.2.1-py3-none-any.whl.
File metadata
- Download URL: gen_tries-0.2.1-py3-none-any.whl
- Upload date:
- Size: 14.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
664b086bd9d710a734b9665ce9b2487d042514913a93b0391732c47604bf903f
|
|
| MD5 |
a769d3c5100a355c8410713482984aca
|
|
| BLAKE2b-256 |
ab5ef867bd27bbd094870c48dbb1992d2bdffb621ad438a1411d9899aa9751f1
|