Skip to main content

Fast mutable UTF-8 trie exposed as a Python extension module

Project description

trie-hard-py

trie-hard-py is a Python package backed by a Rust trie implementation. It is useful when you need ordered key lookup, prefix search, longest-prefix matching, and mutable map-style updates from Python.

The public module is trie_hard_py; the Rust extension is packaged as trie_hard_py._native.

Features

  • Exact lookup: get, in, []
  • Ordered iteration over (key, value) pairs
  • Prefix search with deterministic lexicographic ordering
  • Longest-prefix matching for autocomplete, gazetteers, dictionaries, and normalized entity lookup
  • Bounded fuzzy matching by Unicode-aware Levenshtein distance
  • Mutable operations: insert, update, remove, discard, clear
  • UTF-8 safe keys; traversal is byte-based and keys are accepted as Python str
  • Typed package metadata via py.typed and __init__.pyi
  • Read-only frozen snapshots with bitmask child lookup inspired by trie-hard

Install For Development

The project uses maturin and PyO3.

python3 -m venv venv
./venv/bin/python -m pip install -U pip
./venv/bin/python -m pip install -r python/requirements.txt pytest
VIRTUAL_ENV="$PWD/venv" ./venv/bin/maturin develop
./venv/bin/python -m pytest -q

Build A Wheel

./venv/bin/maturin build --release
./venv/bin/python -m pip install target/wheels/trie_hard_py-*.whl

Test And Benchmark

cargo test
VIRTUAL_ENV="$PWD/venv" ./venv/bin/maturin develop
./venv/bin/python -m pytest -q
./venv/bin/python benchmarks/bench_pytrie.py

The benchmark uses generated city, country, and medicine-style terms so prefix and fuzzy matching resemble autocomplete and entity lookup workloads. It reports rough RSS deltas for trie construction and writes local JSON history to benchmarks/results/ for regression comparisons. RSS is process-level memory, so treat it as a coarse regression signal rather than an exact object-size measurement.

Usage

from trie_hard_py import PyTrie

trie = PyTrie(["san antonio", "san diego", "san jose", "serbia", "sertraline"])

assert trie.get("san diego") == "san diego"
assert "serbia" in trie
assert "ser" not in trie

trie.insert("amoxicillin tablet", "RXNORM:308182")
trie["amoxicillin oral suspension"] = "RXNORM:308189"
frozen = trie.freeze()

assert list(frozen.prefix_search("amoxicillin")) == [
    ("amoxicillin oral suspension", "RXNORM:308189"),
    ("amoxicillin tablet", "RXNORM:308182"),
]
assert frozen.longest_prefix("amoxicillin tablet 500mg") == (
    "amoxicillin tablet",
    "RXNORM:308182",
)
assert frozen.fuzzy_match("sertralina", max_distance=1) == ("sertraline", "sertraline", 1)

removed = trie.remove("amoxicillin tablet")
assert removed == "RXNORM:308182"

API

PyTrie(items: list[str] | None = None)

When constructed from strings, each key is stored with the same string as its value.

Main methods:

  • get(key) -> str | None
  • get_or(key, default=None) -> str | None
  • contains(key) -> bool
  • starts_with(prefix) -> bool
  • prefix_contains(prefix) -> bool
  • insert(key, value) -> str | None
  • add(key) -> str | None
  • update(items: list[tuple[str, str]]) -> None
  • remove(key) -> str | None
  • discard(key) -> bool
  • clear() -> None
  • keys() -> list[str]
  • values() -> list[str]
  • items() -> list[tuple[str, str]]
  • prefix_search(prefix) -> Iterator[tuple[str, str]]
  • longest_prefix(query) -> tuple[str, str] | None
  • fuzzy_search(query, max_distance=2, limit=10) -> list[tuple[str, str, int]]
  • fuzzy_match(query, max_distance=2) -> tuple[str, str, int] | None
  • freeze() -> PyFrozenTrie

Python mapping helpers are also supported: len(trie), bool(trie), key in trie, trie[key], trie[key] = value, del trie[key], and iteration over (key, value) pairs.

PyFrozenTrie is a read-only snapshot returned by PyTrie.freeze(). It supports the read APIs (get, contains, prefix_search, longest_prefix, fuzzy_search, iteration, and mapping-style reads) but does not expose mutation.

Engineering Notes

The core is currently a mutable byte trie implemented in Rust with arena-stored nodes and sorted compact child lists. Values are stored as shared Arc<str> handles, so frozen snapshots can reuse value strings instead of copying them. This keeps iteration deterministic without allocating a map object per node. Deleted branches are detached from their parents; arena slots are retained so existing node indexes stay stable.

Frozen snapshots path-compress unary chains into radix labels before applying adaptive child lookup. Nodes with high fanout get an external 256-bit child mask, so exact lookup checks whether the queried byte is present and computes the child offset with count_ones, following the same rank-by-bitmask idea used by Cloudflare's trie-hard. Low-fanout nodes avoid the fixed mask cost and use binary search over contiguous children instead. Benchmark both mutable and frozen forms for your workload.

The current frozen node layout intentionally keeps per-node metadata together. An experimental structure-of-arrays layout was tested, but on the benchmark dataset it used more RSS than the adaptive node layout, so it is not used.

It intentionally does not depend on Cloudflare's trie-hard crate because that crate is optimized for bulk-loaded read-mostly maps; this package exposes mutable Python operations.

Fuzzy search uses bounded Levenshtein distance and short-circuits candidates that already exceed max_distance. Distances are counted over Unicode scalar values rather than UTF-8 bytes.

CI builds and tests the extension on Linux, macOS, and Windows using GitHub Actions and uploads wheel artifacts for each platform.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

trie_hard_py-0.1.0.tar.gz (19.0 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

trie_hard_py-0.1.0-cp313-cp313-win_amd64.whl (169.4 kB view details)

Uploaded CPython 3.13Windows x86-64

trie_hard_py-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (310.0 kB view details)

Uploaded CPython 3.13manylinux: glibc 2.17+ x86-64

trie_hard_py-0.1.0-cp313-cp313-macosx_11_0_arm64.whl (273.6 kB view details)

Uploaded CPython 3.13macOS 11.0+ ARM64

trie_hard_py-0.1.0-cp312-cp312-win_amd64.whl (169.3 kB view details)

Uploaded CPython 3.12Windows x86-64

trie_hard_py-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (311.3 kB view details)

Uploaded CPython 3.12manylinux: glibc 2.17+ x86-64

trie_hard_py-0.1.0-cp312-cp312-macosx_11_0_arm64.whl (273.7 kB view details)

Uploaded CPython 3.12macOS 11.0+ ARM64

trie_hard_py-0.1.0-cp311-cp311-win_amd64.whl (170.8 kB view details)

Uploaded CPython 3.11Windows x86-64

trie_hard_py-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (312.9 kB view details)

Uploaded CPython 3.11manylinux: glibc 2.17+ x86-64

trie_hard_py-0.1.0-cp311-cp311-macosx_11_0_arm64.whl (275.0 kB view details)

Uploaded CPython 3.11macOS 11.0+ ARM64

trie_hard_py-0.1.0-cp310-cp310-win_amd64.whl (170.8 kB view details)

Uploaded CPython 3.10Windows x86-64

trie_hard_py-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (313.3 kB view details)

Uploaded CPython 3.10manylinux: glibc 2.17+ x86-64

trie_hard_py-0.1.0-cp310-cp310-macosx_11_0_arm64.whl (275.1 kB view details)

Uploaded CPython 3.10macOS 11.0+ ARM64

trie_hard_py-0.1.0-cp39-cp39-win_amd64.whl (173.3 kB view details)

Uploaded CPython 3.9Windows x86-64

trie_hard_py-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (316.0 kB view details)

Uploaded CPython 3.9manylinux: glibc 2.17+ x86-64

trie_hard_py-0.1.0-cp39-cp39-macosx_11_0_arm64.whl (277.4 kB view details)

Uploaded CPython 3.9macOS 11.0+ ARM64

File details

Details for the file trie_hard_py-0.1.0.tar.gz.

File metadata

  • Download URL: trie_hard_py-0.1.0.tar.gz
  • Upload date:
  • Size: 19.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for trie_hard_py-0.1.0.tar.gz
Algorithm Hash digest
SHA256 2688111f58b3d55e111f9fd98f5ec1fa45e79645445d9994e0fe0fdaa60ac711
MD5 814e22340bffd683c1ca3fd18a1db476
BLAKE2b-256 e8c6a28585a4feaf96e75160093a62944a74511251a5156cabb18f53e8fdfe26

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0.tar.gz:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp313-cp313-win_amd64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 4636a950191a1ead876ec8bb26ea1d1e9c1659ba429044610c85266f1a8d7e84
MD5 4cf41a59b1c664fb1d80eb87a611cd4b
BLAKE2b-256 59f2db17d7fd3d4c3cd05eb58226cb66c106180191d83c5c3c2b81fac8e19a34

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp313-cp313-win_amd64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 4813b031aa41b76a14ca0a0fb36d37b4c84f7f4259cf1addab02f53f713a2235
MD5 191bbcd2f3250b9eabf8f1226a7cd64f
BLAKE2b-256 0a84c5c659157641a58438e1c12a9a8e55bbd98f01a5638c8919e0f10310ed6e

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp313-cp313-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 5e18f9cdc4e9d5364574685d15db0a8eaed01538a0420918187cd8726efeb9ce
MD5 e5c942098beb5207dbcc34eb8405d24f
BLAKE2b-256 ecebd024b355df84c6e01a54ba84fdbf4ee9a706a4cb2766c9711801ba60e03b

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp313-cp313-macosx_11_0_arm64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp312-cp312-win_amd64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 88530d5507b4fe9fca5411923c6e3b775dd036ff9abb66e0f8ee9aabf7ffa87d
MD5 5afd5c55f1a0ecca6571767289270b5f
BLAKE2b-256 5c2faa7bb87d7ad58f729c1c034ba86326f49aec657c6b16384cb4c7855c900f

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp312-cp312-win_amd64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 5ec98bf8e0ea0643bb32e4a84be67837eb2775223429d4389eaaaf4639fc9cb3
MD5 0285de431393d947d179f1fb02ac1570
BLAKE2b-256 4f320fa9ed332dfabbe85584c4464b90f458e75bd06faba295110c113db4aa94

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 1de4a9c7362ddccf438f229d334e5e6c6360040124c24fdcaffb65adea53f030
MD5 8e569064459b120aacf0e7d69749e7a8
BLAKE2b-256 528ed0d8832148804439c43ce892e6aab26f8adff7a927f3afaf0acb2911c705

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp311-cp311-win_amd64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 d3fd2309698064658af315934ecb6806bd168e12608897235ff2fd470f4b027a
MD5 a716a4bc1cfbbb56dc11d4e7ac9f64e3
BLAKE2b-256 6237d40bff897b1532ab05e9299a8581a7a227bad011c678132a02639cf2a5c9

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp311-cp311-win_amd64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 9183427abe0a5911698863a109b561351a3c967cbf43cbd7809abb7b8dfe9537
MD5 2c99e7c6bb7bab5c5304683597ba8b9d
BLAKE2b-256 167f42d970bbfc811ce647fbe9a38eb379ceb04029c4b65167dfb041e4c8ff4c

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 79092464ee599099c316384c403bbc4392ad689538796bd729313f014691e786
MD5 822b9c1c7b2250b39323f47d56765903
BLAKE2b-256 dce5cd4df1410973386ccd86970a296cd28affeb13a17551aec2b2810ddb3d34

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp311-cp311-macosx_11_0_arm64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp310-cp310-win_amd64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 4290f5a8e4c85c92a5af682277bc0434dc9157b01a930906bc7b6d707cf2ed51
MD5 91461c209a49b0f4c0c0167470153b91
BLAKE2b-256 4c855e67438a367fc62c3688ad62376876d27a4ca99638a13bc28d507324d3c8

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp310-cp310-win_amd64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 ab98f6aa6caffa9a0750de126bce4c4bebf43607afe950bd539d0e00a7bc7760
MD5 703fff7d192f8758e0c4177d70c45fa1
BLAKE2b-256 e2a17156a7846299ccd942a24f7fda0d6ad6a04fee30d9b574b96232c97e3408

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 99c9aa5fadd71231c8f153dfe2d532457a5fca9c80e406e13aa9bcf6ed78aa2e
MD5 fd5dfa6413df487cb555cdf289f239a9
BLAKE2b-256 429cec7e61abdb32759cee9a2e55f279a8ea5effdb5c547edbfea12317282fba

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp310-cp310-macosx_11_0_arm64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: trie_hard_py-0.1.0-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 173.3 kB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for trie_hard_py-0.1.0-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 b196a8aa5c35a1efe611d2ad6bbce78ca39da69e74e11e6666ad8807062d83bb
MD5 beff481d62d8dca80da88fdef6922f91
BLAKE2b-256 8cb82b6bf38b87c8ad91339821a990d8efceddb7adae5d479c10485bfab53ff5

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp39-cp39-win_amd64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e294a712b1ce0c815c5e838ab34895637f0c6737d4bcb49b1a1d0a9d006a8722
MD5 fd4787ea0d9f86350fd6bbe501e847f2
BLAKE2b-256 ffe2937bbfe089acaf9916f0c6814dd965b1bbdfc25657c8e3863b8c06d1c49d

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file trie_hard_py-0.1.0-cp39-cp39-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for trie_hard_py-0.1.0-cp39-cp39-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 8f751c2ba5ae235122f56659c2053607520ccdc08f6a9e5acd5a57d9c8aa64f5
MD5 336d7c0fadf7f40a92d93a5c3f275679
BLAKE2b-256 41534cb2ec64ef76e479cfc1ae919ebe6484b04c78bade3774bb61591d543b07

See more details on using hashes here.

Provenance

The following attestation bundles were made for trie_hard_py-0.1.0-cp39-cp39-macosx_11_0_arm64.whl:

Publisher: release.yml on imvladikon/trie_hard_py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page