Build, validate, and query .tactdict language packs

Tact Dictionary

Tact Dictionary builds, validates, and queries .tactdict language packs for text-entry software.

The tact-dictionary tool and related Python package can prepare source data, build binary packs, validate pack files, and inspect completions and variants. The project was built with the Tact Android keyboard in mind, but the keyboard is an adopter of the format rather than the only intended consumer. Current packs contain a normalized lexicon graph, word records, prefix suggestion caches, and variant groups for long-press word alternatives.

The tool and Python package are licensed under the MIT License. Generated language packs are data artifacts with source-derived licensing and attribution recorded in the sidecar manifest. The code license does not relicense frequency lists, morphology data, blocklists, or other input datasets.

Status

This project is pre-alpha. The tool can produce and validate an English pack, and the Python API can inspect completions and variants.

Implemented pack sections:

  • STRINGS: shared UTF-8 string pool
  • SYMBOLS: normalized input symbols
  • WORDS: word records and source masks
  • VARIANTS: variant groups such as run, runs, running, runner
  • SUGGESTS: prefix suggestion lists attached to graph states
  • GRAPH: lexicon word graph

Not implemented yet: context models, full typo-correction graph data, blocklist sections, user overlays, and runtime memory-map readers.
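
To make the SUGGESTS idea concrete, here is a minimal sketch of prefix suggestion lists attached to graph states. It is a toy in-memory trie, not the .tactdict binary layout: the `State` class, `build_graph`, `suggest`, and the frequency numbers are all hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """One state in a toy prefix graph (hypothetical; not the .tactdict layout)."""

    edges: dict[str, "State"] = field(default_factory=dict)
    suggestions: list[str] = field(default_factory=list)  # cached, best first


def build_graph(frequencies: dict[str, int], limit: int = 3) -> State:
    """Build a trie and cache up to `limit` of the most frequent words per state."""
    root = State()
    # Insert from most to least frequent so each cache fills in rank order.
    ranked = sorted(frequencies, key=lambda word: (-frequencies[word], word))
    for word in ranked:
        state = root
        for symbol in word:
            state = state.edges.setdefault(symbol, State())
            if len(state.suggestions) < limit:
                state.suggestions.append(word)
    return root


def suggest(root: State, prefix: str) -> list[str]:
    """Walk the prefix and return the list cached at the state it reaches."""
    state = root
    for symbol in prefix:
        next_state = state.edges.get(symbol)
        if next_state is None:
            return []  # no word starts with this prefix
        state = next_state
    return list(state.suggestions)


# Invented frequencies, for illustration only.
graph = build_graph({"run": 50, "runs": 30, "running": 20, "runner": 10, "rust": 5})
print(suggest(graph, "ru"))
print(suggest(graph, "runn"))
```

Caching the ranked list at each state trades pack size for lookup speed: answering a prefix query becomes a walk plus a list read, with no subtree search at query time.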

Development

uv sync
uv run tact-dictionary --version
uv run ruff format --check src tests
uv run ruff check
uv run pytest

Build Python distribution artifacts with:

uv build

Build A Pack

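Download the current English source files into sources/: the Leipzig corpus archive and the MorphyNet inflection and derivation tables named in the import command below.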

Convert external source files into tact-dictionary build inputs:

uv run tact-dictionary import-language-data \
  --output-dir build/en_US \
  --leipzig-corpus sources/eng_news_2025_100K.tar.gz \
  --morphynet-inflection sources/eng.inflectional.v1.tsv \
  --morphynet-derivation sources/eng.derivational.v1.tsv

Build and validate the pack:

uv run tact-dictionary build --config build/en_US/en_US.tactbuild
uv run tact-dictionary validate build/en_US/generated/en_US.tactdict

The build writes:

  • build/en_US/generated/en_US.tactdict
  • build/en_US/generated/en_US.manifest.json
  • build/en_US/generated/en_US.symbols.json
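
For scripted builds, the two commands and the output check can be wrapped in a few lines of Python. This is a sketch under assumptions: the helper names are hypothetical, and the path pattern is generalized from the en_US example above, so other locales may not follow it.

```python
import subprocess
from pathlib import Path


def build_and_validate_commands(build_dir: str, locale: str) -> list[list[str]]:
    """Return the build and validate commands from the README as argv lists."""
    root = Path(build_dir) / locale
    config = root / f"{locale}.tactbuild"
    pack = root / "generated" / f"{locale}.tactdict"
    return [
        ["uv", "run", "tact-dictionary", "build", "--config", config.as_posix()],
        ["uv", "run", "tact-dictionary", "validate", pack.as_posix()],
    ]


def expected_outputs(build_dir: str, locale: str) -> list[Path]:
    """List the three files the build is documented to write."""
    generated = Path(build_dir) / locale / "generated"
    suffixes = (".tactdict", ".manifest.json", ".symbols.json")
    return [generated / f"{locale}{suffix}" for suffix in suffixes]


def run_build(build_dir: str = "build", locale: str = "en_US") -> list[Path]:
    """Run both commands, then return any expected output that is missing."""
    for command in build_and_validate_commands(build_dir, locale):
        subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return [path for path in expected_outputs(build_dir, locale) if not path.exists()]
```

Calling `run_build()` from the project root should return an empty list when the build succeeded and all three files are present.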

See Building Language Packs for input file formats and build configuration details.

Query A Pack

Python consumers can inspect a built pack without shelling out:

from tact_dictionary.query import DictionaryLookup

lookup = DictionaryLookup.from_file("en_US.tactdict")
for completion in lookup.find_completions("ru"):
    print(completion.surface, completion.kind, completion.completion_cost)

for variant in lookup.find_variants("run"):
    print(variant.surface, variant.relation_to_primary)
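
A long-press menu usually wants variants grouped rather than printed one by one. The sketch below groups results by relation, using only the `find_variants`, `surface`, and `relation_to_primary` names shown above. The `group_variants` helper is hypothetical, and `FakeLookup` with its relation labels is an invented stand-in so the example runs without a built pack; the real type and values of `relation_to_primary` may differ.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_variants(lookup, word: str) -> dict[str, list[str]]:
    """Group variant surfaces by their relation to the primary word."""
    groups: dict[str, list[str]] = defaultdict(list)
    for variant in lookup.find_variants(word):
        # str() keeps this working whether the relation is a string or an enum.
        groups[str(variant.relation_to_primary)].append(variant.surface)
    return dict(groups)


class FakeLookup:
    """Stand-in for DictionaryLookup; the relation labels are invented."""

    def find_variants(self, word: str):
        return [
            SimpleNamespace(surface="runs", relation_to_primary="inflection"),
            SimpleNamespace(surface="running", relation_to_primary="inflection"),
            SimpleNamespace(surface="runner", relation_to_primary="derivation"),
        ]


print(group_variants(FakeLookup(), "run"))
```

With a real pack, pass the object returned by `DictionaryLookup.from_file(...)` in place of `FakeLookup()`.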

The same lookup is available from the CLI:

tact-dictionary --file en_US.tactdict find-completions --limit 12 ru
tact-dictionary --file en_US.tactdict find-variants run
