Python library for Nufi (Fe'éfě'e) text: Clafrica keyboard mapping, Bana→Komako normalisation, low-tone stripping, and encoding repair

nuficlean

A Python library of text utilities for Nufi (Fe'éfě'e / Babanki-Tungo):

  • Bana → Komako normalisation — converts Bana orthography to the standard Komako form, strips low-tone diacritics, and repairs encoding issues
  • Clafrica keyboard mapping — converts ASCII shortcut sequences (Clafrica input method) into the corresponding Nufi Unicode characters

Install

pip install nuficlean

Bana normalisation

clean(text)

Applies the full normalisation pipeline to a string.

from nuficlean import clean

clean("kòlə̀'")        # → "kwele'"
clean("nàh")           # → "lah"
clean("mɛ̀ɛ̀")         # → "maa"
clean("tōh mēndɑ̀'")  # → "tōh mēndɑ'"

clean_lines(lines)

Cleans a list of strings.

from nuficlean import clean_lines

clean_lines(["kòlə̀'", "nàh", "mɛ̀ɛ̀"])
# → ["kwele'", "lah", "maa"]

Pipeline

  1. Mojibake repair: recovers UTF-8 text that was mis-decoded as Latin-1
  2. Apostrophe / quote unification: maps ', `, ʼ, ", «, » to their ASCII equivalents
  3. Bana → Komako rewrite: longest-match substitution (kòlə̀' → kwele', ɛ̀ → a, …)
  4. Low-tone stripping: removes grave-accent tone marks (à → a, ɑ̀ → ɑ, …)
  5. NFC recomposition
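
The five steps above can be sketched with the standard library alone. This is an illustration of the technique, not nuficlean's actual implementation: the names clean_sketch, repair_mojibake, rewrite, REWRITES and QUOTES are hypothetical, the rule table holds only the two substitutions quoted in the list, and the NFD decomposition before matching is an assumption made so that composed and decomposed input match the same rules.

```python
import unicodedata

# Hypothetical, deliberately tiny rule table (the real Bana → Komako table is larger).
REWRITES = {"kòlə̀'": "kwele'", "ɛ̀": "a"}
# Assumed subset of the quote characters that get unified to ASCII.
QUOTES = {"\u2018": "'", "\u2019": "'", "\u02bc": "'", "`": "'"}
GRAVE = "\u0300"  # combining grave accent (low tone)


def repair_mojibake(text):
    """Undo UTF-8 bytes that were decoded as Latin-1, when that is possible."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text  # already clean, or not repairable this way


def rewrite(text, rules):
    """Greedy longest-match substitution, scanning left to right."""
    rules = {unicodedata.normalize("NFD", k): v for k, v in rules.items()}
    keys = sorted(rules, key=len, reverse=True)  # longest keys are tried first
    out, i = [], 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(rules[key])
                i += len(key)
                break
        else:  # no rule matched at this position: copy one character
            out.append(text[i])
            i += 1
    return "".join(out)


def clean_sketch(text):
    text = repair_mojibake(text)                       # step 1
    text = "".join(QUOTES.get(ch, ch) for ch in text)  # step 2
    text = unicodedata.normalize("NFD", text)          # assumption: decompose before matching
    text = rewrite(text, REWRITES)                     # step 3
    text = text.replace(GRAVE, "")                     # step 4
    return unicodedata.normalize("NFC", text)          # step 5


print(clean_sketch("mɛ̀ɛ̀"))  # → maa
```

Sorting the keys by length before matching is what makes the substitution "longest-match": a whole-word rule such as kòlə̀' wins over any shorter rule that would match inside it.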

CLI

nuficlean "kòlə̀'"
echo "mɛ̀ɛ̀" | nuficlean

Clafrica keyboard mapping

The Clafrica input method uses ASCII shortcut sequences to type Nufi characters. nuficlean ships the canonical mapping table and exposes it through two functions and a class.

apply_clafrica(text)

Converts all Clafrica shortcuts in text to Unicode, preserving whitespace.

from nuficlean import apply_clafrica

apply_clafrica("af1 e2 n*")   # → "ɑ̀ é ŋ"
apply_clafrica("eu3 af5")     # → "ə̄ ɑ̂"
apply_clafrica("uu1 o*2")     # → "ʉ̀ ɔ́"
apply_clafrica("N* O*")       # → "Ŋ Ɔ"

Live-typing mode — pass preserve_ambiguous_trailing=True to leave the last token untouched while the user may still extend it:

apply_clafrica("af", preserve_ambiguous_trailing=True)  # → "af"  (could become af1, af2…)
apply_clafrica("af1")                                   # → "ɑ̀"
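
To make the idea of an "ambiguous trailing token" concrete, here is a minimal sketch that does not use nuficlean. It assumes a token is ambiguous when some longer shortcut starts with it; whether the library uses exactly this rule is not confirmed by this page. TABLE is a small stand-in built from shortcuts shown in this README, and apply_sketch, convert_token and is_ambiguous are hypothetical names.

```python
import re

# Stand-in table; values use escapes so the code points are unambiguous.
TABLE = {
    "af": "\u0251",         # ɑ
    "af1": "\u0251\u0300",  # ɑ + combining grave
    "af2": "\u0251\u0301",  # ɑ + combining acute
    "e2": "\u00e9",         # é
    "n*": "\u014b",         # ŋ
}


def convert_token(token, table):
    """Replace shortcuts inside one token, longest match first."""
    keys = sorted(table, key=len, reverse=True)
    out, i = [], 0
    while i < len(token):
        for key in keys:
            if token.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:  # no shortcut starts here: keep the character
            out.append(token[i])
            i += 1
    return "".join(out)


def is_ambiguous(token, table):
    """True if a longer shortcut begins with this token."""
    return any(k != token and k.startswith(token) for k in table)


def apply_sketch(text, table=TABLE, preserve_ambiguous_trailing=False):
    parts = re.split(r"(\s+)", text)  # the capture group keeps whitespace runs
    last = len(parts) - 1
    out = []
    for idx, part in enumerate(parts):
        if not part or part.isspace():
            out.append(part)
        elif preserve_ambiguous_trailing and idx == last and is_ambiguous(part, table):
            out.append(part)  # the user may still type a tone digit
        else:
            out.append(convert_token(part, table))
    return "".join(out)


print(apply_sketch("n* af", preserve_ambiguous_trailing=True))  # → ŋ af
```

In this sketch a trailing space ends the last token, so "af " is converted even in live-typing mode. That matches the intuition that Space confirms the input.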

finalize_clafrica(text)

Like apply_clafrica but also resolves any trailing ambiguous shortcut — use this when the user confirms input (e.g. presses Enter or Space).

from nuficlean import finalize_clafrica

finalize_clafrica("eu3")   # → "ə̄"
finalize_clafrica("af1")   # → "ɑ̀"
finalize_clafrica("n*")    # → "ŋ"

ClafricaEngine — advanced use

Instantiate the engine directly when you need a custom mapping or extra entries.

from nuficlean import ClafricaEngine

# Add project-specific shortcuts on top of the default table
engine = ClafricaEngine(extra={"nkap": "ŋkɑ̄p"})
engine.apply_mapping("nkap e2")   # → "ŋkɑ̄p é"
engine.finalize_input("eu3")      # → "ə̄"
engine.lookup("af1")              # → "ɑ̀"
engine.lookup("xyz")              # → None

# Fully custom table (replaces the default)
engine = ClafricaEngine(mapping={"a1": "à", "e1": "è"})
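
The difference between extra (layered on top of the defaults) and mapping (replacing them) can be pictured with a plain dictionary merge. The class below is a hypothetical stand-in written for illustration only; EngineSketch and DEFAULT_TABLE are not part of nuficlean, and the real engine may resolve conflicts differently.

```python
# Tiny stand-in for the default table shipped with the library.
DEFAULT_TABLE = {"af1": "\u0251\u0300", "e2": "\u00e9"}


class EngineSketch:
    def __init__(self, mapping=None, extra=None):
        # `mapping` replaces the defaults; `extra` is merged on top of the base.
        base = DEFAULT_TABLE if mapping is None else mapping
        self.table = {**base, **(extra or {})}

    def lookup(self, shortcut):
        """Return the output for an exact shortcut, or None if it is unknown."""
        return self.table.get(shortcut)


layered = EngineSketch(extra={"nkap": "\u014bk\u0251\u0304p"})
custom = EngineSketch(mapping={"a1": "\u00e0"})

print(layered.lookup("af1") is not None)  # → True  (defaults still present)
print(custom.lookup("af1"))               # → None  (defaults replaced)
```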

Shortcut reference

Shortcut       Output     Notes
af             ɑ          open-a
eu             ə          schwa
ai             ε          epsilon
o*             ɔ          open-o
uu             ʉ          u-bar
n*             ŋ          eng
N*             Ŋ          Eng (uppercase)
a1 a2 a3       à á ā      low / high / mid tone
af1 af2 af3    ɑ̀ ɑ́ ɑ̄      open-a tones
eu1 eu2 eu3    ə̀ ə́ ə̄      schwa tones
o*1 o*2 o*3    ɔ̀ ɔ́ ɔ̄      open-o tones

Tone digits: 1 = low (grave, à), 2 = high (acute, á), 3 = mid (macron, ā), 5 = falling (circumflex, â), 7 = rising.
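
The toned shortcuts follow a regular pattern: a base shortcut plus a digit yields the base character plus a combining diacritic. The sketch below generates such entries with unicodedata. It is an illustration under assumptions, not the library's table builder: tone_shortcuts, BASES and TONE_MARKS are hypothetical names, and the digit-to-diacritic pairs are the ones visible in the examples on this page (1, 2, 3 and 5). Digit 7 is left out because its output is not shown here.

```python
import unicodedata

# Digit → combining mark, as observed in the examples in this README.
TONE_MARKS = {
    "1": "\u0300",  # combining grave      (a1  → à)
    "2": "\u0301",  # combining acute      (e2  → é)
    "3": "\u0304",  # combining macron     (eu3 → ə̄)
    "5": "\u0302",  # combining circumflex (af5 → ɑ̂)
}
BASES = {"a": "a", "af": "\u0251", "eu": "\u0259", "o*": "\u0254"}


def tone_shortcuts(bases=BASES, marks=TONE_MARKS):
    """Build {shortcut + digit: toned character} for every base and digit."""
    table = {}
    for key, char in bases.items():
        for digit, mark in marks.items():
            # NFC composes the pair where a precomposed form exists (a + grave → à).
            table[key + digit] = unicodedata.normalize("NFC", char + mark)
    return table


table = tone_shortcuts()
print(table["a1"], table["o*2"], table["eu3"])  # → à ɔ́ ə̄
```

Characters such as ɑ and ə have no precomposed toned forms in Unicode, so their entries remain a base letter followed by a combining mark even after NFC.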

Tip: If you don't need the Bana normalisation, the clafrica package on PyPI provides the same keyboard mapping as a standalone library:

pip install clafrica


License

MIT
