Python library for Nufi (Fe'éfě'e) text: Clafrica keyboard mapping, Bana→Komako normalisation, low-tone stripping, and encoding repair

nuficlean

Python library for Nufi (Fe'éfě'e / Babanki-Tungo) text utilities:

  • Bana → Komako normalisation — converts Bana orthography to the standard Komako form, strips low-tone diacritics, and repairs encoding issues
  • Clafrica keyboard mapping — converts ASCII shortcut sequences (Clafrica input method) into the corresponding Nufi Unicode characters

Install

pip install nuficlean

Bana normalisation

clean(text)

Applies the full normalisation pipeline to a string.

from nuficlean import clean

clean("kòlə̀'")        # → "kwele'"
clean("nàh")           # → "lah"
clean("mɛ̀ɛ̀")         # → "maa"
clean("tōh mēndɑ̀'")  # → "tōh mēndɑ'"

clean_lines(lines)

Cleans a list of strings.

from nuficlean import clean_lines

clean_lines(["kòlə̀'", "nàh", "mɛ̀ɛ̀"])
# → ["kwele'", "lah", "maa"]

Pipeline

  1. Mojibake repair: fixes UTF-8 text that was mis-decoded as Latin-1
  2. Apostrophe / quote unification: maps ', `, ʼ, ", «, » → ASCII equivalents
  3. Bana → Komako rewrite: longest-match substitution (kòlə̀' → kwele', ɛ̀ → a, …)
  4. Low-tone stripping: removes grave-accent tone marks (à → a, ɑ̀ → ɑ, …)
  5. NFC recomposition
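
The five steps above can be sketched with the standard library alone. This is a minimal illustration, not nuficlean's actual implementation: the names `clean_sketch`, `REWRITES`, and `QUOTES` are hypothetical, the rewrite table holds only the two rules quoted above, and the intermediate NFD step is an assumption made so that tone marks can be matched as separate code points.

```python
import unicodedata

# Toy rewrite table (assumption): only the two rules quoted in the list above.
# Keys are written in decomposed (NFD) form.
REWRITES = {
    "ko\u0300l\u0259\u0300'": "kwele'",  # kòlə̀' → kwele'
    "\u025b\u0300": "a",                 # ɛ̀ → a
}
# A few common quote variants (assumption); the real table may differ.
QUOTES = {"\u2019": "'", "`": "'", "\u02bc": "'", "\u00ab": '"', "\u00bb": '"'}
GRAVE = "\u0300"  # combining grave accent, used for low tone


def repair_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were decoded as Latin-1; leave other text alone."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


def rewrite_longest_match(text: str, table: dict) -> str:
    """Scan left to right, always trying the longest key first."""
    keys = sorted(table, key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:  # no key matched at position i: copy one character
            out.append(text[i])
            i += 1
    return "".join(out)


def clean_sketch(text: str) -> str:
    text = repair_mojibake(text)                          # step 1
    text = "".join(QUOTES.get(ch, ch) for ch in text)     # step 2
    text = unicodedata.normalize("NFD", text)             # expose tone marks
    text = rewrite_longest_match(text, REWRITES)          # step 3
    text = text.replace(GRAVE, "")                        # step 4
    return unicodedata.normalize("NFC", text)             # step 5
```

Macron and acute tone marks survive because only the combining grave accent is removed before recomposition.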

CLI

nuficlean "kòlə̀'"
echo "mɛ̀ɛ̀" | nuficlean

Clafrica keyboard mapping

The Clafrica input method uses ASCII shortcut sequences to type Nufi characters. nuficlean ships the canonical mapping table and exposes it through two functions and a class.

apply_clafrica(text)

Converts all Clafrica shortcuts in text to Unicode, preserving whitespace.

from nuficlean import apply_clafrica

apply_clafrica("af1 e2 n*")   # → "ɑ̀ é ŋ"
apply_clafrica("eu3 af5")     # → "ə̄ ɑ̂"
apply_clafrica("uu1 o*2")     # → "ʉ̀ ɔ́"
apply_clafrica("N* O*")       # → "Ŋ Ɔ"

Live-typing mode — pass preserve_ambiguous_trailing=True to leave the last token untouched while the user may still extend it:

apply_clafrica("af", preserve_ambiguous_trailing=True)  # → "af"  (could become af1, af2…)
apply_clafrica("af1")                                   # → "ɑ̀"

finalize_clafrica(text)

Like apply_clafrica but also resolves any trailing ambiguous shortcut — use this when the user confirms input (e.g. presses Enter or Space).

from nuficlean import finalize_clafrica

finalize_clafrica("eu3")   # → "ə̄"
finalize_clafrica("af1")   # → "ɑ̀"
finalize_clafrica("n*")    # → "ŋ"
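
To make the difference between live typing and finalising concrete, here is a small self-contained sketch of one way such a converter could work. It is not the library's code: `TABLE`, `apply_sketch`, and `finalize_sketch` are hypothetical names, the table contains only a handful of the shortcuts shown on this page, and the ambiguity rule (a trailing token is kept when a longer shortcut starts with it) is an assumption.

```python
import re

# Tiny stand-in for the shipped mapping table (assumption).
TABLE = {
    "af": "\u0251",         # ɑ
    "af1": "\u0251\u0300",  # ɑ̀
    "eu": "\u0259",         # ə
    "eu3": "\u0259\u0304",  # ə̄
    "e2": "\u00e9",         # é
    "n*": "\u014b",         # ŋ
}
MAX_LEN = max(len(key) for key in TABLE)


def convert_token(token: str) -> str:
    """Greedy longest-match conversion of one whitespace-free token."""
    out, i = [], 0
    while i < len(token):
        for size in range(min(MAX_LEN, len(token) - i), 0, -1):
            chunk = token[i:i + size]
            if chunk in TABLE:
                out.append(TABLE[chunk])
                i += size
                break
        else:  # nothing matched: keep the character as typed
            out.append(token[i])
            i += 1
    return "".join(out)


def is_extendable(token: str) -> bool:
    """True if a longer shortcut begins with this token."""
    return any(key != token and key.startswith(token) for key in TABLE)


def apply_sketch(text: str, preserve_ambiguous_trailing: bool = False) -> str:
    parts = re.split(r"(\s+)", text)  # the capture group keeps whitespace runs
    last = len(parts) - 1
    out = []
    for idx, part in enumerate(parts):
        if part == "" or part.isspace():
            out.append(part)
        elif preserve_ambiguous_trailing and idx == last and is_extendable(part):
            out.append(part)  # the user may still type a tone digit
        else:
            out.append(convert_token(part))
    return "".join(out)


def finalize_sketch(text: str) -> str:
    """Resolve everything, including a trailing ambiguous shortcut."""
    return apply_sketch(text, preserve_ambiguous_trailing=False)
```

In this sketch a trailing space counts as confirmation: once the text ends in whitespace, the last shortcut is no longer the final token and is converted.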

ClafricaEngine — advanced use

Instantiate the engine directly when you need a custom mapping or extra entries.

from nuficlean import ClafricaEngine

# Add project-specific shortcuts on top of the default table
engine = ClafricaEngine(extra={"nkap": "ŋkɑ̄p"})
engine.apply_mapping("nkap e2")   # → "ŋkɑ̄p é"
engine.finalize_input("eu3")      # → "ə̄"
engine.lookup("af1")              # → "ɑ̀"
engine.lookup("xyz")              # → None

# Fully custom table (replaces the default)
engine = ClafricaEngine(mapping={"a1": "à", "e1": "è"})
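
The difference between `extra` and `mapping` can be pictured as two ways of building the lookup table. The class below is a hypothetical stand-in, not ClafricaEngine itself: `EngineSketch` and `DEFAULT_TABLE` are invented names, and letting `extra` entries override defaults with the same key is an assumption.

```python
# Two-entry stand-in for the default table (assumption).
DEFAULT_TABLE = {"e2": "\u00e9", "n*": "\u014b"}  # é, ŋ


class EngineSketch:
    def __init__(self, mapping=None, extra=None):
        # `mapping` replaces the default table; `extra` is layered on top.
        base = DEFAULT_TABLE if mapping is None else mapping
        self.table = {**base, **(extra or {})}  # new dict, inputs untouched

    def lookup(self, shortcut):
        """Return the mapped string, or None for an unknown shortcut."""
        return self.table.get(shortcut)


with_extra = EngineSketch(extra={"nkap": "\u014bk\u0251\u0304p"})
custom_only = EngineSketch(mapping={"a1": "\u00e0"})
```

Here `with_extra` still knows the default shortcuts, while `custom_only` knows nothing but `a1`.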

Shortcut reference

Shortcut        Output      Notes
af              ɑ           open-a
eu              ə           schwa
ai              ɛ           open-e (epsilon)
o*              ɔ           open-o
uu              ʉ           u-bar
n*              ŋ           eng
N*              Ŋ           Eng (uppercase)
a1 a2 a3        à á ā       low / high / mid tone
af1 af2 af3     ɑ̀ ɑ́ ɑ̄       open-a tones
eu1 eu2 eu3     ə̀ ə́ ə̄       schwa tones
o*1 o*2 o*3     ɔ̀ ɔ́ ɔ̄       open-o tones

Tone digits: 1 = low (grave), 2 = high (acute), 3 = mid (macron), 5 = falling (circumflex), 7 = rising.
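
The examples on this page pair digit 1 with a grave accent, 2 with an acute, 3 with a macron, and 5 with a circumflex. A toned vowel can therefore be thought of as a base letter plus a combining mark, recomposed with NFC. The sketch below illustrates that idea only; `TONE_MARKS` and `add_tone` are hypothetical names, and digit 7 is left out because no example of its output appears here.

```python
import unicodedata

# Digit → combining mark, as seen in the examples on this page.
TONE_MARKS = {
    "1": "\u0300",  # combining grave accent
    "2": "\u0301",  # combining acute accent
    "3": "\u0304",  # combining macron
    "5": "\u0302",  # combining circumflex
}


def add_tone(base: str, digit: str) -> str:
    """Attach the tone mark for `digit` to `base` and recompose with NFC."""
    return unicodedata.normalize("NFC", base + TONE_MARKS[digit])
```

Plain Latin vowels recompose to a single code point (a + grave becomes à), whereas letters such as ɑ, ə, and ɔ have no precomposed toned forms and stay as base letter plus combining mark.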

Tip: The clafrica package on PyPI provides the same keyboard mapping as a standalone library if you don't need the Bana normalisation:

pip install clafrica


License

MIT
