Python library for Nufi (Fe'éfě'e) text: Clafrica keyboard mapping, Bana→Komako normalisation, low-tone stripping, and encoding repair

nuficlean

Python library of text utilities for Nufi (Fe'éfě'e / Babanki-Tungo):

  • Bana → Komako normalisation — converts Bana orthography to the standard Komako form, strips low-tone diacritics, and repairs encoding issues
  • Clafrica keyboard mapping — converts ASCII shortcut sequences (Clafrica input method) into the corresponding Nufi Unicode characters

Install

pip install nuficlean

Bana normalisation

clean(text)

Applies the full normalisation pipeline to a string.

from nuficlean import clean

clean("kòlə̀'")        # → "kwele'"
clean("nàh")           # → "lah"
clean("mɛ̀ɛ̀")         # → "maa"
clean("tōh mēndɑ̀'")  # → "tōh mēndɑ'"

clean_lines(lines)

Applies clean to every string in a list and returns the cleaned strings in the same order.

from nuficlean import clean_lines

clean_lines(["kòlə̀'", "nàh", "mɛ̀ɛ̀"])
# → ["kwele'", "lah", "maa"]

Pipeline

  1. Mojibake repair — fixes text whose UTF-8 bytes were mis-decoded as Latin-1 (for example Ã© where é was intended)
  2. Apostrophe / quote unification — maps ', `, ʼ, ", «, » → ASCII
  3. Bana → Komako rewrite — longest-match substitution (kòlə̀' → kwele', ɛ̀ → a, …)
  4. Low-tone stripping — removes grave-accent tone marks (à → a, ɑ̀ → ɑ, …)
  5. NFC recomposition
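
The five stages above can be sketched in plain Python. This is only an illustration of the technique, not the library's implementation: clean_sketch, repair_mojibake, REWRITES and QUOTES are hypothetical names, the rewrite table holds just the two rules visible in the examples in this README, and the ASCII targets chosen for each quote character are an assumption. The sketch also decomposes to NFD before matching so that precomposed and decomposed input behave the same.

```python
import re
import unicodedata

GRAVE = "\u0300"  # combining grave accent, the low-tone mark

# Hypothetical rewrite table: only the two rules visible in this README,
# stored in NFD so base letters and combining marks are separate.
REWRITES = {
    "ko\u0300l\u0259\u0300'": "kwele'",  # kòlə̀' -> kwele'
    "\u025b\u0300": "a",                 # ɛ̀ -> a
}

# Assumed targets: single-quote lookalikes -> ', double-quote lookalikes -> ".
QUOTES = str.maketrans({
    "`": "'", "\u02bc": "'", "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"', "\u00ab": '"', "\u00bb": '"',
})

# Longest keys first, so the ordered regex alternation prefers the longest match.
_PATTERN = re.compile(
    "|".join(re.escape(k) for k in sorted(REWRITES, key=len, reverse=True))
)


def repair_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were decoded as Latin-1; otherwise return text unchanged."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except UnicodeError:
        return text


def clean_sketch(text: str) -> str:
    text = repair_mojibake(text)                               # 1. mojibake repair
    text = text.translate(QUOTES)                              # 2. quote unification
    text = unicodedata.normalize("NFD", text)                  # decompose before matching
    text = _PATTERN.sub(lambda m: REWRITES[m.group()], text)   # 3. longest-match rewrite
    text = text.replace(GRAVE, "")                             # 4. low-tone stripping
    return unicodedata.normalize("NFC", text)                  # 5. NFC recomposition


print(clean_sketch("k\u00f2l\u0259\u0300'"))  # kwele'
print(clean_sketch("m\u025b\u0300\u025b\u0300"))  # maa
```

Because the grave accent is stripped only after the rewrite step, rules such as ɛ̀ → a can still see the low-tone mark they depend on.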

CLI

nuficlean "kòlə̀'"
echo "mɛ̀ɛ̀" | nuficlean

Clafrica keyboard mapping

The Clafrica input method uses ASCII shortcut sequences to type Nufi characters. nuficlean ships the canonical mapping table and exposes it through two functions and a class.

apply_clafrica(text)

Converts all Clafrica shortcuts in text to Unicode, preserving whitespace.

from nuficlean import apply_clafrica

apply_clafrica("af1 e2 n*")   # → "ɑ̀ é ŋ"
apply_clafrica("eu3 af5")     # → "ə̄ ɑ̂"
apply_clafrica("uu1 o*2")     # → "ʉ̀ ɔ́"
apply_clafrica("N* O*")       # → "Ŋ Ɔ"
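
One way to picture this conversion is a greedy longest-match scan over the input. The sketch below assumes that behaviour (the README does not state how apply_clafrica tokenises its input) and uses a hypothetical SAMPLE_TABLE containing only shortcuts that appear in the examples above; apply_sketch is likewise an illustrative name, not part of nuficlean.

```python
# Hypothetical excerpt of the mapping, taken from the examples in this README.
SAMPLE_TABLE = {
    "af": "\u0251",          # ɑ
    "af1": "\u0251\u0300",   # ɑ̀
    "e2": "\u00e9",          # é
    "n*": "\u014b",          # ŋ
    "N*": "\u014a",          # Ŋ
    "O*": "\u0186",          # Ɔ
}


def apply_sketch(text: str, table: dict = SAMPLE_TABLE) -> str:
    """Replace shortcuts by scanning left to right, longest candidate first."""
    max_len = max(map(len, table))
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible shortcut starting at position i.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            # No shortcut starts here: copy the character (including whitespace).
            out.append(text[i])
            i += 1
    return "".join(out)


print(apply_sketch("af1 e2 n*"))  # ɑ̀ é ŋ
print(apply_sketch("N* O*"))      # Ŋ Ɔ
```

Trying the longest candidate first is what lets af1 win over its prefix af, and copying unmatched characters through unchanged is what preserves whitespace.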

Live-typing mode — pass preserve_ambiguous_trailing=True to leave the last token untouched while the user may still extend it:

apply_clafrica("af", preserve_ambiguous_trailing=True)  # → "af"  (could become af1, af2…)
apply_clafrica("af1")                                   # → "ɑ̀"
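
A trailing token can be treated as ambiguous when it is a proper prefix of some longer shortcut. The sketch below illustrates that idea under stated assumptions: is_ambiguous, apply_live and TABLE are hypothetical names, the table is a small excerpt, and for brevity each whitespace-separated token is looked up as a whole shortcut, which may differ from how nuficlean actually scans its input.

```python
import re

# Hypothetical excerpt of the mapping, for illustration only.
TABLE = {
    "af": "\u0251",          # ɑ
    "af1": "\u0251\u0300",   # ɑ̀
    "af2": "\u0251\u0301",   # ɑ́
    "e2": "\u00e9",          # é
}


def is_ambiguous(token: str, table: dict = TABLE) -> bool:
    """True if the token could still grow into a longer shortcut."""
    return any(key != token and key.startswith(token) for key in table)


def apply_live(text: str, table: dict = TABLE) -> str:
    """Convert tokens, but leave an ambiguous final token as typed."""
    pieces = re.split(r"(\s+)", text)  # keep the whitespace runs as pieces
    last = len(pieces) - 1
    out = []
    for i, piece in enumerate(pieces):
        if i == last and piece and is_ambiguous(piece, table):
            out.append(piece)                    # user may still extend it
        else:
            out.append(table.get(piece, piece))  # convert, or pass through
    return "".join(out)


print(apply_live("e2 af"))   # é af
print(apply_live("e2 af1"))  # é ɑ̀
```

Once the user types a separator after the token, it is no longer the last piece and is converted normally.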

finalize_clafrica(text)

Like apply_clafrica but also resolves any trailing ambiguous shortcut — use this when the user confirms input (e.g. presses Enter or Space).

from nuficlean import finalize_clafrica

finalize_clafrica("eu3")   # → "ə̄"
finalize_clafrica("af1")   # → "ɑ̀"
finalize_clafrica("n*")    # → "ŋ"

ClafricaEngine — advanced use

Instantiate the engine directly when you need a custom mapping or extra entries.

from nuficlean import ClafricaEngine

# Add project-specific shortcuts on top of the default table
engine = ClafricaEngine(extra={"nkap": "ŋkɑ̄p"})
engine.apply_mapping("nkap e2")   # → "ŋkɑ̄p é"
engine.finalize_input("eu3")      # → "ə̄"
engine.lookup("af1")              # → "ɑ̀"
engine.lookup("xyz")              # → None

# Fully custom table (replaces the default)
engine = ClafricaEngine(mapping={"a1": "à", "e1": "è"})

Shortcut reference

Shortcut       Output     Notes
af             ɑ          open-a
eu             ə          schwa
ai             ε          epsilon
o*             ɔ          open-o
uu             ʉ          u-bar
n*             ŋ          eng
N*             Ŋ          Eng (uppercase)
a1 a2 a3       à á ā      low / mid / high tone
af1 af2 af3    ɑ̀ ɑ́ ɑ̄      open-a tones
eu1 eu2 eu3    ə̀ ə́ ə̄      schwa tones
o*1 o*2 o*3    ɔ̀ ɔ́ ɔ̄      open-o tones

Tone digits: 1 = low, 2 = mid, 3 = high, 5 = rising, 7 = falling.
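
The tone rows of the table follow a regular pattern: a base shortcut plus a tone digit yields the base character plus a combining mark. The sketch below rebuilds such rows from that pattern. It is an illustration only; BASES, TONE_MARKS and build_tone_table are hypothetical names, and the digit-to-mark pairs are inferred from the examples in this README (1, 2, 3 from the table, 5 from the af5 example). Digit 7 is left out because no output for it is shown here.

```python
import unicodedata

# Base shortcuts and their characters, as listed in the reference table.
BASES = {"a": "a", "af": "\u0251", "eu": "\u0259", "o*": "\u0254"}

# Combining marks inferred from the README examples (an assumption):
# a1 -> à (grave), a2 -> á (acute), a3 -> ā (macron), af5 -> ɑ̂ (circumflex).
TONE_MARKS = {"1": "\u0300", "2": "\u0301", "3": "\u0304", "5": "\u0302"}


def build_tone_table(bases: dict, marks: dict) -> dict:
    """Combine every base shortcut with every tone digit, NFC-normalised."""
    return {
        shortcut + digit: unicodedata.normalize("NFC", char + mark)
        for shortcut, char in bases.items()
        for digit, mark in marks.items()
    }


table = build_tone_table(BASES, TONE_MARKS)
print(table["a1"], table["af1"], table["eu3"], table["o*2"])  # à ɑ̀ ə̄ ɔ́
```

NFC normalisation yields a single precomposed code point where Unicode defines one (à, á, ā) and keeps the base letter plus combining mark otherwise (ɑ̀, ə̄, ɔ́).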

Tip: The clafrica package on PyPI provides the same keyboard mapping as a standalone library if you don't need the Bana normalisation.

pip install clafrica


License

MIT
