Arabic shaping, BiDi, and un-baking for games, TTS, and real-time clients.
arabic-rt
🤗 Try the live demo · 📦 PyPI · 🎮 .NET / Unity version
Most Arabic libraries can turn logical Arabic into correctly shaped, right-to-left text. arabic-rt does that too — but it also does the part almost nothing else does: it can reverse the process, turning baked presentation-form text back into clean logical Arabic. That round-trip is what makes Arabic work in places it normally breaks: multiplayer game chat, naive text renderers, and text-to-speech.
- 🔁 Bake and un-bake. fix() → renders correctly even on clients that do zero Arabic processing. unfix() → recovers logical Arabic for TTS, search, or logging.
- 🎮 Built for real-time clients. A GAME preset handles word-by-word chat readers (joins words so they aren't split, keeps the first words on top when wrapping).
- 🧩 Zero dependencies. Pure Python. Drop it in anywhere.
- ✅ Validated. Forward output matches arabic_reshaper + python-bidi byte-for-byte; unfix(fix(x)) == x is covered by tests. The C# port produces byte-identical output, so text baked in Unity reads back in Python and vice-versa.
Pure shaping/BiDi is well served by existing tools.
arabic-rt's reason to exist is the real-time / game niche and the un-baking capability built for it.
Try it
A live, no-install demo — type Arabic and watch it shaped, baked, and un-baked in real time: https://huggingface.co/spaces/balswyan/arabic-rt
Install
pip install arabic-rt
Quick start
import arabic_rt as ar
baked = ar.fix("مرحبا بالعالم") # visual-order presentation forms (renders anywhere)
ar.unfix(baked) # -> "مرحبا بالعالم" (back to logical, for TTS/search)
ar.shape("سلم") # -> "ﺳﻠﻢ" (contextual shaping only, no reorder)
ar.contains_arabic("hi مرحبا") # True
ar.is_shaped(baked) # True
Game chat (word-by-word readers)
ar.fix("مرحبا بالعالم", ar.GAME) # words joined so a naive reader shows the whole phrase
Tune it yourself
from arabic_rt import Options, fix
opts = Options(
combine_allah=True, # collapse الله -> ﷲ
reverse_word_order=True, # full RTL line (False = shape per word, keep typed order)
word_joiner="\u00A0", # separator for naive word-by-word readers
prevent_word_split=True,
max_line_chars=18, # wrap long lines ourselves (first words on top, each line RTL)
)
fix("نص عربي طويل", opts)
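The wrapping options above can be pictured with a small standalone sketch. It is not arabic-rt's code: `wrap_rtl_sketch` is a hypothetical helper, and it assumes that wrapping is greedy by character count, that each line's words are then put in reverse (right-to-left) order, and that they are glued with the joiner so a word-by-word reader cannot split the line. Shaping and per-word glyph reversal are deliberately left out.

```python
def wrap_rtl_sketch(text: str, max_line_chars: int = 18,
                    joiner: str = "\u00A0") -> list[str]:
    """Greedy wrap, then reverse word order inside each line (assumed behaviour)."""
    lines: list[list[str]] = []
    current: list[str] = []
    length = 0
    for word in text.split():
        # Cost of adding this word: its length plus one joiner if the line is non-empty.
        extra = len(word) + (1 if current else 0)
        if current and length + extra > max_line_chars:
            lines.append(current)          # close the full line
            current, length = [word], len(word)
        else:
            current.append(word)
            length += extra
    if current:
        lines.append(current)
    # Lines keep typed order top to bottom ("first words on top");
    # words within a line are reversed so the line reads right to left.
    return [joiner.join(reversed(line)) for line in lines]
```

With `max_line_chars=5` and `joiner="_"`, the input `"aa bb cc dd"` yields two lines: the first holds the first two words in reversed order, the second holds the last two.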
Why "un-baking" matters
To make Arabic show up correctly on a client that does no shaping, you "bake" it into final presentation glyphs in visual (reversed) order. The catch: once baked, the text is no longer real Arabic letters — so a text-to-speech engine reads gibberish, and search/logging break. unfix() reverses the bake (presentation forms → base letters, ligatures expanded, order restored) so the display can stay baked while the voice and data see clean Arabic.
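As a rough illustration of that reversal (not arabic-rt's actual implementation), Python's standard `unicodedata` module can already map presentation forms back to base letters, because Unicode assigns those code points compatibility decompositions. The sketch below assumes a single line of pure Arabic with no digits, Latin text, diacritics, or wrapping, so that restoring order is a plain string reversal; `unbake_sketch` is a hypothetical name.

```python
import unicodedata


def unbake_sketch(visual: str) -> str:
    """Presentation forms in visual order -> base letters in logical order.

    Assumes one line of pure Arabic letters; mixed-direction text would need
    real BiDi handling rather than a plain reversal.
    """
    # Step 1: restore logical order of the glyphs.
    logical_glyphs = visual[::-1]
    # Step 2: NFKC replaces each presentation form with its base letter(s).
    # Reversing first matters: a ligature such as U+FEFB expands to
    # lam + alef, which is already in logical order.
    return unicodedata.normalize("NFKC", logical_glyphs)


# "سلم" baked: final meem, medial lam, initial seen (visual order).
baked = "\uFEE2\uFEE0\uFEB3"
assert unbake_sketch(baked) == "\u0633\u0644\u0645"
```

A real un-baker also has to deal with numbers, Latin runs, word joiners, and wrapped lines, which this sketch ignores.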
API
| Function | Purpose |
|---|---|
| fix(text, opts=None, **overrides) | Logical Arabic → baked visual presentation forms. No-op on non-Arabic or already-shaped text. |
| unfix(text) | Baked Arabic → logical Arabic. No-op on text that isn't baked. |
| shape(text, *, combine_allah=False) | Contextual shaping only; order preserved. |
| contains_arabic(text) / is_shaped(text) | Fast checks. |
| Options / GAME | Config dataclass and a ready preset for game chat. |
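For intuition about what the "fast checks" might look at, here is a minimal sketch based only on Unicode block ranges. The names `looks_arabic` and `looks_shaped` are hypothetical, and the exact ranges arabic-rt tests are an assumption; the library may be stricter or cover more blocks.

```python
def _is_presentation_form(ch: str) -> bool:
    cp = ord(ch)
    # Arabic Presentation Forms-A (U+FB50..U+FDFF) and the assigned letters of
    # Forms-B (U+FE70..U+FEFC; U+FEFF is the byte-order mark, so it is excluded).
    return 0xFB50 <= cp <= 0xFDFF or 0xFE70 <= cp <= 0xFEFC


def looks_arabic(text: str) -> bool:
    """True if any character is a base Arabic letter or a presentation form."""
    return any(0x0600 <= ord(ch) <= 0x06FF or _is_presentation_form(ch)
               for ch in text)


def looks_shaped(text: str) -> bool:
    """True if any character is already a presentation form (i.e. baked/shaped)."""
    return any(_is_presentation_form(ch) for ch in text)
```

Checks like these are what make a no-op on already-shaped or non-Arabic input cheap: one pass over the string, no shaping tables needed.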
Also available for .NET & Unity
The same engine, ported to C# with byte-for-byte identical output, targeting netstandard2.0/2.1 (Unity-compatible):
github.com/balswyan/arabic-rt-dotnet · dotnet add package ArabicRt
A note on display fonts
arabic-rt produces correct text; how it looks is your font's job. For rendering shaped Arabic (e.g. in the demo or a UI), a quality Naskh face such as Noto Naskh Arabic or Amiri (both SIL OFL) looks far better than a generic system font.
Validation
Run the suite (installs the reference libraries as dev extras):
pip install -e ".[dev]"
pytest -q
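If you want to spot-check the round-trip property on your own strings, a small harness along these lines may help. It is a sketch, not part of the project's test suite: `failing_round_trips` is a hypothetical helper that takes any pair of forward and reverse callables, so it runs without arabic-rt installed.

```python
from typing import Callable, Iterable, List


def failing_round_trips(fix: Callable[[str], str],
                        unfix: Callable[[str], str],
                        samples: Iterable[str]) -> List[str]:
    """Return every sample for which unfix(fix(s)) does not give s back."""
    return [s for s in samples if unfix(fix(s)) != s]


# Stand-in pair for demonstration: reversing twice is a perfect round trip.
def reverse(s: str) -> str:
    return s[::-1]


assert failing_round_trips(reverse, reverse, ["abc", "", "x y"]) == []
```

With the package installed, the same helper can be called as `failing_round_trips(ar.fix, ar.unfix, my_strings)` after `import arabic_rt as ar`; an empty list means every sample survived the bake and un-bake.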
License & author
Licensed under the Mozilla Public License 2.0 (MPL-2.0) — see LICENSE. Use it freely, including in closed-source games and apps; modifications to arabic-rt's own files stay open.
Created by Bandar AlSwyan.
Arabic: quick overview
arabic-rt is a library for processing Arabic text: it shapes letters (joining them in their correct contextual forms), orders them right-to-left, and, most importantly, can reverse the process, converting "baked" text (reversed presentation forms) back into clean logical Arabic.
This "un-baking" ability (unfix) is what makes Arabic work in places where it usually breaks: multiplayer game chat, engines that do not process Arabic, and text-to-speech (TTS) systems. The text displays correctly for everyone, while the speech or search engine reads a clean logical copy.
🤗 Try the live demo: huggingface.co/spaces/balswyan/arabic-rt
- fix(): logical Arabic → ready-made presentation forms that display correctly on any client, even one that does no Arabic processing.
- unfix(): reverses the process to recover logical Arabic (for speech, search, and logs).
- GAME: a ready-made preset for game chat that reads words one at a time.
- Zero dependencies, and validated against arabic_reshaper and python-bidi character for character.
Also available for .NET and Unity: arabic-rt-dotnet. Licensed under MPL-2.0. Created by Bandar AlSwyan.