Quran Phonetic Script with additional Quranic utils
Quran Muaalem
📖 Link to try the Quran Muaalem
Click the link to try it:
Link ⚠️ Warning: this link expires on 27 August 2025
Features
- Trained on the Quran phonetic script: quran-transcript, which can detect errors in letters, tajweed, and letter attributes (sifat)
- A reasonably sized model: 660 MP
- Needs only 1.5 GB of GPU memory
- Novel architecture: multi-level CTC
Architecture
A novel architecture: multi-level CTC, where each level is trained on a particular aspect of the recitation.
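The README does not spell out how the levels are wired together. As a rough sketch, assuming each level is a separate CTC output with its own vocabulary (one for phonemes, one per sifa) over the same audio frames, greedy decoding could run independently per level. The vocabularies, the ids, and the blank id 0 below are made up for illustration and are not taken from the model.

```python
from itertools import groupby

BLANK = 0  # assumed id of the CTC blank token at every level


def ctc_greedy_decode(frame_ids, id_to_label, blank=BLANK):
    """Collapse repeated frame predictions, then drop blanks (CTC greedy decoding)."""
    collapsed = [idx for idx, _ in groupby(frame_ids)]
    return [id_to_label[idx] for idx in collapsed if idx != blank]


def decode_levels(frames_per_level, vocab_per_level):
    """Decode every level independently; each level has its own vocabulary."""
    return {
        level: ctc_greedy_decode(frames, vocab_per_level[level])
        for level, frames in frames_per_level.items()
    }


# Hypothetical vocabularies: one for phonemes, one for a single sifa.
vocabs = {
    "phonemes": {1: "b", 2: "a"},
    "hams_or_jahr": {1: "hams", 2: "jahr"},
}
# Hypothetical per-frame argmax ids, as two CTC heads might produce them.
frames = {
    "phonemes": [0, 1, 1, 0, 2, 2, 2, 0],
    "hams_or_jahr": [0, 2, 2, 0, 0, 2, 0, 0],
}
decoded = decode_levels(frames, vocabs)
print(decoded)
```

Each level yields one label per phoneme group here, which matches the shape of the output described in the API docs (a phoneme sequence plus per-phoneme sifat).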
Brief development steps
- Collecting Quranic recitations from expert reciters: prepare-quran-dataset
- Splitting the recitations by pause (waqf) rather than by verse (aya), using the segmenter
- Obtaining the Quranic text from the audio segments using the Tarteel model
- Correcting the text extracted by Tarteel using the tasmeea algorithm
- Converting Imlaey script to Uthmani script: quran-transcript
- Converting Uthmani script to the Quran phonetic script, which describes all tajweed rules except ishmam: quran-transcript
- Training the model on the Wav2Vec2BERT architecture
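The tasmeea algorithm itself is not described in this README. Purely to illustrate the idea behind the correction step (matching noisy ASR output against a known reference text), here is a minimal sketch that uses plain fuzzy string matching from Python's `difflib`. It is a stand-in technique, not the project's algorithm; `best_matching_span` and the `slack` parameter are hypothetical names.

```python
from difflib import SequenceMatcher


def best_matching_span(transcript_words, reference_words, slack=2):
    """Return (start, end, score) of the reference window most similar to the transcript."""
    target = " ".join(transcript_words)
    n = len(transcript_words)
    best = (0, 0, 0.0)
    for start in range(len(reference_words)):
        # Try windows slightly shorter or longer than the transcript.
        for length in range(max(1, n - slack), n + slack + 1):
            end = start + length
            if end > len(reference_words):
                break
            candidate = " ".join(reference_words[start:end])
            score = SequenceMatcher(None, target, candidate).ratio()
            if score > best[2]:
                best = (start, end, score)
    return best


# English placeholder text keeps the example easy to read.
reference = "in the name of god the most gracious the most merciful".split()
noisy = "the nam of god".split()  # simulated ASR output with a spelling error
start, end, score = best_matching_span(noisy, reference)
print(reference[start:end], round(score, 2))
```

Once the best window is found, the reference words in that window can replace the noisy transcript.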
Using the model
Using the model through the gradio interface
Install uv
pip install uv
or
curl -LsSf https://astral.sh/uv/install.sh | sh
Then install ffmpeg
sudo apt-get update
sudo apt-get install -y ffmpeg
or via anaconda
conda install ffmpeg
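To confirm both tools ended up on your PATH before launching the UI, a quick check along these lines may help. This is a small convenience sketch using only the standard library; `missing_tools` is a hypothetical helper, not part of the package.

```python
import shutil


def missing_tools(tools=("uvx", "ffmpeg")):
    """Return the names of required command-line tools not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("uvx and ffmpeg are both available.")
```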
Run gradio with a single command:
uvx --no-cache --from https://github.com/obadx/quran-muaalem.git[ui] quran-muaalem-ui
or
uvx quran-muaalem[ui] quran-muaalem-ui
Through the Python API
Installation
First, install the required dependencies:
# Install system dependencies
sudo apt-get install -y ffmpeg libsndfile1 portaudio19-dev
# Install Python packages
pip install quran-muaalem librosa "numba>=0.61.2"
Basic Usage Example
"""
Basic example of using the Quran Muaalem package for phonetic analysis of Quranic recitation.
"""

from dataclasses import asdict
import json
import logging

from quran_transcript import Aya, quran_phonetizer, MoshafAttributes
import torch
from librosa.core import load

# Import the main Muaalem class (adjust import based on your actual package structure)
from quran_muaalem import Muaalem

# Assumed location of explain_for_terminal; adjust to wherever your installed version defines it
from quran_muaalem.explain import explain_for_terminal

# Setup logging to see informative messages
logging.basicConfig(level=logging.INFO)


def analyze_recitation(audio_path):
    """
    Analyze a Quranic recitation audio file using the Muaalem model.

    Args:
        audio_path (str): Path to the audio file to analyze
    """
    # Configuration
    sampling_rate = 16000  # Must be 16000 Hz
    device = "cuda" if torch.cuda.is_available() else "cpu"  # Use GPU if available

    # Step 1: Prepare the Quranic reference text
    # Get the Uthmani script for a span of words (9 Imlaey words starting at
    # word index 17 of Surah 8, Aya 75 in this example)
    uthmani_ref = Aya(8, 75).get_by_imlaey_words(17, 9).uthmani

    # Step 2: Configure the recitation style (Moshaf attributes)
    moshaf = MoshafAttributes(
        rewaya="hafs",  # Recitation style (Hafs is most common)
        madd_monfasel_len=2,  # Length of separated elongation
        madd_mottasel_len=4,  # Length of connected elongation
        madd_mottasel_waqf=4,  # Length of connected elongation when stopping
        madd_aared_len=2,  # Length of the elongation caused by stopping (madd aared)
    )
    # see: https://github.com/obadx/prepare-quran-dataset?tab=readme-ov-file#moshaf-attributes-docs

    # Step 3: Convert text to phonetic representation
    # see docs for the phonetizer: https://github.com/obadx/quran-transcript
    phonetizer_out = quran_phonetizer(uthmani_ref, moshaf, remove_spaces=True)

    # Step 4: Initialize the Muaalem model
    muaalem = Muaalem(device=device)

    # Step 5: Load and prepare the audio (resampled to 16 kHz, mono)
    wave, _ = load(audio_path, sr=sampling_rate, mono=True)

    # Step 6: Process the audio with the model
    # The model analyzes the phonetic properties of the recitation
    outs = muaalem(
        [wave],  # Audio data
        [phonetizer_out],  # Phonetic reference
        sampling_rate=sampling_rate,
    )

    # Step 7: Display the results
    for out in outs:
        print("Predicted Phonemes:", out.phonemes.text)
        # Display detailed phonetic features for each phoneme
        for sifa in out.sifat:
            print(json.dumps(asdict(sifa), indent=2, ensure_ascii=False))
            print("*" * 30)
        print("-" * 40)

    # Explain the results: predicted phonemes and sifat next to the reference
    explain_for_terminal(
        outs[0].phonemes.text,
        phonetizer_out.phonemes,
        outs[0].sifat,
        phonetizer_out.sifat,
    )


if __name__ == "__main__":
    # Replace with the path to your audio file
    audio_path = "./assets/test.wav"
    try:
        analyze_recitation(audio_path)
    except Exception as e:
        logging.error(f"Error processing audio: {e}")
Output:
ءِننننَللَااهَبِكُللِشَيءِنعَلِۦۦمُ۾۾۾بَرَااااءَتُممممِنَللَااهِوَرَسُۥۥلِه
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Phonemes ┃ Tafashie ┃ Qalqla ┃ Ghonna ┃ Hams Or Jahr ┃ Safeer ┃ Tikraar ┃ Tafkheem Or Taqeeq ┃ Istitala ┃ Shidda Or Rakhawa ┃ Itbaq ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ ءِ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ shadeed │ monfateh │
│ ننننَ │ not_motafashie │ not_moqalqal │ maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ between │ monfateh │
│ للَ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ mofakham │ not_mostateel │ between │ monfateh │
│ اا │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ mofakham │ not_mostateel │ rikhw │ monfateh │
│ هَ │ not_motafashie │ not_moqalqal │ not_maghnoon │ hams │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ rikhw │ monfateh │
│ بِ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ shadeed │ monfateh │
│ كُ │ not_motafashie │ not_moqalqal │ not_maghnoon │ hams │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ shadeed │ monfateh │
│ للِ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ between │ monfateh │
│ شَ │ motafashie │ not_moqalqal │ not_maghnoon │ hams │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ rikhw │ monfateh │
│ ي │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ rikhw │ monfateh │
│ ءِ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ shadeed │ monfateh │
│ ن │ not_motafashie │ not_moqalqal │ maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ between │ monfateh │
│ عَ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ between │ monfateh │
│ لِ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ between │ monfateh │
│ ۦۦ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ rikhw │ monfateh │
│ مُ │ not_motafashie │ not_moqalqal │ maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ between │ monfateh │
│ ۾۾۾ │ not_motafashie │ not_moqalqal │ maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ rikhw │ monfateh │
│ بَ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ shadeed │ monfateh │
│ رَ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ mokarar │ mofakham │ not_mostateel │ between │ monfateh │
│ اااا │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ mofakham │ not_mostateel │ rikhw │ monfateh │
│ ءَ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ shadeed │ monfateh │
│ تُ │ not_motafashie │ not_moqalqal │ not_maghnoon │ hams │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ shadeed │ monfateh │
│ ممممِ │ not_motafashie │ not_moqalqal │ maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ between │ monfateh │
│ نَ │ not_motafashie │ not_moqalqal │ maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ between │ monfateh │
│ للَ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ mofakham │ not_mostateel │ between │ monfateh │
│ اا │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ mofakham │ not_mostateel │ rikhw │ monfateh │
│ هِ │ not_motafashie │ not_moqalqal │ not_maghnoon │ hams │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ rikhw │ monfateh │
│ وَ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ rikhw │ monfateh │
│ رَ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ mokarar │ mofakham │ not_mostateel │ between │ monfateh │
│ سُ │ not_motafashie │ not_moqalqal │ not_maghnoon │ hams │ safeer │ not_mokarar │ moraqaq │ not_mostateel │ rikhw │ monfateh │
│ ۥۥ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ rikhw │ monfateh │
│ لِ │ not_motafashie │ not_moqalqal │ not_maghnoon │ jahr │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ between │ monfateh │
│ ه │ not_motafashie │ not_moqalqal │ not_maghnoon │ hams │ no_safeer │ not_mokarar │ moraqaq │ not_mostateel │ rikhw │ monfateh │
└──────────┴────────────────┴──────────────┴──────────────┴──────────────┴───────────┴─────────────┴────────────────────┴───────────────┴───────────────────┴──────────┘
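Repeated letters in the predicted text appear to encode length: the four alifs in the output above line up with `madd_mottasel_len=4` in the example, and the doubled lam looks like a shadda. Under that reading, which is inferred from the sample output rather than stated in the docs, a short sketch like the following can list the runs and their lengths. `run_lengths` and `long_runs` are hypothetical helpers.

```python
from itertools import groupby


def run_lengths(phonetic_text):
    """Group consecutive identical characters and count each run."""
    return [(char, len(list(run))) for char, run in groupby(phonetic_text)]


def long_runs(phonetic_text, min_len=2):
    """Keep only runs of at least min_len characters (candidate madd, ghunna, or shadda spans)."""
    return [(c, n) for c, n in run_lengths(phonetic_text) if n >= min_len]


# Fragment built to mirror part of the predicted text above:
# one ba, four alifs, one hamza, four meems.
sample = "ب" + "ا" * 4 + "ء" + "م" * 4
for char, count in long_runs(sample):
    print(char, count)
```

Comparing these run lengths between the predicted text and the reference phonetic script is one simple way to spot a madd or ghunna that was held too short or too long.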
API Docs
class Muaalem:
    def __init__(
        self,
        model_name_or_path: str = "obadx/muaalem-model-v3_2",
        device: str = "cpu",
        dtype=torch.bfloat16,
    ):
        """
        Initialize the Muaalem model.

        Args:
            model_name_or_path: the Hugging Face model name or a local path
            device: the device to run the model on
            dtype: the torch dtype. Defaults to `torch.bfloat16`, the precision the
                model was trained in
        """

    @torch.no_grad()
    def __call__(
        self,
        waves: list[list[float] | torch.FloatTensor | NDArray],
        ref_quran_phonetic_script_list: list[QuranPhoneticScriptOutput],
        sampling_rate: int,
    ) -> list[MuaalemOutput]:
        """Inference function for the Quran Muaalem project.

        Args:
            waves: batch of input waveforms, each of length seq_len, in any of the
                formats listed in the type annotation
            ref_quran_phonetic_script_list (list[QuranPhoneticScriptOutput]): list of the
                phonetized output of `quran_transcript.quran_phonetizer` with `remove_spaces=True`
            sampling_rate (int): has to be 16000

        Returns:
            list[MuaalemOutput]:
                A list of output objects, each containing phoneme predictions and their
                phonetic features (sifat) for a processed input.

            Each MuaalemOutput contains:
                phonemes (Unit):
                    A dataclass representing the predicted phoneme sequence with:
                        text (str): Concatenated string of all phonemes.
                        probs (Union[torch.FloatTensor, list[float]]):
                            Confidence probabilities for each predicted phoneme.
                        ids (Union[torch.LongTensor, list[int]]):
                            Token IDs corresponding to each phoneme.

                sifat (list[Sifa]):
                    A list of phonetic feature dataclasses (one per phoneme) with the
                    following optional properties (each is a SingleUnit or None):
                        - phonemes_group (str): the phonemes associated with the `sifa`
                        - hams_or_jahr (SingleUnit): either `hams` or `jahr`
                        - shidda_or_rakhawa (SingleUnit): either `shadeed`, `between`, or `rikhw`
                        - tafkheem_or_taqeeq (SingleUnit): either `mofakham`, `moraqaq`, or `low_mofakham`
                        - itbaq (SingleUnit): either `monfateh` or `motbaq`
                        - safeer (SingleUnit): either `safeer` or `no_safeer`
                        - qalqla (SingleUnit): either `moqalqal` or `not_moqalqal`
                        - tikraar (SingleUnit): either `mokarar` or `not_mokarar`
                        - tafashie (SingleUnit): either `motafashie` or `not_motafashie`
                        - istitala (SingleUnit): either `mostateel` or `not_mostateel`
                        - ghonna (SingleUnit): either `maghnoon` or `not_maghnoon`

            Each SingleUnit in Sifa properties contains:
                text (str): The feature's categorical label (e.g., "hams", "shadeed").
                prob (float): Confidence probability for this feature.
                idx (int): Identifier for the feature class.
        """
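Since every `SingleUnit` carries a `prob`, one natural use of the output is to flag predictions the model is unsure about. The sketch below uses small stand-in dataclasses that mirror only the fields documented above, so it runs without the package installed. The `low_confidence` helper, the 0.9 threshold, and the sample values are all made up for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SingleUnit:  # stand-in mirroring the documented fields
    text: str
    prob: float
    idx: int


@dataclass
class Sifa:  # stand-in with only two of the documented properties
    phonemes_group: str
    hams_or_jahr: Optional[SingleUnit] = None
    qalqla: Optional[SingleUnit] = None


def low_confidence(sifat, threshold=0.9):
    """Yield (phonemes_group, property, label, prob) for predictions below threshold."""
    for sifa in sifat:
        for f in fields(sifa):
            unit = getattr(sifa, f.name)
            # Skip phonemes_group (a str) and properties that are None.
            if isinstance(unit, SingleUnit) and unit.prob < threshold:
                yield sifa.phonemes_group, f.name, unit.text, unit.prob


# Hypothetical predictions; probabilities and idx values are invented.
predicted = [
    Sifa("ب", SingleUnit("jahr", 0.99, 1), SingleUnit("moqalqal", 0.62, 1)),
    Sifa("ك", SingleUnit("hams", 0.97, 0), None),
]
for row in low_confidence(predicted):
    print(row)
```

With the real package, the same loop should work on `out.sifat` as long as the actual `Sifa` objects are dataclasses with these fields, which the `asdict(sifa)` call in the usage example suggests.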