Asmai: (Al'asma'i) Arabic semantic analysis library for Python
Al-Asma'i semantic library (مكتبة الأصمعي الدلالية)
Developer: Taha Zerrouki, http://tahadz.com, taha dot zerrouki at gmail dot com
| Features | Value |
|---|---|
| Authors | Authors.md |
| Release | 0.1 |
| License | GPL |
| Tracker | linuxscout/asmai/Issues |
| Source | GitHub |
| Feedback | Comments |
| Accounts | [@Twitter](https://twitter.com/linuxscout) |
Description
Asmai: (Al’asma’i) Arabic semantic analysis library for Python
Features:
- Extracts word pairs that carry a semantic relation of one of these types: subject (فاعلية), object (مفعولية), or genitive construct (إضافة)
install
pip install asmai
Usage
import
import asmai.anasem as asm
Test
    import asmai.anasem as asm

    text = u"يعبد الله منذ أن تطلع الشمس"
    anasem = asm.SemanticAnalyzer()
    result = anasem.analyze_text(text)
    # the result contains objects
    anasem.pprint(result)
Extract semantic relations, displaying only the relations found

    >>> import pprint
    >>> sem_result = anasem.display_sem(result)
    >>> pprint.pprint(sem_result)
    [[['الشَّمْسُ', 'تَطْلُعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
      ['الشَّمْسُ', 'تَطْلُعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
      ['الشَّمْسُ', 'تَطْلُعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
      ['الشَّمْسُ', 'تَطْلَعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
      ['الشَّمْسُ', 'تَطْلَعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
      ['الشَّمْسُ', 'تَطْلَعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject']]]
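Each inner list in the output above has five fields. Reading them off the sample (this layout is an assumption, not something the library documents here), they appear to be two vocalized word forms, their two lemmas, and the relation name. Under that assumption, the vocalization variants can be collapsed to one entry per lemma pair. A minimal sketch on data copied from the sample output; `unique_relations` is a hypothetical helper, not part of asmai:

```python
# Sample copied from the display_sem() output above (first two variants only).
sem_result = [[
    ['الشَّمْسُ', 'تَطْلُعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
    ['الشَّمْسُ', 'تَطْلُعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
]]


def unique_relations(sem_result):
    """Collapse vocalization variants to unique (lemma, lemma, relation) triples.

    Assumption: each entry is [word_a, word_b, lemma_a, lemma_b, relation].
    """
    seen = []
    for group in sem_result:
        for entry in group:
            triple = (entry[2], entry[3], entry[4])
            if triple not in seen:
                seen.append(triple)
    return seen


triples = unique_relations(sem_result)
print(triples)
```

A list is used instead of a set so that the first-seen order of relations is preserved.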
Extract semantic relations, displaying all words and tags

    >>> sem_result = anasem.display_sem(result, all=True)
    >>> pprint.pprint(sem_result)
    [('يعبد', 'O', []),
     ('الله', 'O', []),
     ('منذ', 'O', []),
     ('أن', 'O', []),
     ('تطلع', 'B', []),
     ('الشمس', 'I',
      [['الشَّمْسُ', 'تَطْلُعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
       ['الشَّمْسُ', 'تَطْلُعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
       ['الشَّمْسُ', 'تَطْلُعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
       ['الشَّمْسُ', 'تَطْلَعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
       ['الشَّمْسُ', 'تَطْلَعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
       ['الشَّمْسُ', 'تَطْلَعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject']])]
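With `all=True`, every word comes back as a `(word, tag, relations)` tuple. The tags in the sample look like a B/I/O scheme (`B` opens a related span, `I` continues it, `O` is outside any span). That reading is an assumption drawn from the sample output, not from the library documentation. Under it, the related word spans could be recovered roughly like this (`relation_spans` is a hypothetical helper):

```python
# Sample copied from the display_sem(result, all=True) output above,
# with the relation list shortened to one variant.
tagged = [
    ('يعبد', 'O', []),
    ('الله', 'O', []),
    ('منذ', 'O', []),
    ('أن', 'O', []),
    ('تطلع', 'B', []),
    ('الشمس', 'I', [['الشَّمْسُ', 'تَطْلُعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject']]),
]


def relation_spans(tagged):
    """Group consecutive B/I-tagged words into spans (assumed BIO scheme)."""
    spans = []
    current = []
    for word, tag, _relations in tagged:
        if tag == 'B':
            # A new span starts; close the previous one if any.
            if current:
                spans.append(current)
            current = [word]
        elif tag == 'I' and current:
            current.append(word)
        else:
            # 'O' (or a stray 'I') ends the current span.
            if current:
                spans.append(current)
            current = []
    if current:
        spans.append(current)
    return spans


spans = relation_spans(tagged)
print(spans)
```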
Convert to pandas

```python
>>> import pandas as pd
>>> # flatten the result
>>> df = pd.DataFrame(anasem.decode(result))
>>> print(df.head())
  action affix affix_key forced_word_case ... unvocalized unvoriginal vocalized word
0 -ي– -ي–|المضارع المنصوب:هو:y False ... يعبد عبد يُعَبِّدَ يعبد
1 -ي– -ي–|المضارع المجهول المجزوم:هو:y False ... يعبد عبد يُعَبَّدْ يعبد
2 -ي– -ي–|المضارع المجهول:هو:y False ... يعبد عبد يُعَبَّدُ يعبد
3 -ي– -ي–|المضارع المعلوم:هو:y False ... يعبد عبد يُعَبِّدُ يعبد
4 -ي– -ي–|المضارع المجزوم:هو:y False ... يعبد عبد يُعَبِّدْ يعبد

[5 rows x 50 columns]
>>> # write a tab-separated file (the output/ directory must already exist)
>>> df.to_csv("output/test.csv", encoding="utf8", sep="\t")
```
Requirements

1. pyarabic
2. sqlite
3. sylajone
Data Structure:
Semantic database
    CREATE TABLE sqlite_sequence(name,seq);
    CREATE TABLE "derivations" (
        "id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL UNIQUE,
        "verb" VARCHAR NOT NULL,
        "transitive" BOOL NOT NULL DEFAULT 1,
        "derived" VARCHAR NOT NULL,
        "type" VARCHAR NOT NULL
    );
CSV Structure:
- Derivation
  - id : unique id in the database
  - verb : vocalized verb
  - transitive : whether the verb is transitive
  - derived : word derived from the verb
  - type : type of the derivation
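To make the derivations schema concrete, here is a minimal sketch that builds the table in an in-memory SQLite database and looks up the words derived from one verb. The rows and the `type` labels are made up for illustration and are not taken from the real database; `derived_words` is a hypothetical helper. The `sqlite_sequence` table is not created by hand because SQLite maintains it automatically for `AUTOINCREMENT` columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE "derivations" (
        "id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL UNIQUE,
        "verb" VARCHAR NOT NULL,
        "transitive" BOOL NOT NULL DEFAULT 1,
        "derived" VARCHAR NOT NULL,
        "type" VARCHAR NOT NULL
    )
    """
)

# Hypothetical rows for illustration only; not taken from the real database.
rows = [
    ("طَلَعَ", 0, "طَالِع", "active participle"),
    ("طَلَعَ", 0, "مَطْلَع", "noun of place"),
    ("عَبَدَ", 1, "عَابِد", "active participle"),
]
conn.executemany(
    'INSERT INTO "derivations" ("verb", "transitive", "derived", "type") '
    "VALUES (?, ?, ?, ?)",
    rows,
)


def derived_words(conn, verb):
    """Return (derived, type) pairs stored for a vocalized verb."""
    cur = conn.execute(
        'SELECT "derived", "type" FROM "derivations" '
        'WHERE "verb" = ? ORDER BY "id"',
        (verb,),
    )
    return cur.fetchall()


found = derived_words(conn, rows[0][0])
print(found)
```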
Semantic relations
    CREATE TABLE "relations" (
        "id" INTEGER PRIMARY KEY NOT NULL,
        "first" VARCHAR NOT NULL DEFAULT (''),
        "second" VARCHAR NOT NULL DEFAULT (''),
        "rule" VARCHAR NOT NULL DEFAULT (0)
    );
CSV Structure:
- id : unique id in the database
- first : first word
- second : second word
- rule : the extraction rule number
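A lookup against the `relations` table could be sketched as follows, again with an in-memory SQLite database. The word pair and the rule number are invented for the example, since the mapping from rule numbers to relation types is not given here; `find_rule` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE "relations" (
        "id" INTEGER PRIMARY KEY NOT NULL,
        "first" VARCHAR NOT NULL DEFAULT (''),
        "second" VARCHAR NOT NULL DEFAULT (''),
        "rule" VARCHAR NOT NULL DEFAULT (0)
    )
    """
)

# Hypothetical row: both the word pair and the rule number are made up.
pair = ("طَلَعَ", "شَمْسٌ")
conn.execute(
    'INSERT INTO "relations" ("first", "second", "rule") VALUES (?, ?, ?)',
    (pair[0], pair[1], 1),
)


def find_rule(conn, first, second):
    """Return the rule number stored for a word pair, or None if absent."""
    row = conn.execute(
        'SELECT "rule" FROM "relations" WHERE "first" = ? AND "second" = ?',
        (first, second),
    ).fetchone()
    # "rule" is declared VARCHAR, so SQLite may hand it back as text.
    return int(row[0]) if row else None


rule = find_rule(conn, pair[0], pair[1])
print(rule)
```

The lookup is order-sensitive: `first` and `second` are matched as stored, so a reversed pair returns `None`.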