Asmai: (Al'asma'i) Arabic semantic analysis library for Python
The Al'asma'i semantic library (مكتبة الأصمعي الدلالية)
Developers: Taha Zerrouki: http://tahadz.com, taha dot zerrouki at gmail dot com
Features | Value |
---|---|
Authors | |
Release | 0.1 |
License | |
Tracker | |
Source | |
Feedback | |
Accounts | [@Twitter](https://twitter.com/linuxscout) |
Description
Asmai (Al'asma'i) is an Arabic semantic analysis library for Python.
Features:
Extracts word pairs that carry a semantic relation of one of these types: subject, object, or annexation (idafa).
Install
pip install asmai
Usage
Import
import asmai.anasem as asm
Test
import asmai.anasem as asm

text = "يعبد الله منذ أن تطلع الشمس"
anasem = asm.SemanticAnalyzer()
result = anasem.analyze_text(text)
# the result contains analysis objects; pretty-print them
anasem.pprint(result)
Extract semantic relations and display only the relations that were found
>>> import pprint
>>> sem_result = anasem.display_sem(result)
>>> pprint.pprint(sem_result)
[[['الشَّمْسُ', 'تَطْلُعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
['الشَّمْسُ', 'تَطْلُعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
['الشَّمْسُ', 'تَطْلُعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
['الشَّمْسُ', 'تَطْلَعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
['الشَّمْسُ', 'تَطْلَعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
['الشَّمْسُ', 'تَطْلَعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject']]]
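The six rows above differ only in the vocalization of the verb form. A minimal sketch of how such rows could be collapsed into unique relations follows. It assumes, from the sample output alone, that each row has the layout `[word_a, word_b, lemma_a, lemma_b, relation]`; the helper `unique_relations` is hypothetical and not part of asmai, and the data is a shortened copy of the output above so the snippet runs without the library.

```python
# Shortened copy of the display_sem() output shown above (3 of the 6 rows).
sem_result = [[
    ['الشَّمْسُ', 'تَطْلُعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
    ['الشَّمْسُ', 'تَطْلُعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
    ['الشَّمْسُ', 'تَطْلُعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
]]


def unique_relations(groups):
    """Collapse rows that differ only in vocalization.

    Hypothetical helper. Assumes each row looks like
    [word_a, word_b, lemma_a, lemma_b, relation], as in the sample output.
    """
    seen = []
    for group in groups:
        for row in group:
            triple = (row[2], row[3], row[4])  # (lemma_a, lemma_b, relation)
            if triple not in seen:
                seen.append(triple)
    return seen


relations = unique_relations(sem_result)
for lemma_a, lemma_b, relation in relations:
    print(lemma_a, lemma_b, relation)
```

Keying on the lemmas rather than on the vocalized forms is a design choice for this sketch: it keeps one entry per relation regardless of how many candidate vocalizations the analyzer proposes.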
Extract semantic relations and display all words with their tags
>>> sem_result = anasem.display_sem(result, all=True)
>>> pprint.pprint(sem_result)
[('يعبد', 'O', []),
 ('الله', 'O', []),
 ('منذ', 'O', []),
 ('أن', 'O', []),
 ('تطلع', 'B', []),
 ('الشمس',
  'I',
  [['الشَّمْسُ', 'تَطْلُعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
   ['الشَّمْسُ', 'تَطْلُعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
   ['الشَّمْسُ', 'تَطْلُعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
   ['الشَّمْسُ', 'تَطْلَعَ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
   ['الشَّمْسُ', 'تَطْلَعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject'],
   ['الشَّمْسُ', 'تَطْلَعْ', 'شَمْسٌ', 'طَلَعَ', 'Subject']])]
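The tagged output can be walked to recover the related word pairs. The sketch below assumes, from this single sample only, that `B` marks the first word of a related pair, `I` marks the word that follows it and carries the relation rows, and `O` marks words outside any relation. The helper `related_pairs` is hypothetical and not part of asmai; the data is a shortened copy of the output above.

```python
# Shortened copy of the display_sem(result, all=True) output shown above.
tagged = [
    ('يعبد', 'O', []),
    ('الله', 'O', []),
    ('منذ', 'O', []),
    ('أن', 'O', []),
    ('تطلع', 'B', []),
    ('الشمس', 'I', [['الشَّمْسُ', 'تَطْلُعُ', 'شَمْسٌ', 'طَلَعَ', 'Subject']]),
]


def related_pairs(tagged_words):
    """Return (previous_word, word, relation_labels) for each 'I' word.

    Hypothetical helper. Assumes B/I/O tags behave as in the sample:
    B opens a pair, I continues it, O is outside any relation.
    """
    pairs = []
    previous = None
    for word, tag, relations in tagged_words:
        if tag == 'I' and previous is not None:
            labels = sorted({rel[4] for rel in relations})
            pairs.append((previous, word, labels))
        # Remember the word only while we are inside a tagged span.
        previous = word if tag in ('B', 'I') else None
    return pairs


pairs = related_pairs(tagged)
for first, second, labels in pairs:
    print(first, second, labels)
```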
Convert to pandas
```python
>>> import pandas as pd
>>>
>>> # flatten the result
... df = pd.DataFrame(anasem.decode(result))
>>> print(df.head())
  action affix affix_key forced_word_case ... unvocalized unvoriginal vocalized word
0 -ي-- -ي--|المضارع المنصوب:هو:y False ... يعبد عبد يُعَبِّدَ يعبد
1 -ي-- -ي--|المضارع المجهول المجزوم:هو:y False ... يعبد عبد يُعَبَّدْ يعبد
2 -ي-- -ي--|المضارع المجهول:هو:y False ... يعبد عبد يُعَبَّدُ يعبد
3 -ي-- -ي--|المضارع المعلوم:هو:y False ... يعبد عبد يُعَبِّدُ يعبد
4 -ي-- -ي--|المضارع المجزوم:هو:y False ... يعبد عبد يُعَبِّدْ يعبد

[5 rows x 50 columns]
>>> df.to_csv("output/test.csv", encoding="utf8", sep="\t")
```
Requirements
1. pyarabic
2. sqlite
3. sylajone
Data Structure:
Semantic database
CREATE TABLE sqlite_sequence(name,seq);
CREATE TABLE "derivations" (
"id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL UNIQUE ,
"verb" varchar NOT NULL ,
"transitive" BOOL NOT NULL DEFAULT 1,
"derived" VARCHAR NOT NULL ,
"type" VARCHAR NOT NULL
);
CSV Structure:
Derivation
id : unique id in the database
verb : vocalized verb
transitive : whether the verb is transitive
derived : word derived from the verb
type : type of the derived word
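As an illustration of how the `derivations` table could be queried, here is a minimal sketch using Python's standard `sqlite3` module on an in-memory database. The schema is copied from above; the inserted row (verb, derived word, and the type label `active participle`) is a hypothetical example and is not taken from the actual asmai data.

```python
import sqlite3

# In-memory database so the sketch does not touch any real asmai file.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE "derivations" (
        "id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL UNIQUE,
        "verb" VARCHAR NOT NULL,
        "transitive" BOOL NOT NULL DEFAULT 1,
        "derived" VARCHAR NOT NULL,
        "type" VARCHAR NOT NULL
    )
""")

# Hypothetical sample row; the real database contents may differ.
conn.execute(
    'INSERT INTO derivations ("verb", "transitive", "derived", "type") '
    'VALUES (?, ?, ?, ?)',
    ("طَلَعَ", 0, "طَالِعٌ", "active participle"),
)

# Look up every word derived from a given vocalized verb.
rows = conn.execute(
    'SELECT "verb", "transitive", "derived", "type" '
    'FROM derivations WHERE "verb" = ?',
    ("طَلَعَ",),
).fetchall()
print(rows)
conn.close()
```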
Semantic relations
CREATE TABLE "relations" (
"id" INTEGER PRIMARY KEY NOT NULL ,
"first" VARCHAR NOT NULL DEFAULT ('') ,
"second" VARCHAR NOT NULL DEFAULT ('') ,
"rule" VARCHAR NOT NULL DEFAULT (0)
);
CSV Structure:
id : unique id in the database
first : first word
second : second word
rule : the extraction rule number
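A lookup of a word pair in the `relations` table might look like the sketch below, again using the standard `sqlite3` module on an in-memory database. The schema follows the definition above; the sample pair, the rule value `"1"`, and the helper `find_rule` are hypothetical, since the mapping between rule numbers and relation types is not described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE "relations" (
        "id" INTEGER PRIMARY KEY NOT NULL,
        "first" VARCHAR NOT NULL DEFAULT (''),
        "second" VARCHAR NOT NULL DEFAULT (''),
        "rule" VARCHAR NOT NULL DEFAULT (0)
    )
""")

# Hypothetical sample pair and rule number, for illustration only.
conn.execute(
    'INSERT INTO relations ("first", "second", "rule") VALUES (?, ?, ?)',
    ("طَلَعَ", "شَمْسٌ", "1"),
)


def find_rule(connection, first, second):
    """Return the rule stored for an ordered word pair, or None if absent."""
    row = connection.execute(
        'SELECT "rule" FROM relations WHERE "first" = ? AND "second" = ?',
        (first, second),
    ).fetchone()
    return row[0] if row else None


found = find_rule(conn, "طَلَعَ", "شَمْسٌ")
missing = find_rule(conn, "شَمْسٌ", "طَلَعَ")  # reversed order is not stored
print(found, missing)
conn.close()
```

The column names are quoted in the queries because `first` can be treated as a keyword by recent SQLite versions.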