quranic-nlp
QuranicTools: A Python NLP Library for Quranic NLP
Language Processing and Digital Humanities Lab (Language.ML)
Part of Speech Tagging | Dependency Parsing | Lemmatizer | Multilingual Search | Quranic Extractions | Revelation Order | Embeddings (coming soon) | Translations
Quranic NLP
Quranic NLP is a computational toolbox to conduct various syntactic and semantic analyses of Quranic verses. The aim is to put together all available resources contributing to a better understanding/analysis of the Quran for everyone.
Contents: Installation | Pipeline | Input Formats | Example
Installation
To get started using Quranic NLP in your Python project, simply install it via pip.
Install with pip
pip install quranic-nlp
You can check the requirements.txt file to see the required packages.
Pipeline
The NLP pipeline provides morphological information (e.g., a lemmatizer) as well as a POS tagger and a dependency parser in a spaCy-like pipeline.
from quranic_nlp import language

translation_translator = 'fa#1'  # translation selector: language code + '#' + book number
pips = 'dep,pos,root,lemma'      # pipeline components to load
nlp = language.Pipeline(pips, translation_translator)
The Doc object has several extensions. First, sentences in doc refers to the verses. Second, ayah in doc indicates the number of the ayah within its surah. Third, surah in doc indicates the name of the surah. Fourth, revelation_order in doc indicates the order in which the ayah was revealed. The doc, which is a list of Token objects, also has its own token-level extensions.
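For instance, the document-level extensions can be read from a parsed verse as follows (a minimal sketch reusing the nlp pipeline constructed above; the verse 2#255 is only an illustrative input):
doc = nlp('2#255')  # surah 2, ayah 255, as an illustration

# document-level extensions described above
print(doc._.surah)             # surah name
print(doc._.ayah)              # ayah number within the surah
print(doc._.revelation_order)  # revelation order of the ayah

# doc behaves like a list of Token objects
for token in doc:
    print(token.text, token.pos_)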
The pips argument specifies which pipeline components of quranic_nlp to use. The translation_translator argument selects the translation of the Quran: either a language code alone (e.g. fa) or a language code followed by # and a book number (e.g. fa#1).
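As a quick illustration of the two selector forms (a sketch reusing the Pipeline constructor from above; the component lists here are arbitrary):
# language code alone
nlp_fa = language.Pipeline('pos,lemma', 'fa')
# language code + '#' + book number
nlp_fa_book1 = language.Pipeline('dep,pos,root,lemma', 'fa#1')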
To see all available translations, run the code below:
from quranic_nlp import utils
utils.print_all_translations()
Quranic NLP has its own spaCy extensions. If the related pipeline component is not loaded, the corresponding extension cannot be used.
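For example (a sketch, assuming an extension simply stays unavailable when its component is omitted from pips):
nlp_small = language.Pipeline('pos', translation_translator)
doc = nlp_small('1#1')
print(doc[0].pos_)    # works: the 'pos' component was loaded
# doc[0]._.root       # assumed unavailable: 'root' was not loaded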
Input Formats
There are three ways to format the input: first, the surah number followed by # and the ayah number; second, the surah name followed by # and the ayah number; third, free text to search for in the Quran.
Note: the last two formats require internet access for an API call.
from quranic_nlp import language

translation_translator = 'fa#1'
pips = 'dep,pos,root,lemma'
nlp = language.Pipeline(pips, translation_translator)

doc = nlp('1#1')          # surah number + '#' + ayah number
doc = nlp('حمد#1')        # surah name + '#' + ayah number
doc = nlp('رب العالمین')  # free-text search in the Quran
Example
from quranic_nlp import language

translation_translator = 'fa#1'
pips = 'dep,pos,root,lemma'
nlp = language.Pipeline(pips, translation_translator)

doc = nlp('1#4')               # surah 1, ayah 4
print(doc)
print(doc._.text)              # text of the ayah
print(doc._.surah)             # surah name
print(doc._.ayah)              # ayah number
print(doc._.revelation_order)  # revelation order
print(doc._.sim_ayahs)         # similar ayahs (surah#ayah)
print(doc._.translations)      # translation in the selected language
إِيَّاكَ نَعْبُدُ وَ إِيَّاكَ نَسْتَعِينُ نحن نحن
إِيَّاكَ نَعْبُدُ وَ إِيَّاكَ نَسْتَعِينُ
فاتحه
4
63
['82#15', '83#11', '70#26', '51#12', '56#56', '82#17', '74#46', '37#20', '82#18', '15#35', '38#78', '26#82', '109#6', '51#6', '82#9', '107#1', '95#7', '40#16', '19#15', '19#33', '61#9', '9#33', '48#28', '21#103', '6#73', '3#26', '98#5', '83#5', '39#11', '40#14', '77#12', '50#42', '77#35', '77#13', '39#2', '36#71', '74#9', '85#2', '16#52', '30#30', '42#13', '75#1', '30#43', '75#6', '40#29', '39#14', '43#77', '5#3', '86#9', '26#189', '40#65', '26#87', '38#81', '15#38', '7#51', '23#113', '23#16', '79#6', '51#13', '77#14', '37#26', '9#11', '3#24', '114#2', '82#19', '11#103', '34#40', '26#135', '25#25', '70#8', '2#193', '9#29', '19#38', '2#132', '7#14', '29#65', '8#39', '64#9', '30#14', '45#27', '10#105', '110#2', '78#17', '79#35', '83#6', '77#38', '50#34', '38#79', '15#36', '37#21', '44#40', '52#9', '56#50', '90#14', '40#32', '9#36', '80#34', '26#88', '56#86', '50#20']
تنها تو را مى پرستيم و تنها از تو يارى مىجوييم.
print(doc[1])            # token text
print(doc[1].head)       # syntactic head
print(doc[1].dep_)       # dependency relation
print(doc[1]._.dep_arc)  # arc direction
print(doc[1]._.root)     # root of the word
print(doc[1].lemma_)     # lemma
print(doc[1].pos_)       # part-of-speech tag
نَعْبُدُ
وَ
معطوف بر محل
LTR
عبد
VERB
To jsonify the results you can use the following:
dictionary = language.to_json(pips, doc)
print(dictionary)
[{'id': 1, 'text': الْ, 'root': None, 'lemma': '', 'pos': 'INTJ', 'rel': 'تعریف', 'arc': 'RTL', 'head': حَمْدُ}, {'id': 2, 'text': حَمْدُ, 'root': 'حمد', 'lemma': '', 'pos': 'NOUN', 'rel': 'خبر', 'arc': 'LTR', 'head': *}, {'id': 3, 'text': لِ, 'root': None, 'lemma': '', 'pos': 'INTJ', 'rel': 'متعلق', 'arc': 'LTR', 'head': *}, {'id': 4, 'text': لَّهِ, 'root': 'أله', 'lemma': '', 'pos': 'NOUN', 'rel': 'نعت', 'arc': 'LTR', 'head': رَبِّ}, {'id': 5, 'text': رَبِّ, 'root': 'ربب', 'lemma': '', 'pos': 'NOUN', 'rel': 'مضاف الیه ', 'arc': 'LTR', 'head': عَالَمِینَ}, {'id': 6, 'text': الْ, 'root': None, 'lemma': '', 'pos': 'INTJ', 'rel': 'تعریف', 'arc': 'RTL', 'head': عَالَمِینَ}, {'id': 7, 'text': عَالَمِینَ, 'root': 'علم', 'lemma': '', 'pos': 'NOUN', 'rel': '', 'arc': None, 'head': عَالَمِینَ}, {'id': 8, 'text': *, 'root': None, 'lemma': '', 'pos': '', 'rel': '', 'arc': None, 'head': *}]
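If you need actual JSON text, one option is to stringify the values first (a sketch, assuming the entries printed above that are not plain strings, such as the text and head values, are arbitrary Python objects):
import json

# keep ints, strings and None as-is; convert anything else (e.g. Token objects) to str
serializable = [
    {key: value if isinstance(value, (int, str, type(None))) else str(value)
     for key, value in token_info.items()}
    for token_info in dictionary
]
print(json.dumps(serializable, ensure_ascii=False, indent=2))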
To visualize the dependency parse, you can use spaCy's displacy as follows:
from spacy import displacy
from quranic_nlp import language
from quranic_nlp import utils

translation_translator = 'fa#1'
pips = 'dep,pos,root,lemma'
nlp = language.Pipeline(pips, translation_translator)
doc = nlp('1#4')

# serve the dependency visualization on a local web server
displacy.serve(doc, style="dep")

# or customize the rendering
options = {"compact": True, "bg": "#09a3d5",
           "color": "white", "font": "xb-niloofar"}
displacy.serve(doc, style="dep", options=options)
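In a Jupyter notebook you can render the parse inline instead of starting a web server; this uses spaCy's standard displacy.render call rather than anything specific to Quranic NLP:
# render inline (e.g. in a notebook) instead of serving on a local port
displacy.render(doc, style="dep", options=options, jupyter=True)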