
Naftawayh: Arabic word tagger

Project description

Naftawayh: Arabic Word Tagger

Naftawayh is a Python program and library for tagging Arabic words, that is, classifying them by type (noun, verb, stopword). It is useful in language processing, especially text mining. Naftawayh relies on the structure of the Arabic word and on the ability to guess a word's class from certain surface signs. For example, a word that ends with Teh Marbuta is a noun, and a word containing Hamza below Alef is also classed as a noun. Many kinds of words can be identified by patterns, especially present-tense verbs and definite words (those prefixed with Alef-Lam).
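The two signs named above can be sketched as a tiny rule-based guesser. This is only an illustration of the idea, not Naftawayh's actual implementation: `guess_class` and the constants are hypothetical names, and only the two rules stated in the text are encoded.

```python
# -*- coding: utf-8 -*-
# Hypothetical sketch: NOT the library's code, only the two rules from the text.

TEH_MARBUTA = u"\u0629"       # ة
ALEF_HAMZA_BELOW = u"\u0625"  # إ


def guess_class(word):
    """Return 'noun' when one of the two surface signs applies, else None."""
    if word.endswith(TEH_MARBUTA):
        return "noun"
    if ALEF_HAMZA_BELOW in word:
        return "noun"
    # No sign matched: the class stays undecided in this sketch.
    return None
```

Returning `None` for undecided words keeps the sketch honest: a real tagger needs many more patterns (for example for present-tense verbs) before it can decide.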

Developers: Taha Zerrouki: http://tahadz.com taha dot zerrouki at gmail dot com

| Features  | Value |
|-----------|-------|
| Authors   | Taha Zerrouki: http://tahadz.com, taha dot zerrouki at gmail dot com |
| Release   | 0.3 |
| License   | GPL |
| Tracker   | linuxscout/naftawayh/Issues |
| Website   | https://pypi.python.org/pypi/naftawayh |
| Doc       | package Documentation |
| Source    | Github |
| Download  | pypi.python.org |
| Feedbacks | Comments |
| Accounts  | [@Twitter](https://twitter.com/linuxscout) [@Sourceforge](http://sourceforge.net/projects/naftawayh/) |

Citation

If you cite Naftawayh in academic work, you can use this citation:

T. Zerrouki, Naftawayh, Arabic Word Tagger,
  https://pypi.python.org/pypi/naftawayh/, 2010

or in bibtex format

@misc{zerrouki2012naftawayh,
  title={Naftawayh : Arabic Word Tagger},
  author={Zerrouki, Taha},
  url={https://pypi.python.org/pypi/naftawayh},
  year={2010}
}

Applications

  • Text mining.

  • Text summarizing.

  • Sentences identification.

  • Grammar analysis.

  • Morphological analysis acceleration.

  • Extraction of n-grams, terms, idioms, and collocations.

Who is Naftawayh

Who is Naftawayh?

Demo

You can test it on the Mishkal site: choose Tools > Extraction > Classify. Naftawayh Demo

Installation

pip install naftawayh

Usage

import naftawayh.wordtag as wordtag

  • Test word list

>>> import naftawayh.wordtag
>>> word_list = (u'بالبلاد', u'بينما', u'أو', u'انسحاب', u'انعدام',
...              u'انفجار', u'البرنامج', u'بانفعالاتها', u'العربي', u'الصرفي',
...              u'التطرف', u'اقتصادي')
>>> tagger = naftawayh.wordtag.WordTagger()
>>> # Tag all the words in one call
>>> list_tags = tagger.word_tagging(word_list)
>>> for word, tag in zip(word_list, list_tags):
...     print(u'%s %s' % (word, tag))
...
بالبلاد n
بينما vn3
أو t
انسحاب n
انعدام n
انفجار n
البرنامج n
بانفعالاتها n
العربي n
الصرفي n
التطرف n
اقتصادي n
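Since `word_tagging` returns one tag per input word, the result is easy to post-process. The sketch below groups words by tag. `group_by_tag` is a hypothetical helper, not part of naftawayh, and the `tags` list is copied from the first four lines of the output above rather than computed, so the example runs without the library installed.

```python
# -*- coding: utf-8 -*-
from collections import defaultdict


def group_by_tag(words, tags):
    """Map each tag to the list of words that received it, in input order."""
    groups = defaultdict(list)
    for word, tag in zip(words, tags):
        groups[tag].append(word)
    return dict(groups)


# Sample data: the first four words and tags from the output above.
words = [u'بالبلاد', u'بينما', u'أو', u'انسحاب']
tags = ['n', 'vn3', 't', 'n']
groups = group_by_tag(words, tags)
```

With a real tagger, the same helper could presumably be called as `group_by_tag(word_list, tagger.word_tagging(word_list))`.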
  • Test word by word

>>> import naftawayh.wordtag
>>> word_list = (u'بالبلاد', u'بينما', u'أو', u'انسحاب', u'انعدام',
...              u'انفجار', u'البرنامج', u'بانفعالاتها', u'العربي', u'الصرفي',
...              u'التطرف', u'اقتصادي')
>>> tagger = naftawayh.wordtag.WordTagger()
>>> # Test each word against every class; a word may match more than one
>>> for word in word_list:
...     if tagger.is_noun(word):
...         print(u'%s is noun' % word)
...     if tagger.is_verb(word):
...         print(u'%s is verb' % word)
...     if tagger.is_stopword(word):
...         print(u'%s is stopword' % word)
...
بالبلاد is noun
بينما is noun
بينما is verb
أو is noun
أو is verb
أو is stopword
انسحاب is noun
انعدام is noun
انفجار is noun
البرنامج is noun
بانفعالاتها is noun
العربي is noun
الصرفي is noun
التطرف is noun
اقتصادي is noun
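As the output shows, the three checks are independent, so one word can match several classes. A small wrapper can collect every match in one list. This is a sketch under assumptions: `candidate_classes` and `ReplayTagger` are hypothetical names, and `ReplayTagger` only replays three of the results printed above so that the example runs without naftawayh. The only library methods assumed are `is_noun`, `is_verb` and `is_stopword`, as used in the session above.

```python
# -*- coding: utf-8 -*-

def candidate_classes(tagger, word):
    """Return every class whose check accepts the word, in a fixed order."""
    checks = (
        ("noun", tagger.is_noun),
        ("verb", tagger.is_verb),
        ("stopword", tagger.is_stopword),
    )
    return [label for label, check in checks if check(word)]


# Three results copied from the output above.
DOCUMENTED = [
    (u'بالبلاد', ['noun']),
    (u'بينما', ['noun', 'verb']),
    (u'أو', ['noun', 'verb', 'stopword']),
]


class ReplayTagger(object):
    """Stand-in for WordTagger that replays the documented answers."""

    def __init__(self, documented):
        self._classes = dict(documented)

    def is_noun(self, word):
        return "noun" in self._classes.get(word, ())

    def is_verb(self, word):
        return "verb" in self._classes.get(word, ())

    def is_stopword(self, word):
        return "stopword" in self._classes.get(word, ())


stub = ReplayTagger(DOCUMENTED)
```

Any object exposing the same three methods can be passed in place of `stub`.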
  • Test word in context

>>> import naftawayh.wordtag
>>> word_list = (u'بالبلاد', u'بينما', u'أو', u'انسحاب', u'انعدام',
...              u'انفجار', u'البرنامج', u'بانفعالاتها', u'العربي', u'الصرفي',
...              u'التطرف', u'اقتصادي')
>>> tagger = naftawayh.wordtag.WordTagger()
>>> previous_word = ""
>>> print("**** test words in context***")
**** test words in context***
>>> # Tag each word given the word that precedes it
>>> for word in word_list:
...     tag = tagger.context_analyse(previous_word, word)
...     print(u"%s from context is %s" % (word, tag))
...     previous_word = word
...
بالبلاد from context is vn
بينما from context is vn
أو from context is vn
انسحاب from context is vn
انعدام from context is vn
انفجار from context is vn
البرنامج from context is vn
بانفعالاتها from context is vn
العربي from context is vn
الصرفي from context is vn
التطرف from context is vn
اقتصادي from context is vn
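In the session above, "context" means the immediately preceding word, and the first word is paired with an empty string. The loop can be wrapped in a reusable function, sketched here. `tag_with_context` and `show_pair` are hypothetical names. `show_pair` is a stand-in callable that merely records which pair it received, so the example runs without naftawayh.

```python
def tag_with_context(words, analyse):
    """Pair each word with analyse(previous_word, word).

    `analyse` is any callable taking (previous_word, word); the first
    word is analysed with an empty previous word.
    """
    tagged = []
    previous_word = u""
    for word in words:
        tagged.append((word, analyse(previous_word, word)))
        previous_word = word
    return tagged


def show_pair(previous_word, word):
    """Stand-in for a real analyser: expose the pair that was passed in."""
    return u"%s|%s" % (previous_word, word)


pairs = tag_with_context([u"a", u"b", u"c"], show_pair)
```

With the library installed, the bound method from the session above could presumably be passed directly: `tag_with_context(word_list, tagger.context_analyse)`.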


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • Naftawayh-0.4.tar.gz (319.8 kB), Source

Built Distributions

  • Naftawayh-0.4-py3-none-any.whl (332.6 kB), Python 3

  • Naftawayh-0.4-py2-none-any.whl (329.4 kB), Python 2

