Persian Part-of-Speech tagger framework

Persian Parts-of-Speech tagger

This repository contains a Persian part-of-speech tagger based on Conditional Random Fields (CRF) and a native text normalizer.

Table of Contents

  1. TO-DO
  2. Installation
    1. Using Pip
    2. From Source
    3. On CoLab
  3. Usage
  4. Evaluation

TO-DO:

Installation:

Using Pip

$ pip install crf_pos

From Source

$ git clone https://github.com/MohammadForouhesh/crf-pos-persian 
$ cd crf-pos-persian
$ python setup.py install

On CoLab

! pip install git+https://github.com/MohammadForouhesh/crf-pos-persian.git
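
After any of the installation routes above, it can be useful to confirm that the package is visible to the interpreter. A minimal sketch using only the standard library; `installed_version` is a hypothetical helper, not part of crf_pos, and it assumes the distribution is registered under the name used in the pip command:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "crf_pos" is the distribution name from the pip command above.
print(installed_version("crf_pos"))
```

A result of `None` means the install did not land in the environment the interpreter is running from.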

Usage

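from crf_pos.pos_tagger.wapiti import WapitiPosTagger

pos_tagger = WapitiPosTagger()
# The tagger takes a list of tokens, so split the raw sentence on whitespace.
tokens = 'او رئیس‌جمهور حجتالاسلاموالمسلمین ابرهیم رئیسی رئیس جمهور ایران اسلامی می باشد'.split()
# Indexing the tagger with the token list returns (token, tag) pairs.
# The normalizer may merge tokens (e.g. 'می باشد'), so the output can be shorter than the input.
pos_tagger[tokens]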

Output:
[('او', 'PRO'),
('رئیس\u200cجمهور', 'N'),
('حجت\u200cالاسلام\u200cوالمسلمین', 'N'),
('ابرهیم', 'N'),
('رئیسی', 'N'),
('رئیس\u200cجمهور', 'N'),
('ایران', 'N'),
('اسلامی', 'ADJ'),
('می\u200cباشد', 'V')]
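
The result is a plain list of (token, tag) pairs, so ordinary Python is enough to post-process it. A minimal sketch, assuming a few pairs from the output above have been copied into a variable; `group_by_tag` is a hypothetical helper, not part of crf_pos:

```python
from collections import defaultdict

# A few pairs copied from the output above.
tagged = [
    ('او', 'PRO'),
    ('رئیس\u200cجمهور', 'N'),
    ('ایران', 'N'),
    ('اسلامی', 'ADJ'),
    ('می\u200cباشد', 'V'),
]


def group_by_tag(pairs):
    """Collect tokens under their tag, keeping the original token order."""
    groups = defaultdict(list)
    for token, tag in pairs:
        groups[tag].append(token)
    return dict(groups)


groups = group_by_tag(tagged)
nouns = groups.get('N', [])  # all tokens tagged as nouns
```

Using `groups.get(tag, [])` avoids a `KeyError` for tags that do not occur in a given sentence.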

Evaluation

Part-of-Speech  precision  recall  f1-score  support
N               0.985      0.970   0.977     186585
P               0.998      0.998   0.998     89450
V               0.999      0.999   0.999     87762
ADV             0.976      0.972   0.974     15983
ADVe            0.988      0.978   0.983     1053
RES             0.989      0.992   0.991     2784
RESe            1.000      0.989   0.994     174
DET             0.973      0.977   0.975     19786
DETe            0.960      0.970   0.965     2156
AJ              0.978      0.975   0.977     61526
AJe             0.949      0.964   0.957     19919
CL              0.932      0.918   0.925     1892
INT             1.000      1.000   1.000     73
CONJ            0.996      0.997   0.997     74796
CONJe           1.000      1.000   1.000     82
POSTP           1.000      1.000   1.000     13174
PRO             0.973      0.974   0.973     23094
PROe            0.878      0.579   0.698     273
NUM             0.988      0.992   0.990     24864
NUMe            0.932      0.918   0.925     2519
PUNC            1.000      1.000   1.000     84088
Ne              0.970      0.985   0.977     163760
Pe              0.986      0.992   0.989     10004
avg/total       0.985      0.985   0.985     885797
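
The f1-score is the harmonic mean of precision and recall, which is why PROe, with a recall of 0.579, scores well below its precision of 0.878. A small check with values copied from the table; `f1_score` here is an illustrative helper, not the project's evaluation code:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported f1-score) copied from the table above.
rows = {
    "N": (0.985, 0.970, 0.977),
    "PROe": (0.878, 0.579, 0.698),
    "CL": (0.932, 0.918, 0.925),
}

for tag, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    print(f"{tag}: computed {computed:.3f}, reported {reported:.3f}")
```

Because the table values are rounded to three decimals, a recomputed score can differ from the reported one in the last digit.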
