miNER

A Python library for evaluating named entity recognition (NER).

miNER reports NER performance separately for known entities (those that appear in the training data) and unknown entities (those that do not), as well as for all entities together.
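To make the known/unknown distinction concrete, here is a minimal sketch of how extracted entities could be partitioned. It is not miNER's own code: the helper name `split_known_unknown` is hypothetical, and matching on the exact surface string per label is an assumption.

```python
from typing import Dict, List, Tuple

Entity = Tuple[str, str]  # (label, surface form)


def split_known_unknown(
    entities: List[Entity],
    knowns: Dict[str, List[str]],
) -> Tuple[List[Entity], List[Entity]]:
    """Partition entities by whether their surface form is listed for their label.

    Assumption: an entity is "known" only on an exact string match.
    """
    known, unknown = [], []
    for label, surface in entities:
        if surface in knowns.get(label, []):
            known.append((label, surface))
        else:
            unknown.append((label, surface))
    return known, unknown


# Words seen in the training data, grouped by entity label.
knowns = {"PSN": ["花子"], "LOC": ["東京"]}
entities = [("PSN", "花子"), ("PSN", "山田太郎"), ("LOC", "東京"), ("LOC", "東京駅")]

known, unknown = split_known_unknown(entities, knowns)
print("known:", known)
print("unknown:", unknown)
```

Under this assumption, 東京駅 counts as unknown even though it contains the known word 東京.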

Support

  • Tagging Scheme
    • IOB2
    • BIOES
    • BIOUL
  • Metrics
    • precision
    • recall
    • F1 score
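The three tagging schemes differ only in how they mark entity boundaries: IOB2 uses B (begin) and I (inside); BIOES adds E (end) and S (single-token entity); BIOUL uses L (last) and U (unit) for the same two roles. The sketch below converts an IOB2 sequence into the other two schemes. The function names are hypothetical and are not part of miNER.

```python
from typing import List


def iob2_to_bioes(tags: List[str], single: str = "S", end: str = "E") -> List[str]:
    """Convert IOB2 tags to BIOES (or to BIOUL via the single/end prefixes)."""
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            out.append(f"B-{label}" if continues else f"{single}-{label}")
        else:  # prefix == "I"
            out.append(f"I-{label}" if continues else f"{end}-{label}")
    return out


def iob2_to_bioul(tags: List[str]) -> List[str]:
    """BIOUL is BIOES with U (unit) for S and L (last) for E."""
    return iob2_to_bioes(tags, single="U", end="L")


iob2 = "B-PSN I-PSN O O B-LOC I-LOC I-LOC O B-LOC".split()
bioes = iob2_to_bioes(iob2)
bioul = iob2_to_bioul(iob2)
print(" ".join(bioes))
print(" ".join(bioul))
```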

Requirements

  • Python 3
  • Cython

Installation

pip install mi-ner

Usage

Sample

>>> from miner import Miner
>>> answers = [
...     'B-PSN O O B-LOC O O O O'.split(' '),
...     'B-PSN I-PSN O O B-LOC I-LOC O O O O'.split(' '),
...     'S-PSN O O S-PSN O O B-LOC I-LOC E-LOC O O O O'.split(' ')
... ]
>>> predicts = [
...     'B-PSN O O B-LOC O O O O'.split(' '),
...     'B-PSN B-PSN O O B-LOC I-LOC O O O O'.split(' '),
...     'S-PSN O O O O O B-LOC I-LOC E-LOC O O O O'.split(' ')
... ]
>>> sentences = [
...     '花子 さん は 東京 に 行き まし た'.split(' '),
...     '山田 太郎 君 は 東京 駅 に 向かい まし た'.split(' '),
...     '花子 さん と ボブ くん は 東京 スカイ ツリー に 行き まし た'.split(' '),
... ]
>>> knowns = {'PSN': ['花子'], 'LOC': ['東京']}  # known words (words that appear in the training data)
>>> m = Miner(answers, predicts, sentences, knowns)
>>> m.default_report(True)
	precision    recall    f1_score   num
PSN	 0.500        0.500     0.500      4
LOC	 1.000        1.000     1.000      3
{'PSN': {'precision': 0.5, 'recall': 0.5, 'f1_score': 0.5, 'num': 4}, 'LOC': {'precision': 1.0, 'recall': 1.0, 'f1_score': 1.0, 'num': 3}}
>>> m.return_predict_named_entities()
{'known': {'PSN': ['花子'], 'LOC': ['東京']}, 'unknown': {'PSN': ['太郎', '山田'], 'LOC': ['東京駅', '東京スカイツリー']}}
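The scores in the sample report can be checked by hand with exact-match, entity-level scoring. The following is a self-contained sketch under that assumption, not miNER's implementation; `extract_spans` and `score` are hypothetical helpers. It accepts B/I/E/S/L/U prefixes because the sample mixes IOB2-style and BIOES-style tags.

```python
from typing import Dict, Iterable, List, Set, Tuple

Span = Tuple[int, int, int, str]  # (sentence index, start, end, label)


def extract_spans(tag_seqs: Iterable[List[str]]) -> Set[Span]:
    """Collect entity spans from tag sequences (B/I/E/S/L/U prefixes)."""
    spans: Set[Span] = set()
    for s, tags in enumerate(tag_seqs):
        start, label = None, None
        for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
            prefix, _, lab = tag.partition("-")
            # Close the open span if this tag does not continue it.
            if start is not None and not (prefix in ("I", "E", "L") and lab == label):
                spans.add((s, start, i - 1, label))
                start, label = None, None
            if prefix in ("B", "S", "U"):
                start, label = i, lab
            elif prefix in ("E", "L") and start is not None:
                spans.add((s, start, i, label))
                start, label = None, None
    return spans


def score(answers: List[List[str]], predicts: List[List[str]]) -> Dict[str, Dict[str, float]]:
    """Exact-match precision, recall, and F1 per label; num counts answer entities."""
    gold, pred = extract_spans(answers), extract_spans(predicts)
    report = {}
    for label in sorted({sp[3] for sp in gold | pred}):
        g = {sp for sp in gold if sp[3] == label}
        p = {sp for sp in pred if sp[3] == label}
        tp = len(g & p)
        precision = tp / len(p) if p else 0.0
        recall = tp / len(g) if g else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        report[label] = {"precision": precision, "recall": recall,
                         "f1_score": f1, "num": len(g)}
    return report


answers = [
    "B-PSN O O B-LOC O O O O".split(),
    "B-PSN I-PSN O O B-LOC I-LOC O O O O".split(),
    "S-PSN O O S-PSN O O B-LOC I-LOC E-LOC O O O O".split(),
]
predicts = [
    "B-PSN O O B-LOC O O O O".split(),
    "B-PSN B-PSN O O B-LOC I-LOC O O O O".split(),
    "S-PSN O O O O O B-LOC I-LOC E-LOC O O O O".split(),
]
report = score(answers, predicts)
print(report)
```

For PSN, two of the four predicted spans match two of the four answer spans (山田 and 太郎 are predicted as separate entities and ボブ is missed), so precision and recall are both 2/4 = 0.5. All three LOC spans match.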

Methods

  • default_report(print_): returns the named entity recognition results. If print_=True, also prints them.
  • known_only_report(print_): returns the results for known named entities only.
  • unknown_only_report(print_): returns the results for unknown named entities only.
  • return_predict_named_entities(): returns the named entities found by the predicted labels (predicts).
  • return_answer_named_entities(): returns the named entities found by the answer labels (answers).
  • return_miss_labelings(): returns the sentences that were mislabeled.
  • segmentation_score(mode): returns the percentage of labels on which answers and predicts match. If mode is known or unknown, returns the labeling accuracy for known or unknown named entities only.

License

MIT
