
Ontonotes-5-parsing: a parser that converts the Ontonotes 5.0 corpus into a simple JSON format.

Project description

A simple parser for the well-known Ontonotes 5 dataset: https://catalog.ldc.upenn.edu/LDC2013T19

This dataset is very useful for experiments with NER (Named Entity Recognition). Moreover, Ontonotes 5 covers three languages (English, Arabic, and Chinese), which makes it attractive for experiments with multilingual NER. However, the source format of Ontonotes 5 is, in my view, very intricate. Accordingly, the goal of this project is a dedicated parser that transforms Ontonotes 5 into a simple JSON format. In this format, each annotated sentence is represented as a dictionary with five keys: text, morphology, syntax, entities, and language. In turn, morphology, syntax, and entities are themselves dictionaries, each of which describes labels (part-of-speech labels, syntactic tags, or entity classes) and their bounds in the corresponding text.
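To make the format described above more concrete, here is a minimal sketch of what one sentence record might look like. Only the five top-level keys come from the description. The label names, the representation of bounds as [start, end) character offsets, the "english" value, and the helper spans_to_strings are assumptions made for illustration, so check the project documentation for the exact layout.

```python
import json

# Hypothetical record. The five keys are taken from the project description;
# the labels, the [start, end) character-offset convention, and the language
# value are assumptions for illustration only.
sentence = {
    "text": "John lives in Paris.",
    "morphology": {
        "NNP": [[0, 4], [14, 19]],
        "VBZ": [[5, 10]],
        "IN": [[11, 13]],
        ".": [[19, 20]],
    },
    "syntax": {
        "NP": [[0, 4], [14, 19]],
        "VP": [[5, 19]],
        "PP": [[11, 19]],
    },
    "entities": {
        "PERSON": [[0, 4]],
        "GPE": [[14, 19]],
    },
    "language": "english",
}


def spans_to_strings(text, labelled_bounds):
    """Map each label to the substrings of `text` covered by its bounds."""
    return {
        label: [text[start:end] for start, end in bounds]
        for label, bounds in labelled_bounds.items()
    }


# Recover the surface form of every entity from its bounds.
entity_texts = spans_to_strings(sentence["text"], sentence["entities"])

# The record contains only plain dicts, lists, strings and integers,
# so it survives a JSON round trip unchanged.
restored = json.loads(json.dumps(sentence))
```

Under these assumptions, every label layer can be resolved against the same text field, so part-of-speech tags, syntactic constituents, and entities can be processed with one helper.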

More detailed information about this Ontonotes 5 parser is available in the short documentation: https://github.com/nsu-ai/ontonotes-5-parsing/blob/master/readme.md

Download files


Source Distribution

ontonotes-5-parsing-0.0.5.tar.gz (12.3 kB, source)

Built Distribution

ontonotes_5_parsing-0.0.5-py3-none-any.whl (16.1 kB, Python 3)

File details

Details for the file ontonotes-5-parsing-0.0.5.tar.gz.

File metadata

  • Download URL: ontonotes-5-parsing-0.0.5.tar.gz
  • Upload date:
  • Size: 12.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.23.0 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.6

File hashes

Hashes for ontonotes-5-parsing-0.0.5.tar.gz

Algorithm    Hash digest
SHA256       620c2b6efea7f2e0edcb19bc1def91efc9232897e969c91b858d02241cd7f738
MD5          dfaef59f2ec5a7da5f6439a58deac635
BLAKE2b-256  cf97edf5c59ddebeeef0162867b17050b71c845694a4b7735dc5ea69b1bd8700


File details

Details for the file ontonotes_5_parsing-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: ontonotes_5_parsing-0.0.5-py3-none-any.whl
  • Upload date:
  • Size: 16.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.23.0 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.6

File hashes

Hashes for ontonotes_5_parsing-0.0.5-py3-none-any.whl

Algorithm    Hash digest
SHA256       1379683fa315418313006e75c678f74a3a9e7b853c3996ccab4891eb1acce3f7
MD5          52c9c3838f2d7db6bc696735f20fea63
BLAKE2b-256  5bee886a2b5faea82c9e665caf055dcdd868ce806682b1f44f6bd898d7405c3c
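The digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library. The function names sha256_of_file and matches_published_hash are hypothetical helpers introduced here, not part of the package, and the sketch assumes the downloaded file keeps its original name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file listings above.
EXPECTED_SHA256 = {
    "ontonotes-5-parsing-0.0.5.tar.gz":
        "620c2b6efea7f2e0edcb19bc1def91efc9232897e969c91b858d02241cd7f738",
    "ontonotes_5_parsing-0.0.5-py3-none-any.whl":
        "1379683fa315418313006e75c678f74a3a9e7b853c3996ccab4891eb1acce3f7",
}


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path):
    """Check a downloaded file against the digest listed for its file name.

    Returns False when the file name is not one of the two listed files
    or when the computed digest differs from the published one.
    """
    expected = EXPECTED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of_file(path) == expected
```

For example, matches_published_hash("ontonotes-5-parsing-0.0.5.tar.gz") should return True for an intact copy of the source distribution saved in the current directory.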

