Ontonotes-5-parsing: parser of Ontonotes 5.0 to transform this corpus to a simple JSON format.

Project description

A simple parser of the famous Ontonotes 5 dataset https://catalog.ldc.upenn.edu/LDC2013T19

This dataset is very useful for experiments with NER (Named Entity Recognition). Moreover, Ontonotes 5 covers three languages (English, Arabic, and Chinese), which makes it attractive for experiments with multilingual NER. But the source format of Ontonotes 5 is very intricate, in my view. Accordingly, the goal of this project is to provide a parser that transforms Ontonotes 5 into a simple JSON format. In this format, each annotated sentence is represented as a dictionary with five keys: text, morphology, syntax, entities, and language. In turn, morphology, syntax, and entities are themselves dictionaries, each of which describes labels (part-of-speech labels, syntactic tags, or entity classes) and their bounds in the corresponding text.
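To make the structure above more concrete, here is a minimal sketch of what one sentence record might look like and how the bounds could be used. The five keys come from the description above. The example sentence, the label names, the value "english", and the assumption that bounds are [start, end) character offsets are illustrative guesses, not taken from the parser's actual output. The helper spans_to_strings is hypothetical and not part of the package.

```python
import json
from typing import Dict, List

# Hypothetical sentence record. Only the five top-level keys are taken from
# the project description; labels and the [start, end) character-offset
# layout of the bounds are assumptions made for illustration.
sentence = {
    "text": "John lives in Paris.",
    "morphology": {"NNP": [[0, 4], [14, 19]], "VBZ": [[5, 10]], "IN": [[11, 13]]},
    "syntax": {"NP": [[0, 4], [14, 19]], "VP": [[5, 19]]},
    "entities": {"PERSON": [[0, 4]], "GPE": [[14, 19]]},
    "language": "english",
}


def spans_to_strings(text: str, labels: Dict[str, List[List[int]]]) -> Dict[str, List[str]]:
    """Map each label to the substrings of `text` covered by its bounds."""
    return {
        label: [text[start:end] for start, end in bounds]
        for label, bounds in labels.items()
    }


# The record survives a JSON round trip unchanged because it only uses
# dictionaries, lists, strings, and integers.
restored = json.loads(json.dumps(sentence))
entity_strings = spans_to_strings(restored["text"], restored["entities"])
print(entity_strings)
```

Keeping the bounds separate from the text means the same helper works for all three annotation layers, assuming they share the offset convention.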

More detailed information about this Ontonotes 5 parser is available in the short documentation: https://github.com/nsu-ai/ontonotes-5-parsing/blob/master/readme.md

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ontonotes-5-parsing-0.0.4.tar.gz (12.2 kB)

Uploaded Source

Built Distribution

ontonotes_5_parsing-0.0.4-py3-none-any.whl (16.0 kB)

Uploaded Python 3

File details

Details for the file ontonotes-5-parsing-0.0.4.tar.gz.

File metadata

  • Download URL: ontonotes-5-parsing-0.0.4.tar.gz
  • Upload date:
  • Size: 12.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.23.0 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.6

File hashes

Hashes for ontonotes-5-parsing-0.0.4.tar.gz
Algorithm Hash digest
SHA256 489c4a6b2915496c3e3b7419a00c909e28ff1fb0231f50da6ca8373350eadc14
MD5 51271f8ba528b574891f639c7a3ae985
BLAKE2b-256 d763be0dc965ccd194eca9d4bbd92ad9c72dc122076e87bdf22aafe8a7b1bf55
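
As a minimal sketch of how the digests listed above could be used, the snippet below computes the SHA256 of a downloaded file with Python's standard hashlib module and compares it with the published value for the source distribution. The local file path is an assumption, and the helper sha256_of_file is a hypothetical name introduced here for illustration.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for ontonotes-5-parsing-0.0.4.tar.gz.
EXPECTED_SHA256 = "489c4a6b2915496c3e3b7419a00c909e28ff1fb0231f50da6ca8373350eadc14"


def sha256_of_file(path, chunk_size=65536):
    """Return the hexadecimal SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Assumed location of the downloaded archive; adjust as needed.
    archive = Path("ontonotes-5-parsing-0.0.4.tar.gz")
    if archive.exists():
        matches = sha256_of_file(archive) == EXPECTED_SHA256
        print("digest matches" if matches else "digest mismatch")
```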

File details

Details for the file ontonotes_5_parsing-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: ontonotes_5_parsing-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 16.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.23.0 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.6

File hashes

Hashes for ontonotes_5_parsing-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 5cbad455a9a53cbbce605f4dd3d5c2ef027611fbf479a000e34805e468d8cfc9
MD5 f4ab68b2afa7e239820240b13aafe752
BLAKE2b-256 f1a9a938f64892cceec678e225e3722d270677fe36ad25ba951358f1007ec910
