Ontonotes-5-parsing: a parser that transforms the Ontonotes 5.0 corpus into a simple JSON format.
Project description
A simple parser for the famous Ontonotes 5 dataset: https://catalog.ldc.upenn.edu/LDC2013T19
This dataset is very useful for experiments with NER (Named Entity Recognition). Ontonotes 5 also covers three languages (English, Arabic, and Chinese), which makes it attractive for experiments with multilingual NER. However, the source format of Ontonotes 5 is, in my view, very intricate. Accordingly, the goal of this project is a parser that transforms Ontonotes 5 into a simple JSON format. In this format, each annotated sentence is represented as a dictionary with five keys: text, morphology, syntax, entities, and language. In turn, morphology, syntax, and entities are themselves dictionaries, each of which describes labels (part-of-speech labels, syntactic tags, or entity classes) and their bounds in the corresponding text.
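As an illustration of the format described above, the following minimal sketch builds one hypothetical record and reads the labelled spans back out of it. Only the five top-level keys come from the description. Everything else is an assumption made for this example: that each label maps to a list of `[start, end)` character offsets into `text`, the sample sentence and its labels, the value `"english"`, and the helper name `spans_to_substrings`. The project's documentation remains the authority on the exact layout.

```python
import json

# The five keys named in the project description.
REQUIRED_KEYS = {"text", "morphology", "syntax", "entities", "language"}

# Hypothetical record. The span encoding (label -> list of [start, end)
# character offsets into "text") is an assumption, not the confirmed layout.
record = {
    "text": "John lives in Paris",
    "morphology": {"NNP": [[0, 4], [14, 19]], "VBZ": [[5, 10]], "IN": [[11, 13]]},
    "syntax": {"NP": [[0, 4], [14, 19]], "VP": [[5, 19]], "PP": [[11, 19]]},
    "entities": {"PERSON": [[0, 4]], "GPE": [[14, 19]]},
    "language": "english",
}


def spans_to_substrings(record, layer):
    """Return {label: [substring, ...]} for one annotation layer.

    `layer` is one of "morphology", "syntax", or "entities".
    """
    text = record["text"]
    return {
        label: [text[start:end] for start, end in bounds]
        for label, bounds in record[layer].items()
    }


# Check that the record carries all five expected keys.
assert REQUIRED_KEYS.issubset(record)

# Recover the entity mentions from their character bounds.
print(spans_to_substrings(record, "entities"))

# The record is plain JSON-serializable data.
print(json.dumps(record, ensure_ascii=False))
```

Storing bounds as character offsets, as assumed here, keeps the original text untouched and lets the three annotation layers overlap freely.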
More detailed information about this Ontonotes 5 parser is available in the short documentation: https://github.com/nsu-ai/ontonotes-5-parsing/blob/master/readme.md
Download files
Source Distribution
Built Distribution
File details
Details for the file ontonotes-5-parsing-0.0.4.tar.gz.
File metadata
- Download URL: ontonotes-5-parsing-0.0.4.tar.gz
- Upload date:
- Size: 12.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.23.0 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 489c4a6b2915496c3e3b7419a00c909e28ff1fb0231f50da6ca8373350eadc14
MD5 | 51271f8ba528b574891f639c7a3ae985
BLAKE2b-256 | d763be0dc965ccd194eca9d4bbd92ad9c72dc122076e87bdf22aafe8a7b1bf55
File details
Details for the file ontonotes_5_parsing-0.0.4-py3-none-any.whl.
File metadata
- Download URL: ontonotes_5_parsing-0.0.4-py3-none-any.whl
- Upload date:
- Size: 16.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.23.0 setuptools/52.0.0.post20210125 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5cbad455a9a53cbbce605f4dd3d5c2ef027611fbf479a000e34805e468d8cfc9
MD5 | f4ab68b2afa7e239820240b13aafe752
BLAKE2b-256 | f1a9a938f64892cceec678e225e3722d270677fe36ad25ba951358f1007ec910
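
The SHA256 digests listed above can be used to check a manually downloaded file. The sketch below is one possible way to do that with Python's standard `hashlib` module. The function names `sha256_of` and `verify_downloads`, and the assumption that both files sit in one local directory, are illustrative and not part of the package.

```python
import hashlib
import os

# SHA256 digests copied from the file details above.
PUBLISHED_SHA256 = {
    "ontonotes-5-parsing-0.0.4.tar.gz":
        "489c4a6b2915496c3e3b7419a00c909e28ff1fb0231f50da6ca8373350eadc14",
    "ontonotes_5_parsing-0.0.4-py3-none-any.whl":
        "5cbad455a9a53cbbce605f4dd3d5c2ef027611fbf479a000e34805e468d8cfc9",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory):
    """Map each known filename to True/False (digest match) or None (file absent)."""
    results = {}
    for name, expected in PUBLISHED_SHA256.items():
        path = os.path.join(directory, name)
        if os.path.exists(path):
            results[name] = sha256_of(path) == expected
        else:
            results[name] = None
    return results
```

Calling `verify_downloads(".")` in the download directory reports, for each of the two files, whether its digest matches the published one.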