Converter of UIMA CAS XMI files from INCEpTION with nested NER tags, NEL tags and components into IOB TSV files
CAS2IOB
CAS2IOB converts UIMA CAS XMI files used in the INCEpTION annotation platform into IOB TSV files. In contrast to INCEpTION's internal converter, it handles nested NER tags, NEL tags and components, and saves them in multiple columns of a TSV file:
TOKEN NE-COARSE NE-FINE NE-FINE-COMP NE-NESTED NEL-WikidataQID
It reads the UIMA CAS XMI files using the dkpro-cassis library.
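To illustrate why nested annotations need separate columns, here is a minimal sketch of turning token-level spans into IOB tags, one column per annotation layer. It is not the CAS2IOB implementation: the `spans_to_iob` helper, the example tokens and the `org`/`loc` labels are all assumptions made up for this example.

```python
from typing import List, Tuple

# (start token index, end token index exclusive, label); hypothetical structure
Span = Tuple[int, int, str]


def spans_to_iob(n_tokens: int, spans: List[Span]) -> List[str]:
    """Turn non-overlapping token spans into one column of IOB tags."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        tags[start] = f"B-{label}"      # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"      # remaining tokens of the entity
    return tags


tokens = ["University", "of", "Zurich", "opened"]
# Invented annotations: an outer organisation containing a nested location.
outer_spans = [(0, 3, "org")]
nested_spans = [(2, 3, "loc")]

# Each layer gets its own column, so the nested entity does not overwrite the outer one.
coarse_column = spans_to_iob(len(tokens), outer_spans)
nested_column = spans_to_iob(len(tokens), nested_spans)

for row in zip(tokens, coarse_column, nested_column):
    print("\t".join(row))
```

A single IOB column can hold only one tag per token, which is why the nested entity on "Zurich" goes into a second column in this sketch.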
Installation
pip install cas2iob
Using as a library
Import cas2iob:
import cas2iob
Convert ./input.xmi with ./TypeSystem.xml into ./output.tsv:
cas2iob.file('./input.xmi', './output.tsv')
Convert all files in the ./input folder with ./TypeSystem.xml into the ./output folder:
cas2iob.folder('./input', './output')
If ./TypeSystem.xml is located in a different folder, pass its path as the third argument to the commands above.
If you don't want to include column names in the TSV file, add the fourth argument metadata=False.
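Whether the column names are present matters when reading the output back. The sketch below reads such a TSV with Python's standard csv module and works with or without the column-name line. It assumes the column names occupy exactly one first line and that blank lines can be skipped. The `read_iob_tsv` helper and the sample rows are hypothetical, not real CAS2IOB output.

```python
import csv
import io

# Column order as listed in this README.
COLUMNS = ["TOKEN", "NE-COARSE", "NE-FINE", "NE-FINE-COMP",
           "NE-NESTED", "NEL-WikidataQID"]


def read_iob_tsv(handle, has_header=True):
    """Read an IOB TSV into a list of dicts keyed by column name."""
    # QUOTE_NONE keeps tokens such as a lone quotation mark intact.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = [row for row in reader if row]   # skip blank lines
    if has_header:
        rows = rows[1:]                     # drop the column-name line
    return [dict(zip(COLUMNS, row)) for row in rows]


# Invented sample rows, for illustration only.
body = "Zurich\tB-loc\tB-loc.adm.town\tO\tO\tQ72\nopened\tO\tO\tO\tO\t_\n"
with_header = "\t".join(COLUMNS) + "\n" + body

rows_a = read_iob_tsv(io.StringIO(with_header), has_header=True)
rows_b = read_iob_tsv(io.StringIO(body), has_header=False)
print(rows_a[0]["TOKEN"], rows_a[0]["NE-COARSE"])
```

For a file on disk, open it with `open(path, encoding="utf-8", newline="")` and pass the handle to the same function.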
Using in CLI
% cas2iob --help
Usage: cas2iob [OPTIONS] INPUT_PATH OUTPUT_PATH [TYPESYSTEM_XML] [METADATA]
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ * input_path PATH [default: None] [required] │
│ * output_path PATH [default: None] [required] │
│ typesystem_xml [TYPESYSTEM_XML] [default: ./TypeSystem.xml] │
│ metadata [METADATA] [default: True] │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --install-completion Install completion for the current shell. │
│ --show-completion Show completion for the current shell, to copy │
│ it or customize the installation. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────╯