Converter of UIMA CAS XMI files from INCEpTION with nested NER tags, NEL tags and components into IOB TSV files

CAS2IOB

CAS2IOB converts UIMA CAS XMI files, as used by the INCEpTION annotation platform, into IOB TSV files. Unlike the built-in converter in INCEpTION, it handles nested NER tags, NEL tags and components, and writes them to separate columns of the TSV file:

TOKEN  NE-COARSE   NE-FINE NE-FINE-COMP    NE-NESTED   NEL-WikidataQID
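
A file in this layout can be read back with any TSV reader. The sketch below uses Python's standard csv module. It assumes the file is tab-delimited and starts with the column names shown above (metadata enabled). The sample rows, tag values and the helper name read_iob_tsv are invented for illustration and are not real converter output.

```python
import csv
import io

# Hypothetical sample in the column layout shown above; the tag values
# and the "_" placeholder are illustrative, not real cas2iob output.
SAMPLE = (
    "TOKEN\tNE-COARSE\tNE-FINE\tNE-FINE-COMP\tNE-NESTED\tNEL-WikidataQID\n"
    "Marie\tB-pers\tB-pers.ind\tB-comp.name\tO\tQ7186\n"
    "Curie\tI-pers\tI-pers.ind\tI-comp.name\tO\tQ7186\n"
    "visited\tO\tO\tO\tO\t_\n"
)


def read_iob_tsv(handle):
    """Return one dict per token row, keyed by the header column names."""
    # QUOTE_NONE keeps quote characters inside tokens as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


rows = read_iob_tsv(io.StringIO(SAMPLE))

# Collect the tokens that carry a coarse entity tag (anything but "O").
entity_tokens = [row["TOKEN"] for row in rows if row["NE-COARSE"] != "O"]
print(entity_tokens)
```

For a real file, pass an open file handle instead, for example `open('output.tsv', newline='', encoding='utf-8')`.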

It reads the UIMA CAS XMI files using the dkpro-cassis library.

Installation

pip install cas2iob

Using as a library

Import cas2iob:

import cas2iob

Convert ./input.xmi with ./TypeSystem.xml into ./output.tsv:

cas2iob.file('./input.xmi', './output.tsv')

Convert all files in ./input folder with ./TypeSystem.xml into ./output folder:

cas2iob.folder('./input', './output')

If ./TypeSystem.xml is located in a different folder, add it to the commands above as the third argument.

If you don't want to include the column names in the TSV file, add the fourth argument metadata=False.
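
Combining both optional arguments, a call might look like the sketch below. It assumes the type system path is accepted positionally as the third argument and that metadata can be passed as a keyword, as the two notes above describe. The wrapper name convert_without_header and the ./types/TypeSystem.xml path are hypothetical.

```python
def convert_without_header(input_xmi, output_tsv, typesystem_xml):
    """Convert one XMI file and omit the column-name row (hypothetical helper)."""
    # Imported here so this module still loads where cas2iob is not installed.
    import cas2iob

    # Third positional argument: path to the type system file.
    # metadata=False: leave the column names out of the TSV file.
    cas2iob.file(input_xmi, output_tsv, typesystem_xml, metadata=False)


# Example call (paths are placeholders):
# convert_without_header('./input.xmi', './output.tsv', './types/TypeSystem.xml')
```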

Using the CLI

% cas2iob --help
                                                                                
 Usage: cas2iob [OPTIONS] INPUT_PATH OUTPUT_PATH [TYPESYSTEM_XML] [METADATA]    
                                                                                
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ *    input_path          PATH              [default: None] [required]        │
│ *    output_path         PATH              [default: None] [required]        │
│      typesystem_xml      [TYPESYSTEM_XML]  [default: ./TypeSystem.xml]       │
│      metadata            [METADATA]        [default: True]                   │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --install-completion          Install completion for the current shell.      │
│ --show-completion             Show completion for the current shell, to copy │
│                               it or customize the installation.              │
│ --help                        Show this message and exit.                    │
╰──────────────────────────────────────────────────────────────────────────────╯
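
If the command needs to be run from a script, it can be wrapped with subprocess. This is only a sketch: the function names build_cas2iob_command and run_cas2iob are hypothetical, and it relies solely on the positional order shown in the usage line above (INPUT_PATH OUTPUT_PATH [TYPESYSTEM_XML]). The METADATA argument is left out because the help text does not show how its value should be written.

```python
import subprocess


def build_cas2iob_command(input_path, output_path, typesystem_xml=None):
    """Build the argument list in the positional order shown by --help."""
    command = ["cas2iob", str(input_path), str(output_path)]
    if typesystem_xml is not None:
        # Optional third positional argument; when omitted, the CLI
        # falls back to its default, ./TypeSystem.xml.
        command.append(str(typesystem_xml))
    return command


def run_cas2iob(input_path, output_path, typesystem_xml=None):
    """Run the CLI; raises CalledProcessError on a non-zero exit status."""
    command = build_cas2iob_command(input_path, output_path, typesystem_xml)
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.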

Download files

Source Distribution

cas2iob-0.1.0.tar.gz (4.5 kB)

Uploaded Source

Built Distribution

cas2iob-0.1.0-py3-none-any.whl (5.1 kB)

Uploaded Python 3
