Converter of UIMA CAS XMI files from INCEpTION with nested NER tags, NEL tags and components into IOB TSV files

Project description

CAS2IOB

CAS2IOB converts UIMA CAS XMI files, as used in the INCEpTION annotation platform, into IOB TSV files. Unlike INCEpTION's built-in converter, it handles nested NER tags, NEL tags and components, and saves them into separate columns of a TSV file:

TOKEN  NE-COARSE   NE-FINE NE-FINE-COMP    NE-NESTED   NEL-WikidataQID

It reads the UIMA CAS XMI files using the dkpro-cassis library.
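
To illustrate how a file in this column layout might be consumed downstream, here is a minimal sketch using Python's standard csv module. It assumes the output is tab-separated and begins with a header row holding the column names listed above. The sample rows and tag values are invented for illustration and are not taken from real cas2iob output.

```python
import csv
import io

# Hypothetical sample in the column layout above; tag values are made up.
SAMPLE_TSV = (
    "TOKEN\tNE-COARSE\tNE-FINE\tNE-FINE-COMP\tNE-NESTED\tNEL-WikidataQID\n"
    "Douglas\tB-pers\tB-pers.ind\tB-comp.name\tO\tQ42\n"
    "Adams\tI-pers\tI-pers.ind\tI-comp.name\tO\tQ42\n"
    "wrote\tO\tO\tO\tO\t_\n"
)


def read_iob_rows(handle):
    """Return one dict per token, keyed by the column names in the header row."""
    # QUOTE_NONE keeps quote characters inside tokens from being interpreted.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


rows = read_iob_rows(io.StringIO(SAMPLE_TSV))

# Pair each token with its coarse-grained entity tag.
coarse_tags = [(row["TOKEN"], row["NE-COARSE"]) for row in rows]
print(coarse_tags)
```

For a real file, pass an open file handle (opened with newline="" and encoding="utf-8") in place of the io.StringIO object.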

Installation

pip install cas2iob

Using as a library

Import cas2iob:

import cas2iob

Convert ./input.xmi with ./TypeSystem.xml into ./output.tsv:

cas2iob.file('./input.xmi', './output.tsv')

Convert all files in the ./input folder with ./TypeSystem.xml into the ./output folder:

cas2iob.folder('./input', './output')

If ./TypeSystem.xml is located in a different folder, add it to the commands above as the third argument.

If you don't want to include column names in the TSV file, add the fourth argument metadata=False.
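
The two optional arguments can be combined. The sketch below follows the description above: the type system path goes in as the third positional argument and metadata=False as the fourth. The wrapper name convert_without_header and the ./schemas/TypeSystem.xml path are hypothetical, chosen only for this example.

```python
def convert_without_header(input_xmi, output_tsv, typesystem_xml):
    """Convert one XMI file with an explicit type system and no column names."""
    # Imported inside the function so this sketch loads even where
    # cas2iob is not installed.
    import cas2iob

    # Third positional argument: location of TypeSystem.xml.
    # metadata=False: leave the column names out of the TSV file.
    cas2iob.file(input_xmi, output_tsv, typesystem_xml, metadata=False)


# Example call with assumed paths:
# convert_without_header('./input.xmi', './output.tsv', './schemas/TypeSystem.xml')
```

Going by the text above, the same pair of arguments should work with cas2iob.folder for whole directories.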

Using from the CLI

% cas2iob --help
                                                                                
 Usage: cas2iob [OPTIONS] INPUT_PATH OUTPUT_PATH [TYPESYSTEM_XML] [METADATA]    
                                                                                
╭─ Arguments ──────────────────────────────────────────────────────────────────╮
│ *    input_path          PATH              [default: None] [required]        │
│ *    output_path         PATH              [default: None] [required]        │
│      typesystem_xml      [TYPESYSTEM_XML]  [default: ./TypeSystem.xml]       │
│      metadata            [METADATA]        [default: True]                   │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────╮
│ --install-completion          Install completion for the current shell.      │
│ --show-completion             Show completion for the current shell, to copy │
│                               it or customize the installation.              │
│ --help                        Show this message and exit.                    │
╰──────────────────────────────────────────────────────────────────────────────╯
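
Because TYPESYSTEM_XML and METADATA are both positional, the type system path has to be given whenever METADATA is given. The sketch below builds the argument list in the order shown in the usage line and can run it with subprocess. The helper names are hypothetical. It also assumes the CLI accepts the literal string False for METADATA, which the help output does not confirm.

```python
import subprocess


def build_cas2iob_command(input_path, output_path, typesystem_xml=None, metadata=True):
    """Return the argv list for the cas2iob CLI in its positional order."""
    command = ["cas2iob", str(input_path), str(output_path)]
    # METADATA is the fourth positional argument, so the third must be present
    # whenever it is passed; fall back to the documented default path.
    if typesystem_xml is not None or not metadata:
        command.append(str(typesystem_xml or "./TypeSystem.xml"))
    if not metadata:
        command.append("False")  # assumed spelling of the boolean value
    return command


def run_cas2iob(*args, **kwargs):
    """Run the CLI and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_cas2iob_command(*args, **kwargs), check=True)


command = build_cas2iob_command("./input", "./output", metadata=False)
print(command)
```

Calling run_cas2iob with the same arguments would execute the command, provided cas2iob is installed and on the PATH.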

