
OpenPecha Toolkit provides state-of-the-art distributed standoff annotations on moving texts


OpenPecha Toolkit


OpenPecha Toolkit provides state-of-the-art distributed standoff annotations on moving texts, in which the base layer can be edited without affecting the annotations.

The motivation for this project is that a perfect base-text poses no big obstacles; the technical problems arise when the base-text has to be edited, whether to correct it or to update it. Existing solutions such as plain character coordinates stop working in that case, because every edit shifts the coordinates. So we proposed the CCTV (Character Coordinate Translation Vector) to carry annotations over from the source base-text to the edited base-text, without the editor having to worry about the annotations at all. The user can then export the edited base-text with updated annotations to various document formats such as .md, .epub, and .pdf. Currently only Markdown export is supported.
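To make the idea of translating character coordinates concrete, here is a minimal sketch in Python. It is not the toolkit's own implementation: it assumes that a translation vector can be approximated by the matching blocks reported by the standard library's difflib.SequenceMatcher, and the names build_translation_vector, translate and update_span are hypothetical.

```python
from difflib import SequenceMatcher


def build_translation_vector(source, edited):
    """Return (start, end, offset) triples: every source index in
    [start, end) is found at index + offset in the edited text.

    Assumption: difflib stands in for whatever diff the toolkit uses.
    """
    matcher = SequenceMatcher(None, source, edited, autojunk=False)
    vector = []
    for a, b, size in matcher.get_matching_blocks():
        if size:  # the final block is always an empty sentinel
            vector.append((a, a + size, b - a))
    return vector


def translate(index, vector):
    """Map one source character index to the edited text."""
    for start, end, offset in vector:
        if start <= index < end:
            return index + offset
        if index < start:
            # The character was deleted or replaced: snap forward to
            # the next surviving block.
            return start + offset
    if vector:
        # Past the last surviving block: snap to its end.
        _, end, offset = vector[-1]
        return end + offset
    return 0


def update_span(span, vector):
    """Translate an annotation span (start, end), end exclusive."""
    start, end = span
    return translate(start, vector), translate(end - 1, vector) + 1


source = "The quick brown fox"
edited = "The very quick brown fox"
vector = build_translation_vector(source, edited)

# An annotation covering "quick" in the source base-text.
new_start, new_end = update_span((4, 9), vector)
print(edited[new_start:new_end])
```

The annotation itself is never rewritten by hand; only its coordinates are passed through the vector. Spans whose characters were deleted by the edit need a policy of their own (drop, shrink, or flag for review), which this sketch handles only by snapping to the nearest surviving text.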

For NLP, this toolkit provides a way to build annotated corpora with minimal errors and to extract a particular type of annotation or a collection of different types of annotations. NLP researchers can then use these corpora to build language models, and use the annotations to build NER models, entity linking, etc.
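As an illustration of extracting one type of annotation from a standoff-annotated text, the sketch below keeps the base-text and the annotations separate and filters by layer name. The Annotation class, the extract function and the sample text are hypothetical and do not reflect the toolkit's storage format; only the layer names title and quotes are taken from the CLI options listed under Usage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    layer: str  # annotation type, e.g. "title" or "quotes"
    start: int  # first character of the span in the base-text
    end: int    # one past the last character of the span


def extract(base_text, annotations, layers):
    """Return (layer, text) pairs for the requested annotation layers."""
    wanted = set(layers)
    return [
        (a.layer, base_text[a.start:a.end])
        for a in annotations
        if a.layer in wanted
    ]


# Hypothetical sample data; offsets are computed rather than hard-coded.
base_text = "Heart Sutra. Thus have I heard."
title = "Heart Sutra"
quote = "Thus have I heard."
annotations = [
    Annotation("title", base_text.index(title), base_text.index(title) + len(title)),
    Annotation("quotes", base_text.index(quote), base_text.index(quote) + len(quote)),
]

print(extract(base_text, annotations, ["title"]))
print(extract(base_text, annotations, ["title", "quotes"]))
```

Because the annotations only point into the base-text, a single layer can be pulled out for NER training, for example, while the plain base-text remains available as a language-model corpus.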

Prerequisite

  • Python 3, which you can download from python.org

Installation

$ pip install openpecha

Usage

First, we need to download all the pecha from OpenPecha.

$ openpecha download --help
Usage: openpecha download [OPTIONS]

  Command to download poti. You need to give a work-id of a poti to download it.

Options:
  -n, --number WORK_ID      Work-id of the poti, for single poti download
  --help                    Show this message

Automatically updating annotations from the source base-text (original) to the destination base-text (edited)

$ openpecha update --help
Usage: openpecha update [OPTIONS] WORK_ID

  Command to update the base text with your edits.

Options:
  --help  Show this message and exit.

Exporting and extracting layers

$ openpecha layer --help 
Usage: openpecha layer [OPTIONS] WORK_ID OUT

  Command to apply a single layer, multiple layers or all available layers
  (by default) and then export to markdown.

  Args:

      - WORK_ID is the work-id of the poti, from which given layer will be
      applied

      - OUT is the filename to the write the result. Currently support only
      Markdown file.

Options:
  -n, --name [title|tsawa|yigchung|quotes|sapche]
                                  name of a layer to be applied
  -l, --list TEXT                 list of name of layers to applied,
                                  name of layers should be comma separated
  --help                          Show this message and exit.
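For scripted exports, the command line can be assembled from Python. The sketch below only builds the argument list from the options shown in the help text above; the helper layer_command and the work-id W0001 are hypothetical placeholders, and running the result assumes the openpecha command is installed and the work has already been downloaded.

```python
import shlex

# Layer names as listed by `openpecha layer --help`.
LAYER_NAMES = ("title", "tsawa", "yigchung", "quotes", "sapche")


def layer_command(work_id, out, name=None, layers=None):
    """Build the argv list for `openpecha layer` (hypothetical helper)."""
    cmd = ["openpecha", "layer"]
    if name is not None:
        if name not in LAYER_NAMES:
            raise ValueError(f"unknown layer: {name}")
        cmd += ["--name", name]
    if layers:
        # --list takes the layer names as one comma-separated value.
        cmd += ["--list", ",".join(layers)]
    cmd += [work_id, out]
    return cmd


# "W0001" is a placeholder work-id, not a real one.
print(shlex.join(layer_command("W0001", "out.md", name="title")))
print(shlex.join(layer_command("W0001", "out.md", layers=["title", "quotes"])))
```

The returned list can be handed to subprocess.run(cmd, check=True) once the command is available on the PATH. With neither name nor layers given, the command applies all available layers, as the help text describes.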

Developer Installation

$ git clone https://github.com/OpenPoti/openpecha-toolkit.git
$ cd openpecha-toolkit
$ pip install -r requirements.txt
$ pip install -e .

Testing

$ pytest tests
