Korean Dataset Loading Interface

KODALI (Korean Dataset Loading Interface)

1. Supported datasets

Named entity recognition

  • Modu Corpus (모두의 말뭉치, Korean Corpus)
  • KLUE-NER (KLUE)

Relation extraction

  • AI Hub: Korean Knowledge-Based Relation Data (한국어 지식기반 관계 데이터)
  • KLUE-RE

2. How to use

Named entity recognition

dataset = Kodali(path={PATH}, task="ner", source={DATASET_SOURCE})

# Accepted values for `source`
# - Modu Corpus (모두의 말뭉치): "korean-corpus"
# - KLUE-NER: "klue"
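
The source list above maps each dataset to a fixed string, so a small lookup can catch typos before the loader is called. The following is only a sketch: `NER_SOURCES` and `resolve_ner_source` are hypothetical names that are not part of KODALI, and only the two source strings come from the list above.

```python
# Hypothetical helper, not part of KODALI: maps the dataset names from the
# source list to the strings expected by the `source` argument.
NER_SOURCES = {
    "모두의 말뭉치": "korean-corpus",
    "KLUE-NER": "klue",
}


def resolve_ner_source(dataset_name: str) -> str:
    """Return the `source` string for a dataset name, or raise ValueError."""
    try:
        return NER_SOURCES[dataset_name]
    except KeyError:
        valid = ", ".join(sorted(NER_SOURCES))
        raise ValueError(
            f"Unknown NER dataset {dataset_name!r}; expected one of: {valid}"
        ) from None


source = resolve_ner_source("KLUE-NER")
print(source)
```

The resolved value would then be passed as the `source` argument in the call shown above.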

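3. Data format

Named entity recognition

NER datasets are provided in the NerOutputs, NerData, and NerEntity formats.
NerEntity holds the information for a single entity, and NerData contains a sentence together with the entities found in it. NerOutputs holds a list of NerData and computes its size automatically when data is supplied.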

NerOutputs:
    data: List[NerData]
    size: int = 0

NerData:
    sentence: str
    entities: List[NerEntity]

NerEntity:
    word: str
    label: str
    start_idx: int
    end_idx: int
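
As an illustration of how these three structures fit together, here is a minimal sketch using standard-library dataclasses. It is not KODALI's actual implementation: the field names follow the listing above, while the `__post_init__` size calculation, the "LC" label, and the choice of an exclusive `end_idx` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NerEntity:
    word: str
    label: str
    start_idx: int
    end_idx: int


@dataclass
class NerData:
    sentence: str
    entities: List[NerEntity] = field(default_factory=list)


@dataclass
class NerOutputs:
    data: List[NerData] = field(default_factory=list)
    size: int = 0

    def __post_init__(self) -> None:
        # Assumption: "size" is the number of NerData items, derived from
        # the data itself rather than supplied by the caller.
        self.size = len(self.data)


# Illustrative values only; end_idx is treated as exclusive here, which is
# an assumption and may not match the library's convention.
entity = NerEntity(word="서울", label="LC", start_idx=0, end_idx=2)
sample = NerData(sentence="서울은 한국의 수도이다.", entities=[entity])
outputs = NerOutputs(data=[sample])
print(outputs.size)
```

Deriving `size` inside the container keeps it consistent with `data` at construction time, which matches the behaviour described for NerOutputs.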
