Korean Dataset Loading Interface

Project description

KODALI (Korean Dataset Loading Interface)

1. Supported datasets

Named entity recognition

  • 모두의 말뭉치 (Korean Corpus)
  • KLUE-NER (KLUE)

Relation extraction

  • AI Hub: Korean knowledge-based relation data (한국어 지식기반 관계 데이터)
  • KLUE-RE

2. How to use

Named entity recognition

dataset = Kodali(path={PATH}, task="ner", source={DATASET_SOURCE})

# Source List
# - 모두의 말뭉치 : "korean-corpus"
# - KLUE-NER : "klue"
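The call above takes three keyword arguments. As a minimal sketch of how they fit together, the snippet below builds them from a dataset name. The helper `build_ner_kwargs`, the `NER_SOURCES` mapping, and the path `data/klue-ner` are hypothetical names introduced for illustration only; they are not part of KODALI. Only the two source strings come from the list above.

```python
from typing import Dict

# Source identifiers taken from the "Source List" comment above.
NER_SOURCES: Dict[str, str] = {
    "모두의 말뭉치": "korean-corpus",
    "KLUE-NER": "klue",
}


def build_ner_kwargs(path: str, dataset_name: str) -> Dict[str, str]:
    """Hypothetical helper: assemble the keyword arguments for a NER load."""
    if dataset_name not in NER_SOURCES:
        raise ValueError(f"Unknown NER dataset: {dataset_name!r}")
    return {"path": path, "task": "ner", "source": NER_SOURCES[dataset_name]}


# "data/klue-ner" is a placeholder path, not a location the library defines.
kwargs = build_ner_kwargs("data/klue-ner", "KLUE-NER")
print(kwargs)

# With the package installed and Kodali imported, the call would then be:
# dataset = Kodali(**kwargs)
```

Looking the source string up in a mapping surfaces a misspelled dataset name as a `ValueError` before any file is read.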

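3. Data format

Named entity recognition

Named entity recognition datasets are exposed through three formats: NerOutputs, NerData, and NerEntity.
NerEntity holds the information for a single entity, and NerData holds a sentence together with the entities found in it. NerOutputs holds a list of NerData and computes the size of the data automatically when data is supplied.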

NerOutputs:
    data: List[NerData]
    size: int = 0

NerData:
    sentence: str
    entities: List[NerEntity]

NerEntity:
    word: str
    label: str
    start_idx: int
    end_idx: int
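The three structures above can be sketched as Python dataclasses. This is an illustration under stated assumptions, not KODALI's actual implementation: the use of `dataclass`, the `__post_init__` hook for computing `size`, the example label "LC", and the reading of `end_idx` as an exclusive offset are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NerEntity:
    word: str
    label: str
    start_idx: int
    end_idx: int


@dataclass
class NerData:
    sentence: str
    entities: List[NerEntity]


@dataclass
class NerOutputs:
    data: List[NerData]
    size: int = 0

    def __post_init__(self) -> None:
        # Assumption: size is derived from the supplied data, not set by hand.
        self.size = len(self.data)


sentence = "서울은 한국의 수도이다."
# Assumption: start_idx/end_idx are character offsets with end_idx exclusive.
entity = NerEntity(word="서울", label="LC", start_idx=0, end_idx=2)
outputs = NerOutputs(data=[NerData(sentence=sentence, entities=[entity])])

print(outputs.size)  # prints 1
print(sentence[entity.start_idx:entity.end_idx])  # prints 서울
```

Check the offset convention against the loaded data before slicing sentences with `start_idx` and `end_idx`, since the field list above does not say whether `end_idx` is inclusive.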

Download files

Source Distribution

kodali-0.3.0.tar.gz (5.4 kB)

Uploaded Source

Built Distribution

kodali-0.3.0-py3-none-any.whl (10.6 kB)

Uploaded Python 3

File details

Details for the file kodali-0.3.0.tar.gz.

File metadata

  • Download URL: kodali-0.3.0.tar.gz
  • Upload date:
  • Size: 5.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.6.1 CPython/3.9.14 Linux/6.2.0-26-generic

File hashes

Hashes for kodali-0.3.0.tar.gz
Algorithm    Hash digest
SHA256       1293dfe837c1744f0ad670009b562095ca77493d0d48b3280c6e9e3c909e0774
MD5          2c3b1725d370110c317171d61b54c841
BLAKE2b-256  b1b0147218c674f1f7e486b25303d5cd219bc6d2c967ae1d92319e3901a19a2c

File details

Details for the file kodali-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: kodali-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 10.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.6.1 CPython/3.9.14 Linux/6.2.0-26-generic

File hashes

Hashes for kodali-0.3.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       3940b4ad8567855969b998b4ba2f5d4960a89cfdc6f0b574a9f3709124a9abd8
MD5          1d4ad688d44fa2eaebe87788a0db559c
BLAKE2b-256  0cb369e39ee9cc08d8ead56661ce77166785b863b15b2c8a213ad931390dabd9
