Korean Dataset Loading Interface

KODALI (Korean Dataset Loading Interface)

1. Supported datasets

Named entity recognition

  • 모두의 말뭉치 (Korean Corpus)
  • KLUE-NER (KLUE)

Relation extraction

  • AI Hub: Korean knowledge-based relation data (한국어 지식기반 관계 데이터)
  • KLUE-RE

2. How to use

Named entity recognition

dataset = Kodali(path={PATH}, task="ner", source={DATASET_SOURCE})

# Accepted values for `source`:
# - 모두의 말뭉치 (Korean Corpus): "korean-corpus"
# - KLUE-NER: "klue"

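3. Data format

Named entity recognition

Named entity recognition datasets are exposed through three formats: NerOutputs, NerData, and NerEntity.
NerEntity holds the information for a single entity, and NerData holds a sentence together with the entities found in it. NerOutputs holds a list of NerData and computes the size of the data automatically when the data is supplied.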

NerOutputs:
    data: List[NerData]
    size: int = 0

NerData:
    sentence: str
    entities: List[NerEntity]

NerEntity:
    word: str
    label: str
    start_idx: int
    end_idx: int
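
The three formats above can be sketched as plain Python dataclasses. This is an illustrative reconstruction from the field listing, not the package's actual source: the use of `dataclass`, the `__post_init__` hook for computing `size`, the `"LOC"` label, and the treatment of `end_idx` as an exclusive index are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NerEntity:
    word: str
    label: str
    start_idx: int
    end_idx: int  # assumed exclusive here; the package may use an inclusive index


@dataclass
class NerData:
    sentence: str
    entities: List[NerEntity]


@dataclass
class NerOutputs:
    data: List[NerData]
    size: int = 0

    def __post_init__(self) -> None:
        # Derive size from the supplied data instead of trusting the caller.
        self.size = len(self.data)


# Hypothetical sample; the label name "LOC" is made up for illustration.
entity = NerEntity(word="서울", label="LOC", start_idx=0, end_idx=2)
sample = NerData(sentence="서울은 한국의 수도이다.", entities=[entity])
outputs = NerOutputs(data=[sample])

print(outputs.size)
print(sample.sentence[entity.start_idx:entity.end_idx])
```

Under the exclusive-index assumption, slicing the sentence with `start_idx` and `end_idx` recovers the entity's `word`, which is a quick way to sanity-check loaded data. If the package uses inclusive end indices, the slice would need `end_idx + 1` instead.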
