Korean Dataset Loading Interface

KODALI (Korean Dataset Loading Interface)

1. Supported datasets

Named entity recognition

  • 모두의 말뭉치 (Korean Corpus)
  • KLUE-NER (KLUE)

Relation extraction

  • AI Hub: 한국어 지식기반 관계 데이터 (Korean knowledge-based relation data)
  • KLUE-RE

2. How to use

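Named entity recognition

# PATH and DATASET_SOURCE are placeholders: the path to the dataset and one of the source strings listed below.
dataset = Kodali(path=PATH, task="ner", source=DATASET_SOURCE)

# Source list
# - 모두의 말뭉치 (Korean Corpus): "korean-corpus"
# - KLUE-NER: "klue"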
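
The source strings above can be kept in a small lookup so that a typo fails early instead of reaching the loader. This is only a sketch: `NER_SOURCES` and `resolve_ner_source` are hypothetical names used for illustration and are not part of KODALI.

```python
# Mapping from dataset name to the `source` string documented above.
# NER_SOURCES and resolve_ner_source are hypothetical helpers, not KODALI APIs.
NER_SOURCES = {
    "모두의 말뭉치": "korean-corpus",
    "KLUE-NER": "klue",
}


def resolve_ner_source(name: str) -> str:
    """Return the documented `source` string for a dataset name."""
    try:
        return NER_SOURCES[name]
    except KeyError:
        known = ", ".join(sorted(NER_SOURCES))
        raise ValueError(f"Unknown NER dataset {name!r}; expected one of: {known}") from None
```

The returned string would then be passed as the `source` argument in the call shown above.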

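3. Data format

Named entity recognition

The named entity recognition datasets support the NerOutputs, NerData, and NerEntity formats.
NerEntity holds the information for a single entity, and NerData contains a sentence together with the entities found in it. NerOutputs holds a list of NerData and automatically computes the size of the data when data is supplied.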

NerOutputs:
    data: List[NerData]
    size: int = 0

NerData:
    sentence: str
    entities: List[NerEntity]

NerEntity:
    word: str
    label: str
    start_idx: int
    end_idx: int
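
The three formats above could be modelled with Python dataclasses, as in the sketch below. It is not KODALI's actual implementation: the use of `dataclasses`, the `__post_init__` hook for computing `size`, the exclusive `end_idx`, and the sample sentence and label are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NerEntity:
    word: str       # surface form of the entity
    label: str      # entity type tag
    start_idx: int  # character offset where the entity starts
    end_idx: int    # character offset where it ends (assumed exclusive here)


@dataclass
class NerData:
    sentence: str
    entities: List[NerEntity] = field(default_factory=list)


@dataclass
class NerOutputs:
    data: List[NerData] = field(default_factory=list)
    size: int = 0

    def __post_init__(self) -> None:
        # Derive the size from the supplied data rather than trusting the caller.
        self.size = len(self.data)


# Illustrative sample; the sentence and the "LC" label are made up for this sketch.
entity = NerEntity(word="서울", label="LC", start_idx=0, end_idx=2)
sample = NerData(sentence="서울은 한국의 수도이다.", entities=[entity])
outputs = NerOutputs(data=[sample])
```

Under the exclusive-end assumption, `sample.sentence[entity.start_idx:entity.end_idx]` recovers `entity.word`, which is a convenient sanity check when loading a new source.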
