Korean Dataset Loading Interface
KODALI (Korean Dataset Loading Interface)
1. Supported datasets
Named entity recognition
- Modu Corpus (모두의 말뭉치, Korean Corpus)
- KLUE-NER (KLUE)
Relation extraction
- AI Hub: Korean knowledge-based relation data (한국어 지식기반 관계 데이터)
- KLUE-RE
2. How to use
Named entity recognition
dataset = Kodali(path={PATH}, task="ner", source={DATASET_SOURCE})
# Source List
# - Modu Corpus (모두의 말뭉치) : "korean-corpus"
# - KLUE-NER : "klue"
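3. Data format
Named entity recognition
Named entity recognition datasets are exposed through three formats: NerOutputs, NerData, and NerEntity.
NerEntity holds the information for a single entity, and NerData holds a sentence together with the entities found in it. NerOutputs holds a list of NerData and computes the size of the data automatically when the data is supplied.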
NerOutputs:
    data: List[NerData]
    size: int = 0
NerData:
    sentence: str
    entities: List[NerEntity]
NerEntity:
    word: str
    label: str
    start_idx: int
    end_idx: int
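To make the relationship between the three formats concrete, here is a minimal sketch that mirrors the fields listed above using plain Python dataclasses. It is not the library's own implementation: the use of dataclasses, the __post_init__ hook for computing size, the sample sentence, the "PS"/"LC" labels, and the reading of end_idx as an exclusive character offset are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NerEntity:
    # One entity mention inside a sentence.
    word: str
    label: str
    start_idx: int
    end_idx: int  # assumed here to be an exclusive character offset


@dataclass
class NerData:
    # A sentence plus the entities found in it.
    sentence: str
    entities: List[NerEntity] = field(default_factory=list)


@dataclass
class NerOutputs:
    # A collection of NerData; size is derived from the data.
    data: List[NerData] = field(default_factory=list)
    size: int = 0

    def __post_init__(self) -> None:
        # Compute size from the supplied data instead of trusting the caller.
        self.size = len(self.data)


# Hypothetical sample; labels and offsets are illustrative only.
sentence = "홍길동은 서울에 산다."
entities = [
    NerEntity(word="홍길동", label="PS", start_idx=0, end_idx=3),
    NerEntity(word="서울", label="LC", start_idx=5, end_idx=7),
]
outputs = NerOutputs(data=[NerData(sentence=sentence, entities=entities)])

for item in outputs.data:
    for ent in item.entities:
        # Under the exclusive-end assumption, slicing recovers the entity text.
        print(ent.label, item.sentence[ent.start_idx:ent.end_idx])
```

Deriving size in one place keeps it consistent with the data list. Whether the real classes treat end_idx as inclusive or exclusive should be checked against the loaded data before relying on the slicing shown here.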