Korean Easy Data Augmentation Package
KoEDA
Easy Data Augmentation for Korean
Prerequisites
- Python >= 3.6
Installation
This repository is tested on Python 3.6 - 3.9.
KoEDA can be installed using pip as follows:
$ pip install koeda
Quick Start
from koeda import EasyDataAugmentation
EDA = EasyDataAugmentation(
    morpheme_analyzer=None, alpha_sr=0.3, alpha_ri=0.3, alpha_rs=0.3, prob_rd=0.3
)
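text = "아버지가 방에 들어가신다"  # "Father goes into the room"
result = EDA(text)
print(result)
# Example output (augmentation is random, so results vary between runs):
# 아버지가 정실에 들어가신다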
Augmenters
- EasyDataAugmentation (EDA)
- RandomDeletion (RD)
- RandomInsertion (RI)
- SynonymReplacement (SR)
- RandomSwap (RS)
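To make the four basic operations concrete, here is a minimal sketch of each one on whitespace-separated tokens. It is not KoEDA's implementation: the function names, the toy `SYNONYMS` table, and whitespace tokenization are assumptions made for illustration, whereas KoEDA works with a morpheme analyzer and Korean WordNet.

```python
import random
from typing import Dict, List

# Hypothetical toy synonym table (assumption); KoEDA draws synonyms from Korean WordNet.
SYNONYMS: Dict[str, List[str]] = {"방에": ["정실에"]}


def synonym_replacement(tokens: List[str], n: int, rng: random.Random) -> List[str]:
    """SR: replace up to n tokens that have an entry in SYNONYMS."""
    out = list(tokens)
    candidates = [i for i, t in enumerate(out) if t in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insertion(tokens: List[str], n: int, rng: random.Random) -> List[str]:
    """RI: n times, insert a synonym of a random token at a random position."""
    out = list(tokens)
    pool = [t for t in tokens if t in SYNONYMS]
    if not pool:
        return out
    for _ in range(n):
        word = rng.choice(SYNONYMS[rng.choice(pool)])
        out.insert(rng.randint(0, len(out)), word)
    return out


def random_swap(tokens: List[str], n: int, rng: random.Random) -> List[str]:
    """RS: n times, swap two tokens at distinct random positions."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens: List[str], p: float, rng: random.Random) -> List[str]:
    """RD: drop each token with probability p, always keeping at least one."""
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() > p]
    return kept or [rng.choice(tokens)]


rng = random.Random(0)
tokens = "아버지가 방에 들어가신다".split()
print(" ".join(synonym_replacement(tokens, 1, rng)))
# 아버지가 정실에 들어가신다
```

EasyDataAugmentation (EDA) combines these four operations. The sketch keeps each one as a pure function over a token list so that they can be tested separately.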
Usage
- EDA class
# All arguments are optional; the values shown are the defaults.
# morpheme_analyzer is a str (or None); the other four arguments are floats.
EDA = EasyDataAugmentation(
    morpheme_analyzer=None,
    alpha_sr=0.1,
    alpha_ri=0.1,
    alpha_rs=0.1,
    prob_rd=0.1,
)
text = "아버지가방에들어가신다"
# EDA(data: Union[List[str], str], p: List[float] = None, repetition: int = 1)
result = EDA(data=text, p=None, repetition=1)
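As a rough guide to choosing the `alpha_*` values: in the reference code accompanying the original EDA paper, each alpha is the fraction of tokens to modify, with at least one edit per sentence. Whether KoEDA applies exactly the same rounding is an assumption here, and `edits_per_sentence` is a hypothetical helper rather than part of the package.

```python
def edits_per_sentence(alpha: float, n_tokens: int) -> int:
    """Number of SR/RI/RS edits for one sentence: max(1, floor(alpha * n_tokens)).

    Follows the convention of the original EDA reference code; assumed,
    not verified, to match KoEDA's behavior.
    """
    return max(1, int(alpha * n_tokens))


# Short sentences still receive one edit, even when alpha is small.
for n_tokens in (3, 8, 40):
    print(n_tokens, edits_per_sentence(0.25, n_tokens))
```

Under this convention, `prob_rd` differs from the three alphas: it is a per-token deletion probability, not a fraction converted into an edit count.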
- The others (RD, RI, SR, RS)
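# Augmenter stands for one of RandomDeletion, RandomInsertion, SynonymReplacement, or RandomSwap.
# Signature: Augmenter(morpheme_analyzer: str = None, stopword: bool = False)
augmenter = Augmenter(morpheme_analyzer=None, stopword=False)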
text = "아버지가방에들어가신다"
# augmenter(data: Union[List[str], str], p: float = 0.1, repetition: int = 1)
result = augmenter(data=text, p=0.5, repetition=1)
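A common use of these augmenters is to enlarge a labeled training set. The sketch below shows one way to do that with any callable that maps a sentence to an augmented sentence. `augment_dataset` is a hypothetical helper, not part of KoEDA, and the stand-in augmenter simply reverses word order so that the example runs without the package installed.

```python
from typing import Callable, Iterable, List, Tuple


def augment_dataset(
    pairs: Iterable[Tuple[str, str]],
    augment: Callable[[str], str],
    n_aug: int = 2,
) -> List[Tuple[str, str]]:
    """Return the original (text, label) pairs plus up to n_aug distinct augmented copies of each."""
    out: List[Tuple[str, str]] = []
    for text, label in pairs:
        out.append((text, label))
        seen = {text}
        for _ in range(n_aug):
            new_text = augment(text)
            if new_text not in seen:  # skip duplicates and no-op augmentations
                seen.add(new_text)
                out.append((new_text, label))
    return out


# Stand-in augmenter (assumption): reverses the word order of the sentence.
def reverse_words(sentence: str) -> str:
    return " ".join(reversed(sentence.split()))


data = [("아버지가 방에 들어가신다", "statement")]
for text, label in augment_dataset(data, reverse_words, n_aug=2):
    print(label, text)
```

With KoEDA installed, `augment` could be something like `lambda s: augmenter(data=s, p=0.5, repetition=1)`, assuming the call returns a single string for a single input string.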
Reference
Easy Data Augmentation Paper
Easy Data Augmentation Repository
Korean WordNet