Skip to main content

Kobart model on huggingface transformers

Project description

KoBart-Transformers

  • SKT에서 공개한 KoBart를 편리하게 사용할 수 있게 transformers로 포팅하였습니다.

Install (Optional)

pip install kobart-transformers

Tokenizer

  • PreTrainedTokenizerFast를 이용하여 구현되었습니다.
>>> from kobart_transformers import get_kobart_tokenizer
>>> kobart_tokenizer = get_kobart_tokenizer()
>>> kobart_tokenizer.tokenize("안녕하세요. 한국어 BART 입니다.🤣:)l^o")
['▁안녕하', '세요.', '▁한국어', '▁B', 'A', 'R', 'T', '▁입', '니다.', '🤣', ':)', 'l^o']

Model

  • BartModel을 이용하여 구현되었습니다.
  • BartModel.from_pretrained("hyunwoongko/kobart")와 동일합니다.
>>> from kobart_transformers import get_kobart_model, get_kobart_tokenizer
>>> # from transformers import BartModel

>>> kobart_tokenizer = get_kobart_tokenizer()
>>> model = get_kobart_model()
>>> # model = BartModel.from_pretrained("hyunwoongko/kobart")

>>> inputs = kobart_tokenizer(['안녕하세요.'], return_tensors='pt')
>>> model(inputs['input_ids'])
Seq2SeqModelOutput(last_hidden_state=tensor([[[-0.4488, -4.3651,  3.2349,  ...,  5.8916,  4.0497,  3.5468],
         [-0.4096, -4.6106,  2.7189,  ...,  6.1745,  2.9832,  3.0930]]],
       grad_fn=<TransposeBackward0>), past_key_values=None, decoder_hidden_states=None, decoder_attentions=None, cross_attentions=None, encoder_last_hidden_state=tensor([[[ 0.4624, -0.2475,  0.0902,  ...,  0.1127,  0.6529,  0.2203],
         [ 0.4538, -0.2948,  0.2556,  ..., -0.0442,  0.6858,  0.4372]]],
       grad_fn=<TransposeBackward0>), encoder_hidden_states=None, encoder_attentions=None)

Update Notes

  • 0.1 : pad 토큰이 설정되지 않은 에러를 해결하였습니다.
from kobart import get_kobart_tokenizer
kobart_tokenizer = get_kobart_tokenizer()
kobart_tokenizer("한국어 BART 모델을 소개합니다", truncation=True, padding=True)
{
'input_ids': [[28324, 3, 3, 3, 3], [15085, 264, 281, 283, 24224], [15630, 20357, 3, 3, 3]], 
'token_type_ids': [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]], 
'attention_mask': [[1, 0, 0, 0, 0], [1, 1, 1, 1, 1], [1, 1, 0, 0, 0]]
}

Reference

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

kobart_transformers-0.1-py3-none-any.whl (3.0 kB view details)

Uploaded Python 3

File details

Details for the file kobart_transformers-0.1-py3-none-any.whl.

File metadata

  • Download URL: kobart_transformers-0.1-py3-none-any.whl
  • Upload date:
  • Size: 3.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/51.0.0 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.7.8

File hashes

Hashes for kobart_transformers-0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 b142ff1a88d635db9b45fd5c7d500d99c1051362950397045940ff87ee34a0d7
MD5 16eb50dcc71c2d1164daf1131eda1cd0
BLAKE2b-256 7c9f442febadd198967afe60c9d924856ab1fc5fd8f1ced693b704cf603a277d

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page