Model hub for transformers.
Project description
Usage Sample
''''''''''''

.. code:: python

    from sklearn.model_selection import train_test_split
    import torch
    from transformers import BertTokenizer
    from nlpx.dataset import TextDataset, text_collate
    from nlpx.model.wrapper import ClassifyModelWrapper
    from transformers_model import (AutoCNNTextClassifier, AutoCNNTokenClassifier,
                                    BertDataset, BertCollator, BertTokenizeCollator)

    texts = [[str],]  # placeholder: the input texts
    labels = [0, 0, 1, 2, 1...]  # placeholder: one integer label per text
    pretrained_path = "clue/albert_chinese_tiny"
    classes = ['class1', 'class2', 'class3'...]  # placeholder: human-readable class names

    train_texts, test_texts, y_train, y_test = train_test_split(texts, labels, test_size=0.2)
    train_set = TextDataset(train_texts, y_train)
    test_set = TextDataset(test_texts, y_test)

    ################################### TextClassifier ##################################
    model = AutoCNNTextClassifier(pretrained_path, len(classes))
    wrapper = ClassifyModelWrapper(model, classes)
    _ = wrapper.train(train_set, test_set, collate_fn=text_collate)

    ################################### TokenClassifier #################################
    tokenizer = BertTokenizer.from_pretrained(pretrained_path)

    ##################### BertTokenizeCollator #########################
    model = AutoCNNTokenClassifier(pretrained_path, len(classes))
    wrapper = ClassifyModelWrapper(model, classes)
    _ = wrapper.train(train_set, test_set, collate_fn=BertTokenizeCollator(tokenizer, 256))

    ##################### BertCollator ##################################
    train_tokens = tokenizer.batch_encode_plus(
        train_texts,
        max_length=256,
        padding="max_length",
        truncation=True,
        return_attention_mask=True,
        return_token_type_ids=False,
        return_tensors="pt",
    )
    test_tokens = tokenizer.batch_encode_plus(
        test_texts,
        max_length=256,
        padding="max_length",
        truncation=True,
        return_attention_mask=True,
        return_token_type_ids=False,
        return_tensors="pt",
    )
    train_set = BertDataset(train_tokens, y_train)
    test_set = BertDataset(test_tokens, y_test)
    model = AutoCNNTokenClassifier(pretrained_path, len(classes))
    wrapper = ClassifyModelWrapper(model, classes)
    _ = wrapper.train(train_set, test_set, collate_fn=BertCollator())
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
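If the package is published on PyPI under the same name as the source distribution listed below (an assumption based on the filename), it can also be installed directly with ``pip install transformers-model`` rather than downloading the archive by hand.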
Source Distribution
File details
Details for the file transformers-model-0.2.1.tar.gz.
File metadata
- Download URL: transformers-model-0.2.1.tar.gz
- Upload date:
- Size: 8.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.9.18
File hashes
| Algorithm   | Hash digest                                                      |
|-------------|------------------------------------------------------------------|
| SHA256      | 259be6308fe9aedabeb720cd4707dce79e9702d24038fbf479c108d3baf4a045 |
| MD5         | afc23a6f1c28249f7039ce5914d3af13                                 |
| BLAKE2b-256 | a75000d3f4d14cebe262afe8392e16eade88cbfa79da6de119479fdf3d06faf9 |
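As a quick check after downloading, the SHA256 digest above can be recomputed locally with the Python standard library. This is a minimal sketch; it assumes the source distribution has been saved to the current directory under the filename shown in the metadata.

.. code:: python

    import hashlib

    # Assumed local path of the downloaded source distribution.
    filename = "transformers-model-0.2.1.tar.gz"
    expected_sha256 = "259be6308fe9aedabeb720cd4707dce79e9702d24038fbf479c108d3baf4a045"

    with open(filename, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()

    # The computed digest should match the published value exactly.
    assert digest == expected_sha256, f"hash mismatch: {digest}"
    print("SHA256 digest matches the published value.")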