Easy fine-tuning for BERT models
bert-for-sequence-classification
Pipeline for easy fine-tuning of BERT architecture for sequence classification
Quick Start
Installation
- Install the library
pip install bert-for-sequence-classification
- If you want to train your model on a GPU, install a PyTorch version compatible with your device.
To find the build that matches the CUDA version on your machine, check the
PyTorch website.
You can see which CUDA version your driver supports by running nvidia-smi
in a console or
!nvidia-smi
in a notebook cell.
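Once PyTorch is installed, a similar check can be made from Python. The following is a minimal sketch and not part of this library: the helper name `describe_torch_cuda` is made up for illustration, and it relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
def describe_torch_cuda():
    """Report which PyTorch build is installed and whether it can see a GPU."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {
            "torch_installed": False,
            "torch_version": None,
            "built_with_cuda": None,
            "cuda_available": False,
        }
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the installed build was compiled against (None for CPU-only builds).
        "built_with_cuda": torch.version.cuda,
        # True only if a usable GPU and driver are visible to PyTorch.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False on a machine with a GPU, the installed build is probably CPU-only or does not match the driver, and reinstalling PyTorch with the command from the PyTorch website is the usual fix.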
CLI Use
bert-clf-train --path_to_config <path to yaml file>
Example config file can be found here
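If training is launched from a Python script rather than a shell, the same command can be wrapped with `subprocess`. This is only a sketch: the helper names `build_train_command` and `run_training` are hypothetical, and the only thing taken from the library is the `bert-clf-train --path_to_config` command shown above.

```python
import subprocess
from pathlib import Path


def build_train_command(config_path):
    """Return the argument list for the bert-clf-train CLI."""
    return ["bert-clf-train", "--path_to_config", str(config_path)]


def run_training(config_path):
    """Fail early on a missing config, then run training.

    Raises subprocess.CalledProcessError if the CLI exits with a non-zero code.
    """
    config_path = Path(config_path)
    if not config_path.is_file():
        raise FileNotFoundError(f"Config file not found: {config_path}")
    subprocess.run(build_train_command(config_path), check=True)
```

Calling `run_training("config.yaml")` then behaves like the command line call, with the difference that a missing config file is reported before the CLI starts.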
Jupyter notebook
Example notebook can be found here
Inference mode
How you load a trained model for inference depends on how it was saved.
If path_to_state_dict in the config is set to false and you have the library installed, load the saved model directly:
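import pandas as pd
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The whole model object was pickled, so the library must be importable here.
# On PyTorch 2.6 and later, torch.load defaults to weights_only=True;
# pass weights_only=False to unpickle a full model from a trusted source.
model = torch.load(
    "path_to_saved_model", map_location=device
)
model.eval()  # switch off dropout for inference

df = pd.read_csv("path_to_some_df")
df["target_column"] = df["text_column"].apply(model.predict)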
Otherwise, rebuild the model and load the saved state dict:
import json

import pandas as pd
import torch
from bert_clf.src.models.BertCLF import BertCLF
from transformers import AutoModel, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Use the same pretrained checkpoint that was used for training.
tokenizer = AutoTokenizer.from_pretrained(
    pretrained_model_name_or_path="pretrained_model_name_or_path"
)
model_bert = AutoModel.from_pretrained(
    pretrained_model_name_or_path="pretrained_model_name_or_path"
).to(device)

# The mapper is saved with the state dict.
with open("path/to/saved/mapper") as mapper_file:
    id2label = json.load(mapper_file)

model = BertCLF(
    pretrained_model=model_bert,
    tokenizer=tokenizer,
    id2label=id2label,
    dropout="some number",  # placeholder: replace with the dropout value used in training
    device=device
)
model.load_state_dict(
    torch.load(
        "path_to_state_dict", map_location=device
    ),
    strict=False
)
model.eval()  # switch off dropout for inference

df = pd.read_csv("path_to_some_df")
df["target_column"] = df["text_column"].apply(model.predict)
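Both loading paths end by calling `model.predict` once per row. When a column contains many repeated texts, it may be worth predicting each distinct text only once. The helper below is a sketch under that assumption and is not part of the library: `predict_unique` is a hypothetical name, it accepts any callable that maps a single string to a prediction, and it assumes predictions are deterministic (which should hold after `model.eval()`).

```python
def predict_unique(texts, predict_fn):
    """Apply predict_fn once per distinct text and return predictions in input order."""
    cache = {}    # text -> prediction, filled on first occurrence
    results = []
    for text in texts:
        if text not in cache:
            cache[text] = predict_fn(text)
        results.append(cache[text])
    return results
```

With a loaded model this could be used as `df["target_column"] = predict_unique(df["text_column"].tolist(), model.predict)`. The result has the same length and order as the input, so it can be assigned straight back to the DataFrame.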