poutyne-transformers

Train 🤗-transformers models with Poutyne.

Installation

pip install poutyne-transformers

Example

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from datasets import load_dataset
from torch.utils.data import DataLoader
from torch import optim
from poutyne import Model, Accuracy
from poutyne_transformers import (
    TransformerCollator,
    model_loss,
    ModelWrapper,
    MetricWrapper,
)

print("Loading model & tokenizer.")
transformer = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-cased", num_labels=2, return_dict=True
)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")

print("Loading & preparing dataset.")
dataset = load_dataset("imdb")
dataset = dataset.map(
    lambda entry: tokenizer(
        entry["text"], add_special_tokens=True, padding="max_length", truncation=True
    ),
    batched=True,
)
dataset = dataset.remove_columns(["text"])
dataset = dataset.shuffle()
dataset.set_format("torch")

# Collate each batch for Poutyne, taking the targets from the "labels" key.
collate_fn = TransformerCollator(y_keys="labels")
train_dataloader = DataLoader(dataset["train"], batch_size=16, collate_fn=collate_fn)
test_dataloader = DataLoader(dataset["test"], batch_size=16, collate_fn=collate_fn)

print("Preparing training.")
wrapped_transformer = ModelWrapper(transformer)
optimizer = optim.AdamW(wrapped_transformer.parameters(), lr=5e-5)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Compute accuracy on the "logits" entry of the model output.
accuracy = MetricWrapper(Accuracy(), pred_key="logits")
model = Model(
    wrapped_transformer,
    optimizer,
    loss_function=model_loss,
    batch_metrics=[accuracy],
    device=device,
)

print("Starting training.")
model.fit_generator(train_dataloader, test_dataloader, epochs=1)
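
The data flow in the example above can be pictured with plain Python dictionaries in place of tensors. This is only a rough sketch under assumptions, not the poutyne-transformers implementation: every function below (`split_batch`, `fake_model`, `loss_from_outputs`, `wrap_metric`, `accuracy`) is a hypothetical stand-in. It assumes that the collator yields `(inputs, targets)` pairs with the labels still present in the inputs (🤗 models only compute a loss when they receive labels), that the loss function reads the loss from the model's output dictionary, and that the metric wrapper selects the `pred_key` entry before calling the metric.

```python
# Hypothetical stand-ins only; none of these names come from poutyne-transformers.

def split_batch(batch, y_key):
    """Return (inputs, targets); the targets stay in the inputs so the model can compute a loss."""
    return batch, batch[y_key]


def fake_model(inputs):
    """Pretend transformer: returns a dict with a fixed loss and per-class logits."""
    return {"loss": 0.5, "logits": [[0.2, 0.8], [0.9, 0.1]]}


def loss_from_outputs(outputs, y_true):
    """Loss function with a (prediction, target) signature that reuses the model's own loss."""
    return outputs["loss"]


def wrap_metric(metric, pred_key):
    """Adapt a (prediction, target) metric to a model that returns a dict."""
    def wrapped(outputs, y_true):
        return metric(outputs[pred_key], y_true)
    return wrapped


def accuracy(logits, labels):
    """Fraction of rows whose largest logit sits at the labelled class index."""
    predictions = [row.index(max(row)) for row in logits]
    correct = sum(int(p == t) for p, t in zip(predictions, labels))
    return correct / len(labels)


batch = {"input_ids": [[101, 7, 102], [101, 9, 102]], "labels": [1, 0]}
inputs, targets = split_batch(batch, "labels")
outputs = fake_model(inputs)
loss_value = loss_from_outputs(outputs, targets)
accuracy_value = wrap_metric(accuracy, "logits")(outputs, targets)
```

Keeping the adaptation in small wrappers like these means the training loop only ever sees a prediction and a target, which is the shape of interface the `Model` call in the example is given.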

You can also create models with a custom architecture using the torch.nn.Sequential class:

from torch import nn
from transformers import AutoModel
from poutyne import Lambda
from poutyne_transformers import ModelWrapper

...

transformer = AutoModel.from_pretrained(
    "distilbert-base-cased", output_hidden_states=True
)

custom_model = nn.Sequential(
    ModelWrapper(transformer),
    # Use DistilBERT's [CLS] token (the first position) for classification.
    Lambda(lambda outputs: outputs["last_hidden_state"][:, 0, :]),
    nn.Linear(in_features=transformer.config.hidden_size, out_features=1),
    Lambda(lambda out: out.reshape(-1)),
)

...
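
To check the tensor shapes that flow through such a custom head without downloading a pretrained model, the sketch below swaps in small hypothetical stand-ins: `FakeEncoder` plays the role of the wrapped transformer and `Select` plays the role of a Lambda layer. Both are assumptions made for illustration and are not part of poutyne or poutyne-transformers. The choice of `BCEWithLogitsLoss` is also an assumption; it is one loss that fits a single output per example.

```python
import torch
from torch import nn


class FakeEncoder(nn.Module):
    """Hypothetical stand-in for the wrapped transformer: returns a dict of outputs."""

    def __init__(self, vocab_size=100, hidden_size=8):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)

    def forward(self, inputs):
        # Shape: (batch, sequence_length, hidden_size)
        return {"last_hidden_state": self.embedding(inputs["input_ids"])}


class Select(nn.Module):
    """Hypothetical stand-in for a Lambda layer: applies a function in forward."""

    def __init__(self, func):
        super().__init__()
        self.func = func

    def forward(self, x):
        return self.func(x)


hidden_size = 8
head = nn.Sequential(
    FakeEncoder(hidden_size=hidden_size),
    # Keep only the first token's hidden state: (batch, hidden_size)
    Select(lambda outputs: outputs["last_hidden_state"][:, 0, :]),
    # One logit per example: (batch, 1)
    nn.Linear(in_features=hidden_size, out_features=1),
    # Flatten to (batch,)
    Select(lambda out: out.reshape(-1)),
)

batch = {"input_ids": torch.randint(0, 100, (4, 16))}
logits = head(batch)
targets = torch.tensor([0.0, 1.0, 1.0, 0.0])
loss = nn.BCEWithLogitsLoss()(logits, targets)
```

Because the final layer flattens the output to one value per example, the targets need to be floating-point tensors of shape `(batch,)` for a binary loss like the one assumed here.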
