PyTorch Lightning Transformers

Project description

Flexible interface for high-performance research using SOTA Transformers, leveraging PyTorch Lightning, Transformers, and Hydra.


  • What is Lightning Transformers
  • Using Lightning Transformers
  • Docs
  • Community
  • License


Installation

Option 1: from PyPI

pip install lightning-transformers
# instead of: `python train.py ...`, run with:
pl-transformers-train ...

Option 2: from source

git clone https://github.com/PyTorchLightning/lightning-transformers.git
cd lightning-transformers
python train.py ...
# the `pl-transformers-train` endpoint is also available!

What is Lightning-Transformers

Lightning Transformers offers a flexible interface for training and fine-tuning SOTA Transformer models using the PyTorch Lightning Trainer.

  • Train using HuggingFace Transformers models and datasets with Lightning custom Callbacks, Loggers, Accelerators and high performance scaling.
  • Seamless Memory and Speed Optimizations such as DeepSpeed ZeRO or FairScale Sharded Training with no code changes.
  • Powerful config composition backed by Hydra - Easily swap out models, optimizers, schedulers and many more configurations without touching the code.
  • Transformer Task Abstraction for Rapid Research & Experimentation - Built from the ground up to be task agnostic, the library supports creating transformer tasks across all modalities with little friction.

Lightning Transformers tasks allow you to train models using HuggingFace Transformers models and datasets, use Hydra to hot-swap models, optimizers or schedulers, and leverage all the advanced features that Lightning has to offer, including custom Callbacks, Loggers, Accelerators and high-performance scaling, with minimal changes.
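
To make the hot-swapping concrete, here is a minimal, library-independent sketch of the Hydra pattern the configs in the quick recipes below rely on: any component described by a `_target_` entry can be swapped from the command line and built with `hydra.utils.instantiate`, so changing the optimizer never touches the training code. The toy `torch.nn.Linear` model is only a stand-in for a real backbone.

import torch
from omegaconf import OmegaConf
from hydra.utils import instantiate

model = torch.nn.Linear(8, 2)  # stand-in for the Transformer backbone

# Two optimizer configs in the same shape as the `optimizer:` section of the composed config.
adamw_cfg = OmegaConf.create({"_target_": "torch.optim.AdamW", "lr": 5e-5, "weight_decay": 0.001})
rmsprop_cfg = OmegaConf.create({"_target_": "torch.optim.RMSprop", "lr": 5e-5})

# Same call site, different optimizer -- the swap happens purely in config.
optimizer = instantiate(rmsprop_cfg, params=model.parameters())
print(type(optimizer))  # <class 'torch.optim.RMSprop'>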

Using Lightning-Transformers

Grid is our platform for training models at scale on the cloud! Sign up here.

Task quick commands (each task also has a ready-to-run Grid link):

  • Language Modeling: python train.py task=nlp/language_modeling dataset=nlp/language_modeling/wikitext trainer.gpus=1 training.batch_size=8
  • Multiple Choice: python train.py task=nlp/multiple_choice dataset=nlp/multiple_choice/race trainer.gpus=1
  • Question Answering: python train.py task=nlp/question_answering dataset=nlp/question_answering/squad trainer.gpus=1
  • Summarization: python train.py task=nlp/summarization dataset=nlp/summarization/xsum trainer.gpus=1
  • Text Classification: python train.py task=nlp/text_classification dataset=nlp/text_classification/emotion trainer.gpus=1
  • Token Classification: python train.py task=nlp/token_classification dataset=nlp/token_classification/conll trainer.gpus=1
  • Translation: python train.py task=nlp/translation dataset=nlp/translation/wmt16 trainer.gpus=1

Quick recipes

Train bert-base-cased on the CARER emotion dataset using the Text Classification task.

python train.py \
    task=nlp/text_classification \
    dataset=nlp/text_classification/emotion

See the composed Hydra config used under the hood:
optimizer:
  _target_: torch.optim.AdamW
  lr: ${training.lr}
  weight_decay: 0.001
scheduler:
  _target_: transformers.get_linear_schedule_with_warmup
  num_training_steps: -1
  num_warmup_steps: 0.1
training:
  run_test_after_fit: true
  lr: 5.0e-05
  output_dir: .
  batch_size: 16
  num_workers: 16
trainer:
  _target_: pytorch_lightning.Trainer
  logger: true
  checkpoint_callback: true
  callbacks: null
  default_root_dir: null
  gradient_clip_val: 0.0
  process_position: 0
  num_nodes: 1
  num_processes: 1
  gpus: null
  auto_select_gpus: false
  tpu_cores: null
  log_gpu_memory: null
  progress_bar_refresh_rate: 1
  overfit_batches: 0.0
  track_grad_norm: -1
  check_val_every_n_epoch: 1
  fast_dev_run: false
  accumulate_grad_batches: 1
  max_epochs: 1
  min_epochs: 1
  max_steps: null
  min_steps: null
  limit_train_batches: 1.0
  limit_val_batches: 1.0
  limit_test_batches: 1.0
  val_check_interval: 1.0
  flush_logs_every_n_steps: 100
  log_every_n_steps: 50
  accelerator: null
  sync_batchnorm: false
  precision: 32
  weights_summary: top
  weights_save_path: null
  num_sanity_val_steps: 2
  truncated_bptt_steps: null
  resume_from_checkpoint: null
  profiler: null
  benchmark: false
  deterministic: false
  reload_dataloaders_every_epoch: false
  auto_lr_find: false
  replace_sampler_ddp: true
  terminate_on_nan: false
  auto_scale_batch_size: false
  prepare_data_per_node: true
  plugins: null
  amp_backend: native
  amp_level: O2
  move_metrics_to_cpu: false
task:
  _recursive_: false
  backbone: ${backbone}
  optimizer: ${optimizer}
  scheduler: ${scheduler}
  _target_: lightning_transformers.task.nlp.text_classification.TextClassificationTransformer
  downstream_model_type: transformers.AutoModelForSequenceClassification
dataset:
  cfg:
    batch_size: ${training.batch_size}
    num_workers: ${training.num_workers}
    dataset_name: emotion
    dataset_config_name: null
    train_file: null
    validation_file: null
    test_file: null
    train_val_split: null
    max_samples: null
    cache_dir: null
    padding: max_length
    truncation: only_first
    preprocessing_num_workers: 1
    load_from_cache_file: true
    max_length: 128
    limit_train_samples: null
    limit_val_samples: null
    limit_test_samples: null
  _target_: lightning_transformers.task.nlp.text_classification.TextClassificationDataModule
experiment_name: ${now:%Y-%m-%d}_${now:%H-%M-%S}
log: false
ignore_warnings: true
tokenizer:
  _target_: transformers.AutoTokenizer.from_pretrained
  pretrained_model_name_or_path: ${backbone.pretrained_model_name_or_path}
  use_fast: true
backbone:
  pretrained_model_name_or_path: bert-base-cased
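
As a side note, the `${...}` entries in this config are OmegaConf interpolations (OmegaConf is the config library Hydra builds on): `optimizer.lr` follows `training.lr`, and the tokenizer follows the backbone name. A small self-contained sketch of how they resolve, reproducing only a trimmed subset of the config above:

from omegaconf import OmegaConf

cfg = OmegaConf.create("""
training:
  lr: 5.0e-05
optimizer:
  _target_: torch.optim.AdamW
  lr: ${training.lr}
backbone:
  pretrained_model_name_or_path: bert-base-cased
tokenizer:
  pretrained_model_name_or_path: ${backbone.pretrained_model_name_or_path}
""")

print(cfg.optimizer.lr)                             # 5e-05, pulled from training.lr
print(cfg.tokenizer.pretrained_model_name_or_path)  # bert-base-cased, follows the backbone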

Swap the backbone to RoBERTa and the optimizer to RMSprop:

python train.py \
    task=nlp/text_classification \
    dataset=nlp/text_classification/emotion \
    backbone.pretrained_model_name_or_path=roberta-base \
    optimizer=rmsprop

See the changed Hydra config under the hood:
 optimizer:
-  _target_: torch.optim.AdamW
+  _target_: torch.optim.RMSprop
   lr: ${training.lr}
-  weight_decay: 0.001
 scheduler:
   _target_: transformers.get_linear_schedule_with_warmup
   num_training_steps: -1
....
tokenizer:
   pretrained_model_name_or_path: ${backbone.pretrained_model_name_or_path}
   use_fast: true
 backbone:
-  pretrained_model_name_or_path: bert-base-cased
+  pretrained_model_name_or_path: roberta-base

Enable Sharded Training.

python train.py \
    task=nlp/text_classification \
    dataset=nlp/text_classification/emotion \
    trainer=ddp \
    trainer/plugins=sharded

See the changed Hydra config under the hood. Without any code changes, the config is updated automatically for Sharded Training:
optimizer:
   _target_: torch.optim.AdamW
   lr: ${training.lr}
trainer:
   process_position: 0
   num_nodes: 1
   num_processes: 1
-  gpus: null
+  gpus: 1
   auto_select_gpus: false
   tpu_cores: null
   log_gpu_memory: null
   ...
   val_check_interval: 1.0
   flush_logs_every_n_steps: 100
   log_every_n_steps: 50
-  accelerator: null
+  accelerator: ddp
   sync_batchnorm: false
-  precision: 32
+  precision: 16
   weights_summary: top
   weights_save_path: null
   num_sanity_val_steps: 2
   ....
   terminate_on_nan: false
   auto_scale_batch_size: false
   prepare_data_per_node: true
-  plugins: null
+  plugins:
+    _target_: pytorch_lightning.plugins.DDPShardedPlugin
   amp_backend: native
   amp_level: O2
   move_metrics_to_cpu: false
tokenizer:
   pretrained_model_name_or_path: ${backbone.pretrained_model_name_or_path}
   use_fast: true
 backbone:
   pretrained_model_name_or_path: bert-base-cased
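
For reference, the Trainer settings this composed config requests correspond roughly to the following plain PyTorch Lightning setup (a sketch assuming a Lightning version contemporary with this release, roughly 1.2/1.3; it needs a CUDA device and the FairScale extra installed):

from pytorch_lightning import Trainer
from pytorch_lightning.plugins import DDPShardedPlugin

# Mirrors the overrides above: 1 GPU, DDP, fp16, FairScale sharded training.
trainer = Trainer(
    gpus=1,
    accelerator="ddp",
    precision=16,
    plugins=[DDPShardedPlugin()],
)
# trainer.fit(task, datamodule=dm)  # `task` and `dm` stand in for the config-built task and datamodule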

Enable DeepSpeed ZeRO Training.

python train.py \
    task=nlp/text_classification \
    dataset=nlp/text_classification/emotion \
    trainer=ddp \
    trainer/plugins=deepspeed

See the changed Hydra config under the hood. Without any code changes, the config is updated automatically for DeepSpeed:
optimizer:
   _target_: torch.optim.AdamW
   lr: ${training.lr}
trainer:
   process_position: 0
   num_nodes: 1
   num_processes: 1
-  gpus: null
+  gpus: 1
   auto_select_gpus: false
   tpu_cores: null
   log_gpu_memory: null
   ...
   val_check_interval: 1.0
   flush_logs_every_n_steps: 100
   log_every_n_steps: 50
-  accelerator: null
+  accelerator: ddp
   sync_batchnorm: false
-  precision: 32
+  precision: 16
   ...
-  plugins: null
+  plugins:
+    _target_: pytorch_lightning.plugins.DeepSpeedPlugin
+    stage: 2
+    cpu_offload: true
   amp_backend: native
   amp_level: O2
   move_metrics_to_cpu: false
...
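
The DeepSpeed case is analogous: the plugin named in the config maps onto Lightning's DeepSpeedPlugin with ZeRO stage 2 and CPU offload (again a sketch for Lightning roughly 1.2/1.3; it requires `pip install deepspeed` and a CUDA device):

from pytorch_lightning import Trainer
from pytorch_lightning.plugins import DeepSpeedPlugin

# Mirrors the overrides above: DeepSpeed ZeRO stage 2 with CPU offload, fp16, 1 GPU.
trainer = Trainer(
    gpus=1,
    accelerator="ddp",
    precision=16,
    plugins=[DeepSpeedPlugin(stage=2, cpu_offload=True)],
)
# trainer.fit(task, datamodule=dm)  # `task` and `dm` stand in for the config-built task and datamodule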

Train with a pre-trained t5-base backbone on the XSUM dataset using the Summarization task.

python train.py \
    task=nlp/summarization \
    dataset=nlp/summarization/xsum \
    backbone.pretrained_model_name_or_path=t5-base

Train with a pre-trained google/mt5-base backbone on the WMT16 dataset using the Translation task with 2 GPUs.

python train.py \
    task=nlp/translation \
    dataset=nlp/translation/wmt16 \
    backbone.pretrained_model_name_or_path=google/mt5-base \
    trainer.gpus=2

Custom Files & Datasets

You can train, validate and test Lightning Transformers tasks on your own data files, and you can extend datasets for custom processing and your own tasks.

How to train, validate and test on custom files

How to extend datasets

Custom Tasks

Extending the Language Modeling Task

Contribute

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

Community

For help or questions, join our huge community on Slack!

License

Please observe the Apache 2.0 license that is listed in this repository. In addition, the Lightning framework is Patent Pending.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

lightning-transformers-0.1.0.tar.gz (52.8 kB)

Uploaded Source

Built Distribution

lightning_transformers-0.1.0-py3-none-any.whl (99.5 kB)

Uploaded Python 3

File details

Details for the file lightning-transformers-0.1.0.tar.gz.

File metadata

  • Download URL: lightning-transformers-0.1.0.tar.gz
  • Upload date:
  • Size: 52.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.4

File hashes

Hashes for lightning-transformers-0.1.0.tar.gz:

  • SHA256: a2e9a57bd384cb8e58a15ae6cb959819402e89adda78a2bc8b082107e4412ee5
  • MD5: dfbbd8361af5a8c9409d964042be4ec8
  • BLAKE2b-256: df5b7b54b4995593f604f748041dfc762bad05847e5201ed02dff38cf1139870

See more details on using hashes here.

File details

Details for the file lightning_transformers-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: lightning_transformers-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 99.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.4

File hashes

Hashes for lightning_transformers-0.1.0-py3-none-any.whl:

  • SHA256: e54ef32248b19db7e55c4161e169e8e2616532ca8807acf963085fb7054f44be
  • MD5: 45f72337fe2338abcd0e1f249e440b64
  • BLAKE2b-256: 0dc35669ad7d365978be267985d5e11232aa585f1a032bae9b4fdaf4ddd5f139

See more details on using hashes here.
