
FastSeq

Efficient, high-performance implementations of popular sequence models.

Introduction

FastSeq provides efficient, high-performance implementations of popular sequence models for text generation, summarization, and translation tasks. It can automatically optimize the performance of popular NLP toolkits (e.g., fairseq and HuggingFace-Transformers) simply by importing fastseq.

Supported Models

Supported models in fairseq

Supported models in HuggingFace-Transformers

Benchmarks

ProphetNet

  • CNN/DailyMail validation data, NVIDIA V100 (16GB)

    BatchSize             32             64             128
    prophetnet            2.7 samples/s  3.1 samples/s  OOM
    prophetnet + fastseq  5.5 samples/s  8.4 samples/s  10.3 samples/s

with the following settings:

$ fastseq-generate-for-fairseq \
      cnn_dm_bert.1k/len-1024.bin \
      --path prophetnet/model.pt \
      --fp16 \
      --task translation_prophetnet \
      --batch-size BATCH_SIZE \
      --beam 4 \
      --num-workers 4 \
      --min-len 55 \
      --max-len-b 140 \
      --no-repeat-ngram-size 3 \
      --lenpen 2.0 \
      --remove-bpe \
      --gen-subset valid

BART from Fairseq

  • CNN/DailyMail validation data, NVIDIA V100 (16GB)

    BatchSize        32             64              128
    fairseq-0.9.0    2.7 samples/s  OOM             OOM
    above + fastseq  9.0 samples/s  12.5 samples/s  14.5 samples/s

with the following settings:

$ fastseq-generate-for-fairseq \
      cnn_dm.1k/len-1024.bin \
      --path bart.large.cnn/model.pt \
      --fp16 \
      --task translation \
      --batch-size BATCH_SIZE \
      --gen-subset valid \
      --truncate-source  \
      --bpe gpt2 \
      --beam 4 \
      --num-workers 4 \
      --min-len 55 \
      --max-len-b 140 \
      --no-repeat-ngram-size 3 \
      --lenpen 2.0

To get the baseline fairseq speed numbers, replace fastseq-generate-for-fairseq with fairseq-generate.

BART from Transformers

  • CNN/DailyMail validation data, NVIDIA V100 (16GB)

    BatchSize            32             64             128
    transformers-3.0.2   3.4 samples/s  OOM            OOM
    above + fastseq      5.2 samples/s  6.2 samples/s  6.4 samples/s
    transformers-2.11.0  2.5 samples/s  OOM            OOM
    above + fastseq      4.4 samples/s  5.3 samples/s  >5.3 samples/s

(The numbers for transformers-2.11.0 need to be updated based on the Docker environment.)

with the following settings:

$ fastseq-generate-for-transformers \
    facebook/bart-large-cnn \
    cnn_dm.1k/val.source \
    out.summary \
    --reference_path cnn_dm/val.target \
    --device cuda \
    --bs 128 \
    --fp16 \
    --score_path out.score \
    --task summarization

To get the baseline transformers speed numbers, either add the option --without_fastseq_opt or use the tool provided in the Transformers GitHub repository.

WMT from Fairseq

  • WMT16 En-De model

    BatchSize        256            512            1024
    fairseq-0.9.0    84 samples/s   OOM            OOM
    above + fastseq  129 samples/s  131 samples/s  135 samples/s

with the following settings:

$ fastseq-generate-for-fairseq \
      wmt14.en-fr.joined-dict.newstest2014/ \
      --path wmt14.en-fr.joined-dict.transformer/model.pt \
      --beam 4 \
      --lenpen 0.6 \
      --remove-bpe \
      --batch-size BATCH_SIZE

To get the baseline fairseq speed numbers, replace fastseq-generate-for-fairseq with fairseq-generate.
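
The numbers above come from the fastseq-generate command-line tools. For a rough samples/s sanity check from Python, generation can also be timed directly; a minimal sketch (the batch size and input below are illustrative):

import time

# import fastseq before loading the model so its optimizations are applied
import fastseq
import torch

bart = torch.hub.load('pytorch/fairseq', 'bart.large.cnn')
bart.cuda()  # use GPU
bart.eval()  # disable dropout for evaluation
bart.half()  # use FP16

# an illustrative batch of 32 identical inputs
slines = ['FastSeq provides efficient implementations of the popular sequence models.'] * 32

start = time.perf_counter()
bart.sample(slines, beam=4, lenpen=2.0, max_len_b=140,
            min_len=55, no_repeat_ngram_size=3)
elapsed = time.perf_counter() - start
print(f'{len(slines) / elapsed:.1f} samples/s')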

Installation

Requirements

FastSeq works on top of fairseq and/or transformers: if you use only one of them, you only need to install that one; if you use both, install both.
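
To check which toolkit versions are installed, you can query them from Python (a quick sketch; import only the packages you have installed):

import fairseq
import transformers

print('fairseq:', fairseq.__version__)
print('transformers:', transformers.__version__)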

Python package

The fastseq Python package can be installed directly with pip:

$ pip install fastseq

Install from source

$ git clone https://github.com/microsoft/fastseq
$ cd fastseq
$ pip install --editable ./

Usage

Example

Only one line of code change is needed to use the optimizations provided by FastSeq: import fastseq at the beginning of your program, before using fairseq or transformers.

# import fastseq at the beginning of your program
import fastseq
import torch

# Download bart.large.cnn
bart = torch.hub.load('pytorch/fairseq', 'bart.large.cnn')

bart.cuda()  # use GPU
bart.eval()  # disable dropout for evaluation
bart.half()  # use FP16 for faster inference

slines = ['FastSeq provides efficient implementations of the popular sequence models. Please visit https://github.com/microsoft/fastseq for more details.']

hypotheses = bart.sample(
    slines, beam=4, lenpen=2.0, max_len_b=140, min_len=55, no_repeat_ngram_size=3)

print(hypotheses)
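
The same one-line change applies to transformers models. A minimal sketch, assuming transformers is installed (the checkpoint and generation parameters below are illustrative and mirror the fairseq example above):

# import fastseq at the beginning of your program
import fastseq
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained('facebook/bart-large-cnn')
model = BartForConditionalGeneration.from_pretrained('facebook/bart-large-cnn')
model.cuda()  # use GPU
model.eval()  # disable dropout for evaluation
model.half()  # use FP16

text = ('FastSeq provides efficient implementations of the popular sequence '
        'models. Please visit https://github.com/microsoft/fastseq for more details.')
input_ids = tokenizer.encode(text, return_tensors='pt').cuda()

summary_ids = model.generate(
    input_ids, num_beams=4, length_penalty=2.0, max_length=140,
    min_length=55, no_repeat_ngram_size=3)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))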

Command line tool for fairseq models

Example

$ fastseq-generate-for-fairseq \
    cnn_dm/bin \
    --path bart.large.cnn/model.pt \
    --fp16 \
    --task translation \
    --batch-size 128 \
    --gen-subset valid \
    --truncate-source  \
    --bpe gpt2 \
    --beam 4 \
    --num-workers 4 \
    --min-len 55 \
    --max-len-b 140 \
    --no-repeat-ngram-size 3 \
    --lenpen 2.0

Command line tool for transformers models

Example

$ fastseq-generate-for-transformers \
    facebook/bart-large-cnn \
    cnn_dm/val.source \
    out.summary \
    --reference_path cnn_dm/val.target \
    --device cuda \
    --bs 128 \
    --fp16 \
    --score_path out.score \
    --task summarization

Run tests

# run a single test.
$ python tests/optimizer/fairseq/test_fairseq_optimizer.py

# run benchmark.
$ python tests/optimizer/fairseq/benchmark_fairseq_optimizer.py

# run all the tests.
$ python -m unittest discover -s tests/ -p '*.py'

# run all the benchmarks.
$ cd benchmarks && bash run_all_benchmarks.sh

Build

# build package
$ python setup.py sdist bdist_wheel

Code Style

Python coding style

Changes to Python code should conform to PEP 8. Use yapf to help format the code and pylint to check your Python changes.

# format the code by yapf
$ yapf --style pep8 -i -r PYTHON_FILE/PACKAGE

# run pylint check
$ pylint --rcfile=.pylintrc  PYTHON_FILE/PACKAGE

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

