
Accelerated NLP pipelines using Transformers, Optimum and ONNX Runtime

Project description

Optimum Transformers


Accelerated NLP pipelines for fast inference 🚀 on CPU and GPU. Built with 🤗Transformers, Optimum and ONNX Runtime.

Installation:

With PyPI:

pip install optimum-transformers

Or directly from GitHub:

pip install git+https://github.com/AlekseyKorshuk/optimum-transformers

Usage:

The pipeline API is similar to the transformers pipeline, with just a few differences, which are explained below.

Just provide the path/URL to the model, and it will download the model from the hub if needed, automatically create the ONNX graph, and run inference.

from optimum_transformers import pipeline

# Initialize a pipeline by passing the task name and
# setting use_onnx to True (True is also the default)
nlp = pipeline("sentiment-analysis", use_onnx=True)
nlp("Transformers and onnx runtime is an awesome combo!")
# [{'label': 'POSITIVE', 'score': 0.999721109867096}]  

Or provide a different model using the model argument.

from optimum_transformers import pipeline

nlp = pipeline("question-answering", model="deepset/roberta-base-squad2", use_onnx=True)
nlp(question="What is ONNX Runtime?",
    context="ONNX Runtime is a highly performant single inference engine for multiple platforms and hardware")
# {'answer': 'highly performant single inference engine for multiple platforms and hardware', 'end': 94,
# 'score': 0.751201868057251, 'start': 18}

The same works for other tasks, for example token classification (NER):

from optimum_transformers import pipeline

nlp = pipeline("ner", model="mys/electra-base-turkish-cased-ner", use_onnx=True, optimize=True,
                    grouped_entities=True)
nlp("adana kebap ülkemizin önemli lezzetlerinden biridir.")
# [{'entity_group': 'B-food', 'score': 0.869149774312973, 'word': 'adana kebap'}]

Set use_onnx to False for standard PyTorch inference. Set optimize to True to quantize the model with ONNX (this requires use_onnx=True).
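
For example, a minimal sketch contrasting the two modes on the same task (the speed-up from quantization depends on your hardware and model; the example text is arbitrary):

from optimum_transformers import pipeline

# Standard PyTorch inference, no ONNX export:
torch_nlp = pipeline("sentiment-analysis", use_onnx=False)

# Quantized ONNX inference (optimize=True requires use_onnx=True):
quantized_nlp = pipeline("sentiment-analysis", use_onnx=True, optimize=True)

torch_nlp("A quick sanity check.")
quantized_nlp("A quick sanity check.")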

Supported pipelines

You can create Pipeline objects for the following downstream tasks:

  • feature-extraction: Generates a tensor representation for the input sequence.
  • ner and token-classification: Generates a named-entity mapping for each word in the input sequence.
  • sentiment-analysis: Gives the polarity (positive / negative) of the whole input sequence. Can be used for any text classification model.
  • question-answering: Given some context and a question referring to the context, it extracts the answer to the question from the context.
  • text-classification: Classifies sequences according to the set of classes defined at training time.
  • zero-shot-classification: Classifies sequences against an arbitrary set of labels provided at runtime (see the sketch after this list).
  • fill-mask: Masks tokens in a sequence with a masking token and prompts the model to fill that mask with an appropriate token.
  • text-generation: Generates text that continues the provided prompt.
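
As an illustration of the zero-shot case, a minimal sketch, assuming the call signature mirrors the 🤗Transformers zero-shot pipeline (candidate_labels passed at call time; the output shown in the comment is illustrative):

from optimum_transformers import pipeline

# Labels are not fixed at training time; they are supplied per call.
nlp = pipeline("zero-shot-classification", use_onnx=True)
nlp("I would like to book a flight to Paris.",
    candidate_labels=["travel", "cooking", "finance"])
# Expected result shape (scores illustrative):
# {'sequence': '...', 'labels': ['travel', ...], 'scores': [0.9, ...]}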

Calling the pipeline for the first time loads the model, creates the ONNX graph, and caches it for future use. Because of this, the first load takes some time. Subsequent calls with the same model load the ONNX graph automatically from the cache.
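
A rough way to observe the caching effect (a sketch; absolute times depend on your hardware, and the cache location is internal to the library):

import time

from optimum_transformers import pipeline

start = time.perf_counter()
nlp = pipeline("sentiment-analysis", use_onnx=True)  # first load: downloads the model and exports the ONNX graph
print(f"first load:  {time.perf_counter() - start:.1f}s")

start = time.perf_counter()
nlp = pipeline("sentiment-analysis", use_onnx=True)  # later loads: reuse the cached ONNX graph
print(f"cached load: {time.perf_counter() - start:.1f}s")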

Benchmarks

Note: For some reason, ONNX is slow in Colab notebooks, so you won't notice any speed-up there. Benchmark on your own hardware instead.

Check out our benchmarking example: example.

For detailed benchmarks and other information, refer to this blog post and notebook.

Note: These results were collected on my local machine, so if you have a high-performance machine to benchmark on, please contact me.
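
For a quick number on your own machine, a minimal timing loop along these lines is enough (a sketch; mean_latency is a hypothetical helper, and for rigorous measurements use the benchmarking example above):

import time

from optimum_transformers import pipeline

def mean_latency(nlp, text, runs=100):
    nlp(text)  # warm-up: excludes one-time model loading and ONNX export
    start = time.perf_counter()
    for _ in range(runs):
        nlp(text)
    return (time.perf_counter() - start) / runs

text = "Transformers and ONNX Runtime is an awesome combo!"
torch_nlp = pipeline("sentiment-analysis", use_onnx=False)
onnx_nlp = pipeline("sentiment-analysis", use_onnx=True)

print(f"torch: {mean_latency(torch_nlp, text) * 1000:.2f} ms/call")
print(f"onnx:  {mean_latency(onnx_nlp, text) * 1000:.2f} ms/call")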

[Benchmark plots: sentiment-analysis, zero-shot-classification, token-classification, question-answering, and fill-mask pipelines]

About

Built by Aleksey Korshuk


🚀 If you want to contribute to this project OR create something cool together — contact me: link

Star this repository on GitHub!



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

optimum_transformers-0.2.1.tar.gz (55.4 kB)

Uploaded: Source

Built Distribution

optimum_transformers-0.2.1-py3-none-any.whl (59.8 kB)

Uploaded: Python 3

File details

Details for the file optimum_transformers-0.2.1.tar.gz.

File metadata

  • Download URL: optimum_transformers-0.2.1.tar.gz
  • Upload date:
  • Size: 55.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.9.12

File hashes

Hashes for optimum_transformers-0.2.1.tar.gz
  • SHA256: e5e194da3c8770f35f9e173d7285f4727b88ec5af65a6e584a56f2b80ff1c224
  • MD5: 1c3c476ee5b1a3d7142eb65b45bc29f2
  • BLAKE2b-256: 57ba65cd0927da236db40e5a8b33a006add8c78a0111668f711e10b7def73395

See more details on using hashes here.
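
For instance, you can check a downloaded sdist against the SHA256 digest above with Python's standard hashlib (a sketch; assumes the file is in the current directory):

import hashlib

EXPECTED = "e5e194da3c8770f35f9e173d7285f4727b88ec5af65a6e584a56f2b80ff1c224"

# Hash the downloaded archive and compare against the published digest.
with open("optimum_transformers-0.2.1.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()
print("OK" if digest == EXPECTED else "hash mismatch!")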

File details

Details for the file optimum_transformers-0.2.1-py3-none-any.whl.

File metadata

File hashes

Hashes for optimum_transformers-0.2.1-py3-none-any.whl
  • SHA256: 44615687e8b5110f3090bf9888de25066d4436f14577913f79bc2073ef5c78a9
  • MD5: 90d70539ccd6528ea1756fc7792df826
  • BLAKE2b-256: c3c9e87f5134b3dd6cc7c7216055b3f62d374bb0f8d83dbbc965c4bdedb1b260

See more details on using hashes here.
