This repository contains code to run sentence-transformers faster. Simply put: faster sentence-transformers.
Fast Sentence Transformers
This repository contains code to run faster feature extractors using tools like quantization, optimization and ONNX. Just run your model much faster, while using less memory. There is not much to it!
Philipp Schmid: "We successfully quantized our vanilla Transformers model with Hugging Face and managed to accelerate our model latency from 25.6ms to 12.3ms or 2.09x while keeping 100% of the accuracy on the stsb dataset. But I have to say that this isn't a plug and play process you can transfer to any Transformers model, task or dataset."
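To give a feel for what quantization does, here is a minimal sketch of symmetric int8 quantization in plain Python. It is an illustration of the general idea only, not the code this package or ONNX Runtime runs; the function names and the example weights are made up. Each float is mapped to an 8-bit integer code plus one shared scale, so a value takes one byte instead of the four bytes of a float32.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: floats -> integer codes in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One shared scale maps the largest magnitude onto 127.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.82, -1.27, 0.05, 0.4, -0.33]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# The round-trip error per value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes)
print(max_error)
```

The small round-trip error is why accuracy can stay close to the original, although, as the quote notes, that is not guaranteed for every model, task or dataset.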
Install
pip install fast-sentence-transformers
Or, for GPU support:
pip install fast-sentence-transformers[gpu]
Quickstart
from fast_sentence_transformers import FastSentenceTransformer as SentenceTransformer
# use any sentence-transformer
encoder = SentenceTransformer("all-MiniLM-L6-v2", device="cpu")
encoder.encode("Hello hello, hey, hello hello")
encoder.encode(["Life is too short to eat bad food!"] * 2)
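A common next step is to compare two embeddings. The sketch below computes cosine similarity in plain Python. It assumes each embedding is a flat sequence of floats, as sentence-transformers returns for a single sentence; the vectors used here are made-up stand-ins rather than real model output, and `cosine_similarity` is a helper written for this example, not part of the package.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Stand-in vectors; in practice these would come from encoder.encode(...).
emb_a = [0.1, 0.3, 0.5]
emb_b = [0.2, 0.6, 1.0]   # same direction as emb_a
emb_c = [0.5, -0.1, 0.0]  # different direction

sim_ab = cosine_similarity(emb_a, emb_b)
sim_ac = cosine_similarity(emb_a, emb_c)
print(sim_ab, sim_ac)
```

Vectors that point in the same direction score close to 1, so `sim_ab` is higher than `sim_ac`.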
Benchmark
Non-exact, indicative benchmark of speed and memory usage for a smaller and a larger sentence-transformers model.
model | Type | default | ONNX | ONNX+quantized | ONNX+GPU |
---|---|---|---|---|---|
paraphrase-albert-small-v2 | memory | 1x | 1x | 1x | 1x |
paraphrase-albert-small-v2 | speed | 1x | 2x | 5x | 20x |
paraphrase-multilingual-mpnet-base-v2 | memory | 1x | 1x | 4x | 4x |
paraphrase-multilingual-mpnet-base-v2 | speed | 1x | 2x | 5x | 20x |
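Relative speed figures like the ones above can be reproduced roughly on your own hardware. The following is a minimal timing sketch, not the script used to produce the table; `time_encoder`, `relative_speed` and the stand-in encoder are hypothetical names introduced for this example. Pass the `encode` method of a default and a fast encoder to compare them.

```python
import time


def time_encoder(encode, sentences, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        encode(sentences)
        best = min(best, time.perf_counter() - start)
    return best


def relative_speed(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


def stand_in_encode(sentences):
    """Placeholder for a real encoder's encode method (assumption)."""
    return [[float(len(s))] for s in sentences]


sentences = ["Life is too short to eat bad food!"] * 2
elapsed = time_encoder(stand_in_encode, sentences, repeats=3)
print(f"best of 3: {elapsed:.6f}s")
```

Taking the best of several runs reduces noise from warm-up and other processes, but the result remains indicative rather than exact.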
Shout-Out
This package heavily leans on https://www.philschmid.de/optimize-sentence-transformers.