Connecting Transformers on the Hugging Face Hub with CTranslate2.
Project description
hf_hub_ctranslate2
Connecting Transformers on the Hugging Face Hub with CTranslate2 - a small utility for keeping tokenizer and model together on the Hugging Face Hub.
Usage:
PYPI Install
pip install hf-hub-ctranslate2
Decoder-only Transformer:
# download ctranslate2.Generator repos from the Hugging Face Hub (GPT-J, ..)
from hf_hub_ctranslate2 import GeneratorCT2fromHfHub

model_name_1 = "michaelfeil/ct2fast-pythia-160m"
model = GeneratorCT2fromHfHub(
    # load in int8 on CPU
    model_name_or_path=model_name_1, device="cpu", compute_type="int8"
)
outputs = model.generate(
    text=["How do you call a fast Flan-ingo?", "User: How are you doing?"]
    # add arguments specific to ctranslate2.Generator here
)
Encoder-Decoder:
from hf_hub_ctranslate2 import TranslatorCT2fromHfHub

# download ctranslate2.Translator repos from the Hugging Face Hub (T5, ..)
model_name_2 = "michaelfeil/ct2fast-flan-alpaca-base"
model = TranslatorCT2fromHfHub(
    # load in int8_float16 on CUDA
    model_name_or_path=model_name_2, device="cuda", compute_type="int8_float16"
)
outputs = model.generate(
    text=["How do you call a fast Flan-ingo?", "Translate to german: How are you doing?"],
    # pass arguments specific to ctranslate2.Translator below:
    min_decoding_length=8,
    max_decoding_length=16,
    max_input_length=512,
    beam_size=3,
)
print(outputs)
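The examples above use `compute_type="int8"` on CPU and `"int8_float16"` on CUDA. If you switch devices often, a small helper (hypothetical, not part of this library) keeps that choice in one place:

```python
def pick_compute_type(device: str) -> str:
    # int8 weight quantization on CPU; int8 weights with float16
    # activations on CUDA, matching the examples in this README
    return "int8" if device == "cpu" else "int8_float16"

print(pick_compute_type("cpu"))   # int8
print(pick_compute_type("cuda"))  # int8_float16
```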
Encoder-Decoder for multilingual translations (m2m-100):
from transformers import AutoTokenizer

from hf_hub_ctranslate2 import MultiLingualTranslatorCT2fromHfHub

model = MultiLingualTranslatorCT2fromHfHub(
    model_name_or_path="michaelfeil/ct2fast-m2m100_418M", device="cpu", compute_type="int8",
    tokenizer=AutoTokenizer.from_pretrained("facebook/m2m100_418M")
)
outputs = model.generate(
    ["How do you call a fast Flamingo?", "Wie geht es dir?"],
    src_lang=["en", "de"],
    tgt_lang=["de", "fr"]
)
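Note that `src_lang` and `tgt_lang` are aligned per item with the input texts: the first sentence is translated en→de, the second de→fr. A plain-Python sketch of that pairing (illustration only; the library handles this internally):

```python
texts = ["How do you call a fast Flamingo?", "Wie geht es dir?"]
src_lang = ["en", "de"]
tgt_lang = ["de", "fr"]

# each text travels with its own source/target language pair
jobs = list(zip(texts, src_lang, tgt_lang))
for text, src, tgt in jobs:
    print(f"{src} -> {tgt}: {text}")
```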
Encoder-only Sentence Transformers
Feel free to try out a newer repo that uses CTranslate2 for vector embeddings: https://github.com/michaelfeil/infinity
from hf_hub_ctranslate2 import CT2SentenceTransformer

model_name_pytorch = "intfloat/e5-small"
model = CT2SentenceTransformer(
    model_name_pytorch, compute_type="int8", device="cuda"
)
embeddings = model.encode(
    ["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
    batch_size=32,
    convert_to_numpy=True,
    normalize_embeddings=True,
)
print(embeddings.shape, embeddings)
scores = (embeddings @ embeddings.T) * 100
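Because `normalize_embeddings=True` returns unit-length rows, `embeddings @ embeddings.T` is the pairwise cosine similarity, and `* 100` merely rescales it. A toy NumPy sketch with made-up vectors standing in for real embeddings:

```python
import numpy as np

# stand-in for model.encode(...) output: three 2-d vectors
emb = np.array([[3.0, 4.0], [4.0, 3.0], [0.0, 5.0]])
# normalize rows to unit length, as normalize_embeddings=True would
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)

scores = (emb @ emb.T) * 100  # pairwise cosine similarity, scaled
print(scores.round(1))  # self-similarity on the diagonal is exactly 100
```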
Encoder-only (no longer recommended)
from hf_hub_ctranslate2 import EncoderCT2fromHfHub

model_name = "michaelfeil/ct2fast-e5-small"
model = EncoderCT2fromHfHub(
    # load in int8_float16 on CUDA
    model_name_or_path=model_name,
    device="cuda",
    compute_type="int8_float16",
)
outputs = model.generate(
    text=["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
    max_length=64,
)
File details
Details for the file hf_hub_ctranslate2-2.13.1.tar.gz.
File metadata
- Download URL: hf_hub_ctranslate2-2.13.1.tar.gz
- Upload date:
- Size: 10.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.0 CPython/3.12.4
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | e3d41d859fdea39a745a1ae4f754f24c124d9d96b0dd415317ee71141042faee |
| MD5 | 12a6c7c6e5dd7b0162ebb5c06c700dcf |
| BLAKE2b-256 | b9bb2f1850ca94b95dee5bb0221d85d93c99196a8452e40b354b0d9562ecb4ef |
File details
Details for the file hf_hub_ctranslate2-2.13.1-py3-none-any.whl.
File metadata
- Download URL: hf_hub_ctranslate2-2.13.1-py3-none-any.whl
- Upload date:
- Size: 10.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.0 CPython/3.12.4
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 389ed6102f18d3a84cb466bddd07e1e5e532bc91ddc47d3639c9eb11cdd43a86 |
| MD5 | ac75293a5a9978d779168647b8342c78 |
| BLAKE2b-256 | a73c24980d9f448fc04e7f6371499d6cf29919238c7566f34000604074351e89 |