Connecting Transformers on the Hugging Face Hub with CTranslate2
hf_hub_ctranslate2
Connecting Transformers on the Hugging Face Hub with CTranslate2 - a small utility for keeping tokenizer and model together on the Hugging Face Hub.
Usage:
Install from PyPI:

```shell
pip install hf-hub-ctranslate2
```
Decoder-only Transformer:
```python
# download a ctranslate2.Generator repo from the Hugging Face Hub (GPT-J, ...)
from hf_hub_ctranslate2 import GeneratorCT2fromHfHub

model_name_1 = "michaelfeil/ct2fast-pythia-160m"
model = GeneratorCT2fromHfHub(
    # load in int8 on CPU
    model_name_or_path=model_name_1, device="cpu", compute_type="int8"
)
outputs = model.generate(
    text=["How do you call a fast Flan-ingo?", "User: How are you doing?"]
    # add arguments specific to ctranslate2.Generator here
)
```
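The `compute_type="int8"` option stores the model weights as 8-bit integers to cut memory use. As a rough illustration of what symmetric int8 quantization does (a simplified sketch, not the exact CTranslate2 kernel), in numpy:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: max |w| maps to 127."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# rounding error is bounded by half a quantization step
print(np.abs(w - w_hat).max())
```

Real implementations typically quantize per row or per channel for better accuracy; this per-tensor version only conveys the idea.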
Encoder-Decoder:
```python
# download a ctranslate2.Translator repo from the Hugging Face Hub (T5, ...)
from hf_hub_ctranslate2 import TranslatorCT2fromHfHub

model_name_2 = "michaelfeil/ct2fast-flan-alpaca-base"
model = TranslatorCT2fromHfHub(
    # load in int8_float16 on CUDA
    model_name_or_path=model_name_2, device="cuda", compute_type="int8_float16"
)
outputs = model.generate(
    text=["How do you call a fast Flan-ingo?", "Translate to german: How are you doing?"],
    # arguments specific to ctranslate2.Translator:
    min_decoding_length=8,
    max_decoding_length=16,
    max_input_length=512,
    beam_size=3,
)
print(outputs)
```
Encoder-Decoder for multilingual translations (m2m-100):
```python
from transformers import AutoTokenizer
from hf_hub_ctranslate2 import MultiLingualTranslatorCT2fromHfHub

model = MultiLingualTranslatorCT2fromHfHub(
    model_name_or_path="michaelfeil/ct2fast-m2m100_418M", device="cpu", compute_type="int8",
    tokenizer=AutoTokenizer.from_pretrained("facebook/m2m100_418M")
)
outputs = model.generate(
    ["How do you call a fast Flamingo?", "Wie geht es dir?"],
    src_lang=["en", "de"],
    tgt_lang=["de", "fr"]
)
```
Encoder-only Sentence Transformers
Feel free to try out a newer repo that uses CTranslate2 for vector embeddings: https://github.com/michaelfeil/infinity
```python
from hf_hub_ctranslate2 import CT2SentenceTransformer

model_name_pytorch = "intfloat/e5-small"
model = CT2SentenceTransformer(
    model_name_pytorch, compute_type="int8", device="cuda"
)
embeddings = model.encode(
    ["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
    batch_size=32,
    convert_to_numpy=True,
    normalize_embeddings=True,
)
print(embeddings.shape, embeddings)
scores = (embeddings @ embeddings.T) * 100
```
Encoder-only (no longer recommended)
```python
from hf_hub_ctranslate2 import EncoderCT2fromHfHub

model_name = "michaelfeil/ct2fast-e5-small"
model = EncoderCT2fromHfHub(
    # load in int8_float16 on CUDA
    model_name_or_path=model_name,
    device="cuda",
    compute_type="int8_float16",
)
outputs = model.generate(
    text=["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
    max_length=64,
)
```
Latest release: hf_hub_ctranslate2 2.13.1, published on PyPI as a source distribution (hf_hub_ctranslate2-2.13.1.tar.gz) and a wheel (hf_hub_ctranslate2-2.13.1-py3-none-any.whl).