llm-transformers
Plugin for llm adding support for 🤗 Hugging Face Transformers pipeline tasks.
Installation
Install this plugin in the same environment as LLM.
llm install llm-transformers
Some pipelines that accept audio/video inputs require the ffmpeg executable to be installed.
The document-question-answering pipeline uses pytesseract, which requires the tesseract executable.
Usage
This plugin exposes 🤗 Hugging Face transformers pipelines. The llm model name is transformers, and the pipeline task and/or Hugging Face model are specified as model options, e.g.:
$ llm -m transformers -o task text-generation "A dog has"
$ llm -m transformers -o model facebook/musicgen-small "techno music"
If only -o task <task> is specified, the default model for that task will be used.
If only -o model <model> is specified, the task will be inferred from the model.
If both are specified, the model must be compatible with the task.
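This mirrors how the 🤗 pipeline() factory itself resolves tasks and models. A minimal Python sketch of the equivalent direct calls (illustrative, not the plugin's actual code):

from transformers import pipeline

# Task only: transformers falls back to its default model for the task.
pipe = pipeline(task="text-generation")

# Model only: the task is read from the model's metadata on the Hub.
pipe = pipeline(model="facebook/musicgen-small")

# Both: pipeline() raises if the model does not support the task.
pipe = pipeline(task="text-to-audio", model="facebook/musicgen-small")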
Transformers logging is verbose and disabled by default.
Specify the -o verbose True model option to enable it.
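This presumably maps onto the transformers logging verbosity API, along these lines:

import transformers

# Silence the library's logging (roughly the plugin's default) ...
transformers.logging.set_verbosity_error()

# ... or restore informational output, roughly -o verbose True.
transformers.logging.set_verbosity_info()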
Most 🤗 Hugging Face models are freely accessible, but some require accepting a license agreement and using a Hugging Face API token that has access to the model.
You can use llm keys set huggingface, set the HF_TOKEN env var, or pass the --key option to llm.
$ llm -m transformers -o model meta-llama/Llama-3.2-1B "A dog has"
Error: You are trying to access a gated repo.
Make sure to have access to it at https://huggingface.co/meta-llama/Llama-3.2-1B.
$ llm --key hf_******************** -m transformers -o model meta-llama/Llama-3.2-1B "A dog has"
A dog has been named as the killer of a woman who was found dead in her home.
Some pipelines generate binary (audio, image, video) output; this is written to a temporary file and the path to the file is returned.
A specific file can be specified with the -o output <path.suffix> model option.
The suffix determines the file type (e.g. .png vs .jpg).
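For illustration, a sketch of how this could be done in plain Python, assuming the PIL image that the depth-estimation pipeline returns under its "depth" key:

import tempfile

from transformers import pipeline

pipe = pipeline(task="depth-estimation")
result = pipe("http://images.cocodataset.org/val2017/000000039769.jpg")

# The temporary file's suffix selects the image encoding,
# mirroring the -o output <path.suffix> behaviour described above.
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as f:
    result["depth"].save(f.name)
    print(f.name)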
Pipelines can be tuned by passing additional keyword arguments to the pipeline call.
These are specified as a JSON string in the -o kwargs '<json>' model option.
See the documentation for a specific pipeline for information on additional keyword arguments.
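In effect the JSON is decoded and splatted into the pipeline call. A minimal sketch using the summarization example shown later in the summarization section:

import json

from transformers import pipeline

pipe = pipeline(task="summarization")
kwargs = json.loads('{"min_length": 2, "max_length": 7}')

# The decoded keyword arguments are forwarded to the pipeline call.
print(pipe("An apple a day, keeps the doctor away", **kwargs))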
Transformer Pipeline Tasks
You can list available tasks with:
$ llm transformers list-tasks
audio-classification
The audio-classification task takes an audio URL or path, for example:
$ llm -m transformers -o task audio-classification https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/1.flac
_unknown_ (0.9972336888313293)
left (0.0019911774434149265)
yes (0.0003051063104066998)
down (0.0002108386834152043)
stop (0.00011406492558307946)
automatic-speech-recognition
$ llm -m transformers -o task automatic-speech-recognition https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/1.flac
HE HOPED THERE WOULD BE STEW FOR DINNER TURNIPS AND CARROTS AND BRUISED POTATOES AND FAT MUTTON PIECES TO BE LADLED OUT IN THICK PEPPERED FLOWER FAT AND SAUCE
depth-estimation
The depth-estimation task accepts an image URL or path as input and generates an image file as output:
$ llm -m transformers -o task depth-estimation http://images.cocodataset.org/val2017/000000039769.jpg
/var/folders/b1/1j9kkk053txc5krqbh0lj5t00000gn/T/tmpjvp9uo7x.png
document-question-answering
The document-question-answering task requires a context option, which is a file or URL to an image:
$ llm -m transformers -o task document-question-answering -o context https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png "What is the invoice number?"
us-001
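A minimal sketch of the direct pipeline equivalent (the image argument accepts a path, URL, or PIL image):

from transformers import pipeline

pipe = pipeline(task="document-question-answering")
result = pipe(
    image="invoice.png",  # path or URL to the document image
    question="What is the invoice number?",
)

# The pipeline returns a list of answer candidates with scores.
print(result[0]["answer"])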
feature-extraction
Not supported.
fill-mask
fill-mask requires a placeholder in the prompt; this is typically <mask> but differs between models:
$ llm -m transformers -o task fill-mask "My <mask> is about to explode"
My brain is about to explode (score=0.09140042215585709)
My heart is about to explode (score=0.07742168009281158)
My head is about to explode (score=0.05137857422232628)
My fridge is about to explode (score=0.029346412047743797)
My house is about to explode (score=0.02866862528026104)
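The right placeholder for a given model can be read from its tokenizer. A quick sketch:

from transformers import pipeline

pipe = pipeline(task="fill-mask")

# RoBERTa-style models use "<mask>", BERT-style models use "[MASK]".
print(pipe.tokenizer.mask_token)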
image-classification
$ llm -m transformers -o task image-classification https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png
macaw (0.9905233979225159)
African grey, African gray, Psittacus erithacus (0.005603480152785778)
toucan (0.001056905253790319)
sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita (0.0006811501225456595)
lorikeet (0.0006714339251630008)
image-feature-extraction
Not supported.
image-segmentation
$ llm -m transformers -o task image-segmentation https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png
/var/folders/b1/1j9kkk053txc5krqbh0lj5t00000gn/T/tmp0z8zvd8i.png (bird: 0.999439)
/var/folders/b1/1j9kkk053txc5krqbh0lj5t00000gn/T/tmpik_7r5qn.png (bird: 0.998787)
$ llm -m transformers -o task image-segmentation -o output /tmp/segment.png https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png
/tmp/segment-00.png (bird: 0.999439)
/tmp/segment-01.png (bird: 0.998787)
image-to-image
$ llm -m transformers -o task image-to-image http://images.cocodataset.org/val2017/000000039769.jpg
/var/folders/b1/1j9kkk053txc5krqbh0lj5t00000gn/T/tmpczogz6cb.png
image-to-text
$ llm -m transformers -o task image-to-text https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png
two birds are standing next to each other
mask-generation
Not supported.
object-detection
$ llm -m transformers -o task object-detection https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png
[
{
"score": 0.9966394901275635,
"label": "bird",
"box": {
"xmin": 69,
"ymin": 171,
"xmax": 396,
"ymax": 507
}
},
{
"score": 0.999381422996521,
"label": "bird",
"box": {
"xmin": 398,
"ymin": 105,
"xmax": 767,
"ymax": 507
}
}
]
question-answering
$ llm -m transformers -o task question-answering -o context "My name is Wolfgang and I live in Berlin" "Where do I live?"
Berlin
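This corresponds to the underlying pipeline's separate question and context arguments. A minimal sketch:

from transformers import pipeline

pipe = pipeline(task="question-answering")
result = pipe(
    question="Where do I live?",
    context="My name is Wolfgang and I live in Berlin",
)

# result is a dict with "score", "start", "end" and "answer".
print(result["answer"])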
summarization
Specify additional pipeline keyword args with the kwargs model option:
$ llm -m transformers -o task summarization "An apple a day, keeps the doctor away"
An apple a day, keeps the doctor away from your doctor away . An apple every day is an apple that keeps you from going to the doctor . The apple is the best way to keep your doctor from getting a doctor's orders, according to the author of The Daily Mail
$ llm -m transformers -o task summarization -o kwargs '{"min_length": 2, "max_length": 7}' "An apple a day, keeps the doctor away"
An apple a day
table-question-answering
table-question-answering takes a required context option, a path to a CSV file.
$ cat <<EOF > /tmp/t.csv
> Repository,Stars,Contributors,Programming language
> Transformers,36542,651,Python
> Datasets,4512,77,Python
> Tokenizers,3934,34,"Rust, Python and NodeJS"
> EOF
$ llm -m transformers -o task table-question-answering -o context /tmp/t.csv "How many stars does the transformers repository have?"
AVERAGE > 36542
$ llm -m transformers -o task table-question-answering -o context /tmp/t.csv "How many contributors do all Python language repositories have?"
SUM > 651, 77
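Presumably the CSV is loaded into a table of strings for the underlying pipeline. A sketch of the direct equivalent using pandas:

import pandas as pd
from transformers import pipeline

pipe = pipeline(task="table-question-answering")

# TAPAS-style models require every cell to be a string.
table = pd.read_csv("/tmp/t.csv").astype(str)

result = pipe(table=table, query="How many stars does the transformers repository have?")

# The answer string also encodes the aggregator, e.g. "AVERAGE > 36542".
print(result["answer"])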
text2text-generation
$ llm -m transformers -o task text2text-generation "question: What is 42 ? context: 42 is the answer to life, the universe and everything"
the answer to life, the universe and everything
text-classification
$ llm -m transformers -o task text-classification "We are very happy to show you the 🤗 Transformers library"
POSITIVE (0.9997681975364685)
text-generation
Some text-generation models also support interactive chat.
$ llm -m transformers -o task text-generation "I am going to elect"
I am going to elect the president of Mexico and that president should vote for our president," he said. "That's not very popular. That's not the American way. I would not want voters to accept the fact that that guy's running a
$ llm -m transformers -o task text-generation -o model HuggingFaceH4/zephyr-7b-beta -o kwargs '{"max_new_tokens": 2}' "What is the capital of France? Answer in one word."
Paris
$ llm chat -m transformers -o task text-generation -o model HuggingFaceH4/zephyr-7b-beta -o kwargs '{"max_new_tokens": 25}'
Chatting with transformers
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
> What is the capital of France?
The capital of France is Paris (French: Paris). The official name of the city is "Ville de Paris"
> What question did I just ask you?
Your question was: "What is the capital of France?"
> quit
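Recent transformers releases let the text-generation pipeline accept a list of chat messages and apply the model's chat template. A hedged sketch of that lower-level call:

from transformers import pipeline

pipe = pipeline(task="text-generation", model="HuggingFaceH4/zephyr-7b-beta")

messages = [{"role": "user", "content": "What is the capital of France?"}]
out = pipe(messages, max_new_tokens=25)

# With chat input, "generated_text" holds the whole conversation,
# ending with the newly generated assistant message.
print(out[0]["generated_text"][-1]["content"])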
text-to-audio
text-to-audio generates audio; the response is the path to the generated audio file.
$ llm -m transformers -o kwargs '{"generate_kwargs": {"max_new_tokens": 100}}' -o model facebook/musicgen-small "techno music"
/var/folders/b1/1j9kkk053txc5krqbh0lj5t00000gn/T/tmpoueh05y6.wav
$ llm -m transformers -o task text-to-audio "Hello world"
/var/folders/b1/1j9kkk053txc5krqbh0lj5t00000gn/T/tmpmpwhkd8p.wav
$ llm -m transformers -o task text-to-audio -o model facebook/mms-tts-eng -o output /tmp/speech.flac "Hello world"
/tmp/speech.flac
token-classification
$ llm -m transformers -o task token-classification "My name is Sarah and I live in London"
Sarah (I-PER: 0.9982994198799133)
London (I-LOC: 0.998397171497345)
translation_xx_to_yy
Substitute the from and to language codes into the task name, e.g. from en
to fr
would use task translation_en_to_fr
:
$ llm -m transformers -o task translation_en_to_fr "How old are you?"
quel âge êtes-vous?
video-classification
The video-classification task expects a video path or URL as the prompt:
$ llm -m transformers -o task video-classification https://huggingface.co/datasets/Xuehai/MMWorld/resolve/main/Amazing%20street%20dance%20performance%20from%20Futunity%20UK%20-%20Move%20It%202013/Amazing%20street%20dance%20performance%20from%20Futunity%20UK%20-%20Move%20It%202013.mp4
dancing ballet (0.006608937866985798)
spinning poi (0.006111182738095522)
air drumming (0.005756791681051254)
singing (0.005747966933995485)
punching bag (0.00565463537350297)
visual-question-answering
The visual-question-answering task requires a context option, a file or URL to an image:
$ llm -m transformers -o task visual-question-answering -o context https://huggingface.co/datasets/Narsil/image_dummy/raw/main/lena.png "What is she wearing?"
hat (0.9480269551277161)
fedora (0.00863664224743843)
clothes (0.003124270820990205)
sun hat (0.002937435172498226)
nothing (0.0020962499547749758)
zero-shot-classification
zero-shot-classification requires a comma separated list of labels to be specified in the context model option:
$ llm -m transformers -o task zero-shot-classification -o context "urgent,not urgent,phone,tablet,computer" "I have a problem with my iphone that needs to be resolved asap!!"
urgent (0.5036348700523376)
phone (0.4788002371788025)
computer (0.012600351125001907)
not urgent (0.0026557915844023228)
tablet (0.0023087668232619762)
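The comma separated context is presumably split into the pipeline's candidate labels. A minimal sketch of the direct call:

from transformers import pipeline

pipe = pipeline(task="zero-shot-classification")
labels = "urgent,not urgent,phone,tablet,computer".split(",")

result = pipe(
    "I have a problem with my iphone that needs to be resolved asap!!",
    candidate_labels=labels,
)

# "labels" and "scores" are sorted together, highest score first.
for label, score in zip(result["labels"], result["scores"]):
    print(label, score)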
zero-shot-image-classification
zero-shot-image-classification requires a comma separated list of labels to be specified in the context model option. The prompt is a path or URL to an image:
$ llm -m transformers -o task zero-shot-image-classification -o context "black and white,photorealist,painting" https://huggingface.co/datasets/Narsil/image_dummy/raw/main/parrots.png
black and white (0.9736384749412537)
photorealist (0.02141517587006092)
painting (0.004946451168507338)
zero-shot-audio-classification
zero-shot-audio-classification requires a comma separated list of labels to be specified in the context model option. The prompt is a path or URL to an audio file:
$ llm -m transformers -o task zero-shot-audio-classification -o context "Sound of a bird,Sound of a dog" https://huggingface.co/datasets/s3prl/Nonspeech/resolve/main/animal_sound/n52.wav
Sound of a bird (0.9998763799667358)
Sound of a dog (0.00012355657236184925)
zero-shot-object-detection
zero-shot-object-detection requires a comma separated list of labels to be specified in the context model option. The prompt is a path or URL to an image.
The response is JSON and includes a bounding box for each label:
$ llm -m transformers -o task zero-shot-object-detection -o context "cat,couch" http://images.cocodataset.org/val2017/000000039769.jpg
[
{
"score": 0.2868139445781708,
"label": "cat",
"box": {
"xmin": 324,
"ymin": 20,
"xmax": 640,
"ymax": 373
}
},
{
"score": 0.2537268102169037,
"label": "cat",
"box": {
"xmin": 1,
"ymin": 55,
"xmax": 315,
"ymax": 472
}
},
{
"score": 0.12082991003990173,
"label": "couch",
"box": {
"xmin": 4,
"ymin": 0,
"xmax": 642,
"ymax": 476
}
}
]
Development
To set up this plugin locally, first check out the code and install uv.
Use uv sync to create a venv and install dependencies, then run the tests and lint checks:
$ uv sync --dev
$ uv run pytest
$ uv run ruff check
$ uv run ruff format --check