HuggingFace Plugin for Vision Agents

HuggingFace Inference integration for Vision Agents. Supports both text-only LLMs and vision language models (VLMs) through HuggingFace's Inference Providers API.

Installation

uv add "vision-agents[huggingface]"

Configuration

Set your HuggingFace API token:

export HF_TOKEN=your_huggingface_token
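A script can fail early with a clear message when the token is missing, rather than surfacing an authentication error on the first request. The sketch below uses only the standard library; `require_hf_token` is a hypothetical helper for illustration and is not part of the plugin.

```python
import os


def require_hf_token(env=None):
    """Return the token stored in HF_TOKEN, or raise a clear error.

    Hypothetical helper for illustration; not part of the plugin.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; export it before starting the agent.")
    return token
```

The returned value could then be passed as the `api_key` parameter described in the API Reference, although the plugin already reads HF_TOKEN by default.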

Usage

Text-only LLM

import asyncio

from vision_agents.plugins import huggingface


async def main():
    llm = huggingface.LLM(
        model="meta-llama/Meta-Llama-3-8B-Instruct",
        provider="together",  # optional: use "auto" or omit to let HuggingFace auto-select based on your settings
    )

    # simple_response is awaited, so it must run inside an async function
    response = await llm.simple_response("Hello, how are you?")
    print(response.text)


asyncio.run(main())

Vision Language Model (VLM)

from vision_agents.plugins import huggingface

vlm = huggingface.VLM(
    model="Qwen/Qwen2-VL-7B-Instruct",
    fps=1,
    frame_buffer_seconds=10,
)

# VLM automatically buffers video frames when used with an Agent
response = await vlm.simple_response("What do you see?")
print(response.text)

With Function Calling

from vision_agents.plugins import huggingface

llm = huggingface.LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")

@llm.register_function()
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"The weather in {city} is sunny."

response = await llm.simple_response("What's the weather in Paris?")
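The type hints and docstring on a registered function matter because tool-calling models are given a structured description of each function. How the plugin builds that description is not documented here; the sketch below shows one common approach using the standard `inspect` module. `describe_function` and the `_JSON_TYPES` mapping are hypothetical and are not the plugin's API.

```python
import inspect
from typing import get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_function(func):
    """Build a JSON-schema-style tool description from a function's signature."""
    hints = get_type_hints(func)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unknown types fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"The weather in {city} is sunny."


schema = describe_function(get_weather)
```

Under this approach, a precise docstring and accurate annotations give the model a better chance of calling the function with the right arguments.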

Supported Providers

HuggingFace's Inference Providers API supports multiple backends:

  • Together AI
  • Groq
  • Cerebras
  • Replicate
  • Fireworks
  • And more

Pass provider to pin a specific backend, or omit it (or pass "auto") to let HuggingFace auto-select:

llm = huggingface.LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    provider="groq",
)
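When the provider comes from configuration rather than being hard-coded, it can help to treat a missing value and "auto" the same way. The following is a minimal sketch; `llm_kwargs` is a hypothetical helper, and it assumes only the `model` and `provider` parameters documented in this README.

```python
def llm_kwargs(model, provider=None):
    """Build keyword arguments for huggingface.LLM.

    Hypothetical helper: leaves out `provider` when it is unset or "auto",
    so that HuggingFace picks the backend.
    """
    kwargs = {"model": model}
    if provider and provider != "auto":
        kwargs["provider"] = provider
    return kwargs
```

For example, `huggingface.LLM(**llm_kwargs("meta-llama/Meta-Llama-3-8B-Instruct", "groq"))` pins Groq, while passing `None` or `"auto"` leaves the choice to HuggingFace.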

API Reference

huggingface.LLM

Text-only language model integration.

Parameters:

  • model (str): HuggingFace model ID
  • api_key (str, optional): HuggingFace API token (defaults to HF_TOKEN env var)
  • provider (str, optional): Inference provider name

huggingface.VLM

Vision language model integration with video frame buffering.

Parameters:

  • model (str): HuggingFace model ID
  • api_key (str, optional): HuggingFace API token (defaults to HF_TOKEN env var)
  • provider (str, optional): Inference provider name
  • fps (int): Frames per second to buffer (default: 1)
  • frame_buffer_seconds (int): Seconds of video to buffer (default: 10)
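To see how `fps` and `frame_buffer_seconds` interact, it can help to think of the buffer as a sliding window of roughly `fps * frame_buffer_seconds` frames. The sketch below models that with a bounded `collections.deque`. This is an assumption made for illustration; the plugin's actual buffering may differ, and `make_frame_buffer` is a hypothetical name.

```python
from collections import deque


def make_frame_buffer(fps=1, frame_buffer_seconds=10):
    """Return a buffer that keeps only the most recent fps * seconds frames."""
    return deque(maxlen=fps * frame_buffer_seconds)


buffer = make_frame_buffer(fps=1, frame_buffer_seconds=10)
for frame_index in range(25):  # stand-in for 25 seconds of frames sampled at 1 fps
    buffer.append(frame_index)

print(len(buffer), buffer[0], buffer[-1])  # 10 15 24
```

With the defaults, such a window holds about 10 frames. Raising either value increases how many frames are available for each request.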
