Skip to main content

HuggingFace Inference integration for Vision Agents

Project description

HuggingFace Plugin for Vision Agents

HuggingFace integration for Vision Agents. Supports cloud-based inference via HuggingFace's Inference Providers API and local on-device inference via Transformers.

Installation

# Cloud inference (HuggingFace Inference API)
uv add "vision-agents[huggingface]"

# or directly
uv add vision-agents-plugins-huggingface

# Local inference (Transformers - LLM, VLM, object detection)
uv add "vision-agents-plugins-huggingface[transformers]"

# Local inference with quantization (4-bit / 8-bit)
uv add "vision-agents-plugins-huggingface[transformers-quantized]"

Cloud Inference (API-based)

Configuration

export HF_TOKEN=your_huggingface_token

Text-only LLM

from vision_agents.plugins import huggingface

llm = huggingface.LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    provider="together",  # or "groq", "cerebras", etc.
)

response = await llm.simple_response("Hello, how are you?")
print(response.text)

Vision Language Model (VLM)

from vision_agents.plugins import huggingface

vlm = huggingface.VLM(
    model="Qwen/Qwen2-VL-7B-Instruct",
    fps=1,
    frame_buffer_seconds=10,
)

response = await vlm.simple_response("What do you see?")
print(response.text)

Local Inference (Transformers)

Runs models directly on your hardware (GPU/CPU/MPS). Requires the [transformers] extra.

Local LLM

from vision_agents.plugins import huggingface

llm = huggingface.TransformersLLM(
    model="meta-llama/Llama-3.2-3B-Instruct",
)


@llm.register_function()
async def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"The weather in {city} is sunny."


response = await llm.simple_response("What's the weather in Paris?")

Supported Providers

With 4-bit quantization (~4x memory reduction)

llm = huggingface.TransformersLLM( model="meta-llama/Llama-3.2-3B-Instruct", quantization="4bit", )


**Parameters:**

- `model` (str): HuggingFace model ID
- `device`: `"auto"`, `"cuda"`, `"mps"`, or `"cpu"`
- `quantization`: `"none"`, `"4bit"`, or `"8bit"`
- `torch_dtype`: `"auto"`, `"float16"`, `"bfloat16"`, or `"float32"`
- `max_new_tokens` (int): Max tokens per response (default: 512)

### Local VLM

```python
from vision_agents.plugins import huggingface

vlm = huggingface.TransformersVLM(
    model="Qwen/Qwen2-VL-2B-Instruct",
)

Parameters:

  • model (str): HuggingFace model ID
  • device: "auto", "cuda", "mps", or "cpu"
  • quantization: "none", "4bit", or "8bit"
  • fps (int): Frames per second to capture (default: 1)
  • frame_buffer_seconds (int): Seconds of video to buffer (default: 10)
  • max_frames (int): Max frames per inference (default: 4)

Local Object Detection

Runs detection models like RT-DETRv2 on video frames and emits DetectionCompletedEvent with bounding boxes.

from vision_agents.core import Agent
from vision_agents.plugins import huggingface

processor = huggingface.TransformersDetectionProcessor(
    model="PekingU/rtdetr_v2_r101vd",
    conf_threshold=0.5,
    fps=5,
)

agent = Agent(processors=[processor], ...)

@agent.events.subscribe
async def on_detection(event: huggingface.DetectionCompletedEvent):
    for obj in event.objects:
        print(f"{obj['label']} ({obj['confidence']:.0%})")

Parameters:

  • model (str): HuggingFace model ID (default: "PekingU/rtdetr_v2_r101vd")
  • conf_threshold (float): Confidence threshold 0-1 (default: 0.5)
  • fps (int): Frame processing rate (default: 10)
  • classes (list[str], optional): Filter to specific class names
  • device: "auto", "cuda", "mps", or "cpu"
  • annotate (bool): Draw bounding boxes on output video (default: True)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vision_agents_plugins_huggingface-0.4.4.tar.gz (19.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

File details

Details for the file vision_agents_plugins_huggingface-0.4.4.tar.gz.

File metadata

  • Download URL: vision_agents_plugins_huggingface-0.4.4.tar.gz
  • Upload date:
  • Size: 19.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.6 {"installer":{"name":"uv","version":"0.10.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for vision_agents_plugins_huggingface-0.4.4.tar.gz
Algorithm Hash digest
SHA256 d8d6deb31a84bddaa7d270d56bb474b6c1c0acc3965f83179b21b252904b6c75
MD5 1ef0503e63d7b64a711fadd3c782563c
BLAKE2b-256 36c17f0e970dbc5fa354db5487fb5ac14a8ed3f165d758a186c9a18c788da8da

See more details on using hashes here.

File details

Details for the file vision_agents_plugins_huggingface-0.4.4-py3-none-any.whl.

File metadata

  • Download URL: vision_agents_plugins_huggingface-0.4.4-py3-none-any.whl
  • Upload date:
  • Size: 32.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.6 {"installer":{"name":"uv","version":"0.10.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for vision_agents_plugins_huggingface-0.4.4-py3-none-any.whl
Algorithm Hash digest
SHA256 c352937bfd84666780c2e7bf046adf308726877515146a2f4660c2e2468c2ec5
MD5 7e3b2f5645eb67428f646f5760e9537b
BLAKE2b-256 c4f62842933d70de3d7e632a3eb5c2c17a1a2306747f2e65ec29d861df476d2a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page