Skip to main content

HuggingFace Inference integration for Vision Agents

Project description

HuggingFace Plugin for Vision Agents

HuggingFace integration for Vision Agents. Supports cloud-based inference via HuggingFace's Inference Providers API and local on-device inference via Transformers.

Installation

# Cloud inference (HuggingFace Inference API)
uv add "vision-agents[huggingface]"

# or directly
uv add vision-agents-plugins-huggingface

# Local inference (Transformers - LLM, VLM, object detection)
uv add "vision-agents-plugins-huggingface[transformers]"

# Local inference with quantization (4-bit / 8-bit)
uv add "vision-agents-plugins-huggingface[transformers-quantized]"

Cloud Inference (API-based)

Configuration

export HF_TOKEN=your_huggingface_token

Text-only LLM

from vision_agents.plugins import huggingface

llm = huggingface.LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    provider="together",  # or "groq", "cerebras", etc.
)

response = await llm.simple_response("Hello, how are you?")
print(response.text)

Vision Language Model (VLM)

from vision_agents.plugins import huggingface

vlm = huggingface.VLM(
    model="Qwen/Qwen2-VL-7B-Instruct",
    fps=1,
    frame_buffer_seconds=10,
)

response = await vlm.simple_response("What do you see?")
print(response.text)

Local Inference (Transformers)

Runs models directly on your hardware (GPU/CPU/MPS). Requires the [transformers] extra.

Local LLM

from vision_agents.plugins import huggingface

llm = huggingface.TransformersLLM(
    model="meta-llama/Llama-3.2-3B-Instruct",
)


@llm.register_function()
async def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"The weather in {city} is sunny."


response = await llm.simple_response("What's the weather in Paris?")

Supported Providers

With 4-bit quantization (~4x memory reduction)

llm = huggingface.TransformersLLM( model="meta-llama/Llama-3.2-3B-Instruct", quantization="4bit", )


**Parameters:**

- `model` (str): HuggingFace model ID
- `device`: `"auto"`, `"cuda"`, `"mps"`, or `"cpu"`
- `quantization`: `"none"`, `"4bit"`, or `"8bit"`
- `torch_dtype`: `"auto"`, `"float16"`, `"bfloat16"`, or `"float32"`
- `max_new_tokens` (int): Max tokens per response (default: 512)

### Local VLM

```python
from vision_agents.plugins import huggingface

vlm = huggingface.TransformersVLM(
    model="Qwen/Qwen2-VL-2B-Instruct",
)

Parameters:

  • model (str): HuggingFace model ID
  • device: "auto", "cuda", "mps", or "cpu"
  • quantization: "none", "4bit", or "8bit"
  • fps (int): Frames per second to capture (default: 1)
  • frame_buffer_seconds (int): Seconds of video to buffer (default: 10)
  • max_frames (int): Max frames per inference (default: 4)

Local Object Detection

Runs detection models like RT-DETRv2 on video frames and emits DetectionCompletedEvent with bounding boxes.

from vision_agents.core import Agent
from vision_agents.plugins import huggingface

processor = huggingface.TransformersDetectionProcessor(
    model="PekingU/rtdetr_v2_r101vd",
    conf_threshold=0.5,
    fps=5,
)

agent = Agent(processors=[processor], ...)

@agent.events.subscribe
async def on_detection(event: huggingface.DetectionCompletedEvent):
    for obj in event.objects:
        print(f"{obj['label']} ({obj['confidence']:.0%})")

Parameters:

  • model (str): HuggingFace model ID (default: "PekingU/rtdetr_v2_r101vd")
  • conf_threshold (float): Confidence threshold 0-1 (default: 0.5)
  • fps (int): Frame processing rate (default: 10)
  • classes (list[str], optional): Filter to specific class names
  • device: "auto", "cuda", "mps", or "cpu"
  • annotate (bool): Draw bounding boxes on output video (default: True)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vision_agents_plugins_huggingface-0.4.6.tar.gz (19.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

File details

Details for the file vision_agents_plugins_huggingface-0.4.6.tar.gz.

File metadata

  • Download URL: vision_agents_plugins_huggingface-0.4.6.tar.gz
  • Upload date:
  • Size: 19.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.6 {"installer":{"name":"uv","version":"0.10.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for vision_agents_plugins_huggingface-0.4.6.tar.gz
Algorithm Hash digest
SHA256 2074ec5474412e7cb29eb1da527df6fd81375b37bc4afc6862316e7f89e23517
MD5 492f28d39782dbbe8593d3a4b463ff36
BLAKE2b-256 0236f8a70e05bc3bca22231296057376dc811decf53fc03bc3a8b8d6da281bee

See more details on using hashes here.

File details

Details for the file vision_agents_plugins_huggingface-0.4.6-py3-none-any.whl.

File metadata

  • Download URL: vision_agents_plugins_huggingface-0.4.6-py3-none-any.whl
  • Upload date:
  • Size: 32.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.6 {"installer":{"name":"uv","version":"0.10.6","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"macOS","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for vision_agents_plugins_huggingface-0.4.6-py3-none-any.whl
Algorithm Hash digest
SHA256 7c67bdc8b7eb4c8708ff91f7c54b9df28546356e6e88e5d445b022a29e7f8a94
MD5 99cabfce45d96b26a0f536cc534fa957
BLAKE2b-256 733115c537790a5e432d428a6ceb22c24415e5be4d17313348c33e31cf7d3530

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page