Zu HuggingFace adapter: task models as typed tools/detectors/validators, behind the supply-chain guards
Project description
zu-huggingface
HuggingFace models behind Zu's typed ports. HuggingFace is not a model — it is the largest hub of open models across every modality — so "supporting it" means three different things, and this package draws the line cleanly (Engineering Design §8.3–8.5).
Chat / vision-language models as the policy — no code here
A chat or vision-language model that is the brain speaks the OpenAI chat API
on all three HuggingFace serving surfaces (the router's /v1, a dedicated
Endpoint's /v1, or a local vLLM server). So a HuggingFace model as the policy
is the existing openai-compatible provider pointed at a HuggingFace base URL —
the OpenRouter story exactly, no new adapter:
# agent.yaml — a HuggingFace multimodal model as the policy
model: meta-llama/Llama-Vision-... # any chat / VLM id on the Hub
provider: openai-compatible
options:
base_url: https://router.huggingface.co/v1 # or an Endpoint, or local vLLM
api_key_env: HF_TOKEN
The three serving surfaces are one adapter + config — only the base_url
changes (the path is always <base_url>/chat/completions):
| Surface | base_url |
|---|---|
| Inference Providers router | https://router.huggingface.co/v1 |
| Dedicated Inference Endpoint | https://<id>.<region>.aws.endpoints.huggingface.cloud/v1 |
| Local vLLM | http://localhost:8000/v1 |
A VLM policy (an image in the chat request) rides the same adapter+config:
a multimodal content list ({type:"text"} + {type:"image_url", image_url:{url: "data:<mime>;base64,…"}}) passes straight through to the wire. This is proven
offline by zu-providers/tests/test_hf_router_policy.py (an httpx.MockTransport
asserting the request path, the Bearer from HF_TOKEN, the body, and that the
response parses identically across all three base URLs — no live call).
Task models as tools, detectors, validators — this package
Most HuggingFace models are not chat models (OCR, ASR, detection, embeddings, classification, …), so they enter through the non-policy ports by their role (the port is the role, assigned per agent — §4.5):
| Role | Class | Task |
|---|---|---|
| Tool | Transcribe (hf_transcribe) |
speech → text (ASR) |
| Tool | ImageToText (hf_image_to_text) |
image → text (OCR / caption) |
| Tool | DetectObjects (hf_detect) |
image → labelled boxes |
| Tool | Embed (hf_embed) |
text → vector (retrieval / grounding) |
| Tool | Classify (hf_classify) |
text → labels |
| Tool | ZeroShotClassify (hf_zero_shot) |
text + labels → scores |
| Tool | Summarize (hf_summarize) |
text → text |
| Tool | Translate (hf_translate) |
text → text |
| Tool | SegmentImage (hf_segment) |
image → labelled masks |
| Tool | EstimateDepth (hf_depth) |
image → depth map (base64 PNG) |
| Tool | AskDocument (hf_doc_qa) |
document image + question → answer |
| Tool | AskImage (hf_vqa) |
image + question → answer (VQA) |
| Tool | Speak (hf_speak) |
text → audio (base64 WAV) |
| Tool | ClassifyAudio (hf_classify_audio) |
audio → labels (same shape as Classify) |
| Tool | VlmDescribe (hf_vlm) |
image + text prompt → text (VLM-as-tool) |
| Tool | AskTable (hf_table_qa) |
table + question → answer |
| Tool | ClassifyTable (hf_tabular_classify) |
rows → label per row (hosted-only) |
| Tool | PredictTable (hf_tabular_regress) |
rows → number per row (hosted-only) |
| Detector | HfClassifierDetector |
classify an observation → ESCALATE/stop |
| Validator | HfClassifierValidator |
classify the result → fail/RETRY |
VLM-as-tool. VlmDescribe exposes a vision-language model's vision as a
verb (not the policy): a text policy can call hf_vlm(image, prompt) to
get a description/answer about a picture and then reason over it. It rides the
client's image_text_to_text path — a multimodal chat call hosted (a text +
image_url data-URL message), an image-text-to-text pipeline local — over the
one HfClient seam, exactly like every other tool.
Tabular (ClassifyTable/PredictTable) is hosted-only: tabular models
are sklearn/tabular-backed on the Hub and served via the Inference API, so the
local PipelineBackend raises a clear hosted-only error rather than fetch a
model (it therefore cannot bypass the supply-chain guard).
Each is parameterised by a model id (and the role wrappers by the labels that matter), so they are wired by reference in config per agent rather than as zero-config entry points:
tools:
- ref: zu_huggingface.tools:Transcribe
args: { model: openai/whisper-large-v3 }
- ref: zu_huggingface.tools:Embed
args: { model: BAAI/bge-large-en-v1.5 }
detectors:
- ref: zu_huggingface.roles:HfClassifierDetector
args: { model: facebook/bart-large-mnli, candidate_labels: ["safe","unsafe"], escalate_on: ["unsafe"] }
The typed multimodal Content (Text/Image/Audio) from zu_core.content
is the currency in and out — which is what lets a non-chat model slot into the
loop as cleanly as a chat one.
Hosted vs local — one seam
Every tool depends only on the HfClient seam, so the same tool works:
- Hosted —
InferenceClientBackendwrapshuggingface_hub.InferenceClient(the serverless router or a dedicated Endpoint). Egresses torouter.huggingface.co;HF_TOKENis read from the environment inside the backend.pip install 'zu-huggingface[hosted]'. - Local —
PipelineBackendwrapstransformers.pipelinefor the air-gapped / on-prem case. Reaches no network. Every pipeline is built through the supply-chain guards.pip install 'zu-huggingface[local]'(plus a backend such astorch).
The supply chain — safe by default (§8.3)
Pulling a model from the Hub is a supply-chain surface. supply_chain.py
enforces, by default:
- Pin + hash. A
ModelPinshould carry a full commit-sharevision;verify_file_hashchecks a downloaded file's sha256. - safetensors, not pickle.
verify_model_sourcerejects.bin/.pt/.ckptcheckpoints (which execute on deserialisation) unless explicitly allowed. - No remote code.
safe_pipeline_kwargsforcestrust_remote_code=False;assert_no_remote_coderaises if it is relaxed.
The safe configuration is the default — there is nothing to turn on to be safe, only flags a reviewed case may relax.
Tests
Offline, no network, no model download: the tools and role wrappers are
exercised against a fake HfClient, and the supply-chain guards are pure.
uv run pytest packages/zu-huggingface.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file zu_huggingface-0.7.0.tar.gz.
File metadata
- Download URL: zu_huggingface-0.7.0.tar.gz
- Upload date:
- Size: 26.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8c72e91e9e59b9afac517251da76c61369dd9153cc731a7d44041f439d2f01c7
|
|
| MD5 |
e221ee13dfcb917fb490aa1b66f0237a
|
|
| BLAKE2b-256 |
e876e81ff4da288159b5942125ed3ea00e1b0e7fcf07c74bde2c130020786175
|
File details
Details for the file zu_huggingface-0.7.0-py3-none-any.whl.
File metadata
- Download URL: zu_huggingface-0.7.0-py3-none-any.whl
- Upload date:
- Size: 20.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f17a8ba499c7bb1014b0e73af8774e1cedfce6b3b64ba77c560ac8b20a4febac
|
|
| MD5 |
52b10053d58c9bd27489f6e3079ad4bf
|
|
| BLAKE2b-256 |
94f6ff33766dfeb0beebf57bff418bd733f87a1a1393d0f6433e873c0f541178
|