Abstractions for AI tasks following the Open Inference Protocol v2, with a HuggingFace Transformers reference implementation

Task Inference

A Python library that provides task-oriented abstractions for AI inference, bridging HuggingFace Transformers pipeline tasks with the Open Inference Protocol v2 (OIP v2) tensor format.

Key Design

Each task is modelled in three layers:

Layer	Role
Protocol (`protocol/v2.py`)	Pydantic models for OIP v2 `InferenceRequest` / `InferenceResponse`
Task (`tasks/`)	Domain-specific input/output Pydantic schemas; `process` left abstract
Implementation (`implementations/transformers/`)	HuggingFace Transformers reference backend that fulfils `process`

This separation allows swapping backends (ONNX, TensorRT, REST endpoint, …) without changing the task schema or OIP conversion logic.
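
The layering can be pictured with a small stand-alone sketch. It uses only the standard library, and every name in it (`Request`, `Response`, `TextLengthTask`, the two backends) is hypothetical and chosen for illustration; it is not the library's actual code, only one way the protocol / task / implementation split might look.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical stand-in for the protocol layer's InferenceRequest."""
    inputs: dict


@dataclass
class Response:
    """Hypothetical stand-in for the protocol layer's InferenceResponse."""
    outputs: dict


class TextLengthTask(ABC):
    """Task layer: owns the schema conversion and leaves `process` abstract."""

    def __call__(self, request: Request) -> Response:
        text = request.inputs["text"]                 # protocol -> domain value
        length = self.process(text)                   # backend-specific work
        return Response(outputs={"length": length})   # domain value -> protocol

    @abstractmethod
    def process(self, text: str) -> int:
        ...


class CharBackend(TextLengthTask):
    """Implementation layer, variant 1: counts characters."""

    def process(self, text: str) -> int:
        return len(text)


class WordBackend(TextLengthTask):
    """Implementation layer, variant 2: counts whitespace-separated words."""

    def process(self, text: str) -> int:
        return len(text.split())


request = Request(inputs={"text": "open inference protocol"})
char_result = CharBackend()(request)
word_result = WordBackend()(request)
print(char_result.outputs, word_result.outputs)
```

Both backends accept the same request and return the same response shape; only `process` differs, which is the property that makes swapping backends cheap.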

Input / Output schemas

Every task exposes a pair of Pydantic models (e.g. `ImageClassificationInput` / `ImageClassificationOutput`). These models own the conversion to and from OIP v2 tensors:

Method	Direction
`XxxInput.to_inference_request()`	Python input → `InferenceRequest`
`XxxInput.from_inference_request(request)`	`InferenceRequest` → Python input
`XxxOutput.to_inference_response(model_name)`	Python output → `InferenceResponse`
`XxxOutput.from_inference_response(response)`	`InferenceResponse` → Python output

The `process(inputs: XxxInput) -> XxxOutput` method on task classes works entirely with these domain objects, so backend code never has to handle raw OIP tensors.
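
To make the conversion concrete, here is a minimal sketch of an input model that round-trips through an OIP v2 style request. The tensor fields (`name`, `shape`, `datatype`, `data`) and the `BYTES` / `INT64` datatypes come from the OIP v2 specification; the tensor names, the base64 encoding of the image, and the class `ClassificationInputSketch` itself are assumptions made for illustration, and the library's real tensor layout may differ.

```python
import base64
from dataclasses import dataclass


@dataclass
class ClassificationInputSketch:
    """Hypothetical input schema; not the library's ImageClassificationInput."""
    image: bytes
    top_k: int = 5

    def to_inference_request(self) -> dict:
        # One tensor per field. Encoding the image as base64 text is an assumption.
        return {
            "inputs": [
                {
                    "name": "image",
                    "shape": [1],
                    "datatype": "BYTES",
                    "data": [base64.b64encode(self.image).decode("ascii")],
                },
                {
                    "name": "top_k",
                    "shape": [1],
                    "datatype": "INT64",
                    "data": [self.top_k],
                },
            ]
        }

    @classmethod
    def from_inference_request(cls, request: dict) -> "ClassificationInputSketch":
        # Index tensors by name so the order in the request does not matter.
        tensors = {t["name"]: t for t in request["inputs"]}
        return cls(
            image=base64.b64decode(tensors["image"]["data"][0]),
            top_k=int(tensors["top_k"]["data"][0]),
        )


original = ClassificationInputSketch(image=b"\x89PNG fake bytes", top_k=3)
request = original.to_inference_request()
restored = ClassificationInputSketch.from_inference_request(request)
```

Because the schema owns both directions, `restored` compares equal to `original`, and code that only sees the domain object never needs to know how the tensors were laid out.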

Supported Tasks

Vision

Task	Class	Default Model
Image Classification	`TransformersImageClassificationTask`	`google/vit-base-patch16-224`
Image Segmentation	`TransformersImageSegmentationTask`	`facebook/mask2former-swin-large-coco-panoptic`
Object Detection	`TransformersObjectDetectionTask`	`facebook/detr-resnet-50`
Depth Estimation	`TransformersDepthEstimationTask`	`Intel/dpt-large`
Mask Generation	`TransformersMaskGenerationTask`	`facebook/sam-vit-base`
Visual Question Answering	`TransformersVQATask`	`dandelin/vilt-b32-finetuned-vqa`
Image Anonymization	`TransformersImageAnonymizationTask`	`hustvl/yolos-tiny`

Audio

Task	Class	Default Model
Automatic Speech Recognition	`TransformersASRTask`	`openai/whisper-base`
Audio Classification	`TransformersAudioClassificationTask`	`superb/wav2vec2-base-superb-ks`

Installation

# Core (schemas + OIP v2 protocol only)
pip install task-inference

# With HuggingFace Transformers backend
pip install "task-inference[transformers]"

# With all optional extras, including audio support
pip install "task-inference[all]"

Quick Start

Via the factory (recommended)

from task_inference import create_task
from task_inference.tasks.vision.image_classification import (
    ImageClassificationInput,
    ImageClassificationOutput,
)

# Read the image as raw bytes
with open("cat.jpg", "rb") as f:
    image_bytes = f.read()

# Build the task through the factory
task = create_task(
    backend="transformers",
    task_name="image-classification",
    model_name="google/vit-base-patch16-224",
    model_params={"device": "cpu"},
)

# Convert the input to an OIP v2 request, run it, and parse the response
inp = ImageClassificationInput(image=image_bytes, top_k=3)
resp = task(inp.to_inference_request())
result = ImageClassificationOutput.from_inference_response(resp)
for r in result.results:
    print(r.label, r.score)

`model_params` is forwarded directly to the backend constructor, so any backend-specific keyword argument (e.g. `device`, `chunk_length_s`, `points_per_batch`) can be passed here.

To discover what backends and task names are available:

from task_inference import supported_tasks

print(supported_tasks())
# {'transformers': ['audio-classification', 'automatic-speech-recognition', ...]}

Direct instantiation

from task_inference.implementations.transformers.vision import (
    TransformersImageClassificationTask,
)
from task_inference.tasks.vision.image_classification import (
    ImageClassificationInput,
    ImageClassificationOutput,
)

# Read the image as raw bytes
with open("cat.jpg", "rb") as f:
    image_bytes = f.read()

task = TransformersImageClassificationTask(model_name="google/vit-base-patch16-224")

# Call via OIP v2 round-trip
inp = ImageClassificationInput(image=image_bytes, top_k=3)
resp = task(inp.to_inference_request())
result = ImageClassificationOutput.from_inference_response(resp)
for r in result.results:
    print(r.label, r.score)

# Convenience wrapper: builds the input from keyword arguments and returns an InferenceResponse
resp = task.run(image=image_bytes, top_k=3)
result = ImageClassificationOutput.from_inference_response(resp)

OIP v2 round-trip

The input/output models handle all serialisation, so you can integrate with any OIP v2-compatible server without touching the task implementation:

from task_inference.tasks.vision.image_classification import (
    ImageClassificationInput,
    ImageClassificationOutput,
)

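# --- Client side ---
# `image_bytes` holds the raw contents of an image file (see Quick Start)
inputs  = ImageClassificationInput(image=image_bytes, top_k=3)
request = inputs.to_inference_request()   # → InferenceRequest (send over HTTP)

# --- Server side ---
# `task` is any image-classification task instance, e.g. one built with create_task()
response = task(request)                  # returns InferenceResponse directly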

# --- Client side (parse response) ---
output = ImageClassificationOutput.from_inference_response(response)
for r in output.results:
    print(r.label, r.score)
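
For the "send over HTTP" step, the OIP v2 REST binding defines `POST /v2/models/{model_name}/infer` with a JSON body. The sketch below only builds such a request with the standard library and does not send it. The helper `build_infer_request`, the server URL, and the model name are hypothetical; the payload is written out by hand here, whereas in practice it would presumably be produced by serialising the `InferenceRequest` model.

```python
import json
import urllib.request


def build_infer_request(base_url: str, model_name: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an OIP v2 REST inference request."""
    url = f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hand-written OIP v2 style payload; the tensor name is an assumption.
payload = {
    "inputs": [
        {"name": "top_k", "shape": [1], "datatype": "INT64", "data": [3]},
    ]
}

# Hypothetical server address and model name.
http_request = build_infer_request("http://localhost:8080/", "image-classifier", payload)

# urllib.request.urlopen(http_request) would send it; omitted so the sketch runs offline.
print(http_request.get_method(), http_request.full_url)
```

The JSON returned by a compliant server has the `InferenceResponse` shape, which is what `XxxOutput.from_inference_response` is described as parsing.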

Factory reference

`create_task(backend, task_name, model_name=None, model_params=None)`

Parameter	Type	Description
`backend`	`str`	Backend name; currently `"transformers"`
`task_name`	`str`	Task identifier (see the task name reference below)
`model_name`	`str | None`	HuggingFace model id or local path. When `None`, the backend's built-in default model is used.
`model_params`	`dict | None`	Extra keyword arguments passed to the backend constructor (e.g. `device`, `chunk_length_s`)

Raises `ValueError` for unknown backends or task names.

`supported_tasks(backend=None)`

Returns a `dict[str, list[str]]` mapping each backend to its supported task names. Pass a backend name to restrict the result to that backend.
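
The behaviour described in this reference (lookup by backend and task name, `ValueError` for unknown names, `model_params` forwarded as keyword arguments) is typical of a registry-based factory. The following is a minimal sketch of that pattern, not the library's implementation: `_REGISTRY`, `register`, `create_task_sketch`, `supported_tasks_sketch`, and `DummyTask` are all hypothetical names, and returning an empty dict for an unknown backend filter is an assumption.

```python
from typing import Any, Callable, Optional

# backend name -> task name -> constructor (hypothetical registry layout)
_REGISTRY: dict = {}


def register(backend: str, task_name: str, factory: Callable[..., Any]) -> None:
    """Record a constructor under a (backend, task_name) pair."""
    _REGISTRY.setdefault(backend, {})[task_name] = factory


def create_task_sketch(
    backend: str,
    task_name: str,
    model_name: Optional[str] = None,
    model_params: Optional[dict] = None,
) -> Any:
    if backend not in _REGISTRY:
        raise ValueError(f"Unknown backend: {backend!r}")
    if task_name not in _REGISTRY[backend]:
        raise ValueError(f"Unknown task {task_name!r} for backend {backend!r}")
    kwargs = dict(model_params or {})
    if model_name is not None:
        # Only pass model_name when given, so the constructor's default applies otherwise.
        kwargs["model_name"] = model_name
    return _REGISTRY[backend][task_name](**kwargs)


def supported_tasks_sketch(backend: Optional[str] = None) -> dict:
    """Map each backend (or only the requested one) to its sorted task names."""
    return {
        name: sorted(tasks)
        for name, tasks in _REGISTRY.items()
        if backend is None or name == backend
    }


class DummyTask:
    """Hypothetical backend class with a built-in default model."""

    def __init__(self, model_name: str = "default-model", device: str = "cpu"):
        self.model_name = model_name
        self.device = device


register("dummy", "image-classification", DummyTask)
task = create_task_sketch("dummy", "image-classification", model_params={"device": "cuda"})
print(task.model_name, task.device)
```

Leaving `model_name` out of the keyword arguments when it is `None` is one simple way to get the "built-in default model" behaviour without the factory having to know each backend's default.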

Task name reference

Task name	Input class	Output class
`image-classification`	`ImageClassificationInput`	`ImageClassificationOutput`
`object-detection`	`ObjectDetectionInput`	`ObjectDetectionOutput`
`depth-estimation`	`DepthEstimationInput`	`DepthEstimationOutput`
`image-segmentation`	`ImageSegmentationInput`	`ImageSegmentationOutput`
`image-anonymization`	`ImageAnonymizationInput`	`ImageAnonymizationOutput`
`mask-generation`	`MaskGenerationInput`	`MaskGenerationOutput`
`visual-question-answering`	`VQAInput`	`VQAOutput`
`image-text-to-text`	`ImageTextToTextInput`	`ImageTextToTextOutput`
`zero-shot-image-classification`	`ZeroShotImageClassificationInput`	`ZeroShotImageClassificationOutput`
`zero-shot-object-detection`	`ZeroShotObjectDetectionInput`	`ZeroShotObjectDetectionOutput`
`audio-classification`	`AudioClassificationInput`	`AudioClassificationOutput`
`automatic-speech-recognition`	`ASRInput`	`ASROutput`

Project Structure

src/task_inference/
├── factory.py          # create_task() / supported_tasks() entry points
├── protocol/           # OIP v2 Pydantic models
│   └── v2.py
├── tasks/              # Abstract task definitions + input/output schemas
│   ├── base.py
│   ├── vision/         # Image-based tasks
│   └── audio/          # Audio-based tasks
├── implementations/
│   ├── transformers/   # HuggingFace reference backend
│   │   ├── base.py     # Shared image/audio helpers
│   │   ├── vision/
│   │   └── audio/
│   └── onnxruntime/    # ONNX Runtime backend
│       ├── base.py     # Shared ORT helpers
│       ├── vision/
│       ├── audio/
│       └── adapters/   # Dialect adapters (auto-detected from model I/O)
└── utils.py            # Image/audio encode-decode helpers

Extending

Implement a new backend by subclassing the relevant task and overriding `process`:

from task_inference.tasks.vision.image_classification import (
    ImageClassificationInput,
    ImageClassificationOutput,
    ImageClassificationTask,
)

class MyOnnxImageClassificationTask(ImageClassificationTask):
    def process(self, inputs: ImageClassificationInput) -> ImageClassificationOutput:
        # your ONNX / TensorRT / remote-endpoint logic here
        ...

ONNX Runtime adapters

The built-in `onnxruntime` backend supports multiple model families through a dialect-adapter layer that auto-detects the correct tensor contract from the model's I/O tensor names at load time. See `docs/onnx-adapters.md` for:

- The full dialect reference (tensor signatures, detection rules) for all 14 built-in dialects
- Model acquisition instructions (`optimum-cli` / `torch.onnx.export`) for each task
- Step-by-step guide for adding a custom adapter dialect
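
Detection from tensor names can be sketched as a subset match against a list of known signatures. Everything below is illustrative: the dialect names, the tensor names, and the "most specific first" ordering are assumptions, and the real rules are the ones documented in `docs/onnx-adapters.md`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DialectSketch:
    """Hypothetical dialect signature: tensor names a model must expose."""
    name: str
    required_inputs: frozenset
    required_outputs: frozenset


# Ordered from most to least specific so that the first match wins.
DIALECTS = [
    DialectSketch("detector-like", frozenset({"pixel_values"}), frozenset({"logits", "pred_boxes"})),
    DialectSketch("classifier-like", frozenset({"pixel_values"}), frozenset({"logits"})),
]


def detect_dialect(input_names, output_names) -> DialectSketch:
    """Return the first dialect whose required tensor names are all present."""
    inputs, outputs = set(input_names), set(output_names)
    for dialect in DIALECTS:
        if dialect.required_inputs <= inputs and dialect.required_outputs <= outputs:
            return dialect
    raise ValueError(
        f"No dialect matches inputs={sorted(inputs)} outputs={sorted(outputs)}"
    )


# With ONNX Runtime, the names could be read from a loaded session, e.g.
# [i.name for i in session.get_inputs()] and [o.name for o in session.get_outputs()].
classifier = detect_dialect(["pixel_values"], ["logits"])
detector = detect_dialect(["pixel_values"], ["logits", "pred_boxes"])
print(classifier.name, detector.name)
```

Ordering matters in this sketch because the classifier signature is a subset of the detector signature; checking the more specific dialect first keeps a detector from being misread as a classifier.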

Security Policy

The current release is the supported version. Security fixes are released together with all other fixes in each new release.

If you discover a security vulnerability in this project, please do not open a public issue.

Instead, report it privately by emailing us at digitalhub@fbk.eu. Include as much detail as possible to help us understand and address the issue quickly and responsibly.

Contributing

To report a bug or request a feature, please first check the existing issues to avoid duplicates. If none exist, open a new issue with a clear title and a detailed description, including any steps to reproduce if it's a bug.

To contribute code, start by forking the repository. Clone your fork locally and create a new branch for your changes. Make sure your commits follow the Conventional Commits v1.0 specification to keep history readable and consistent.

Once your changes are ready, push your branch to your fork and open a pull request against the main branch. Be sure to include a summary of what you changed and why. If your pull request addresses an issue, mention it in the description (e.g., “Closes #123”).

Please note that new contributors may be asked to sign a Contributor License Agreement (CLA) before their pull requests can be merged. This helps us ensure compliance with open source licensing standards.

We appreciate contributions and help in improving the project!

Authors

This project is developed and maintained by DSLab – Fondazione Bruno Kessler, with contributions from the open source community. A complete list of contributors is available in the project’s commit history and pull requests.

For questions or inquiries, please contact: digitalhub@fbk.eu

Copyright and license

This project is licensed under the Apache License, Version 2.0. You may not use this file except in compliance with the License. Ownership of contributions remains with the original authors and is governed by the terms of the Apache 2.0 License, including the requirement to grant a license to the project.

Version 0.1.1b1 (pre-release), released Mar 18, 2026
