A Python library for inference-time scaling of LLMs

`its-hub`: A Python library for inference-time scaling

its_hub is a Python library for inference-time scaling of LLMs, focusing on mathematical reasoning tasks.

ITS Hub algorithms: Self-Consistency, Best-of-N, and Particle Filtering
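
As a rough illustration of the idea behind Self-Consistency, the sketch below samples several answers and keeps the most frequent one. It is plain Python rather than its_hub's implementation; `majority_vote` and the sample answers are hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer among the sampled generations."""
    if not answers:
        raise ValueError("need at least one answer")
    # most_common(1) yields a single (answer, count) pair for the top answer.
    return Counter(answers).most_common(1)[0][0]


# Hypothetical final answers extracted from five sampled generations.
samples = ["4", "4", "5", "4", "3"]
print(majority_vote(samples))
```

In practice the answers usually need normalizing (whitespace, number formatting) before counting, otherwise equivalent answers split the vote.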

📚 Documentation

For comprehensive documentation, including installation guides, tutorials, and API reference, visit:

https://ai-innovation.team/its_hub

Installation

its_hub provides a minimal core focused on algorithms, with optional language model implementations.

Core Installation (Algorithms Only)

For gateway integration: algorithms and interfaces only, with minimal dependencies.

pip install its_hub

This includes:

✓ Self-Consistency and Best-of-N algorithms
✓ Abstract base classes (AbstractLanguageModel, AbstractOutcomeRewardModel)
✓ Only 2 dependencies: numpy, typing-extensions

With Language Model Support

For standalone use: includes an OpenAI-compatible language model implementation. The quotes keep shells such as zsh from interpreting the square brackets.

pip install "its_hub[lm]"

Adds: OpenAICompatibleLanguageModel, LLMJudge, StepGeneration (requires openai, aiohttp, backoff)
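
To confirm that the packages required by the [lm] extra are present before importing the language model classes, a small check along these lines may help. It uses only the standard library; `missing_packages` and `LM_EXTRA_PACKAGES` are hypothetical names, and the package list is taken from the line above.

```python
from importlib.util import find_spec

# Packages listed above as requirements of the [lm] extra.
LM_EXTRA_PACKAGES = ("openai", "aiohttp", "backoff")


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(LM_EXTRA_PACKAGES)
if missing:
    print("Missing for its_hub[lm]:", ", ".join(missing))
else:
    print("All [lm] dependencies are importable.")
```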

With Experimental Algorithms

For experimental features: includes beam search and particle filtering. As above, quote the requirement if your shell treats square brackets specially.

pip install "its_hub[experimental]"

Adds: Process reward models, beam search, particle filtering algorithms

Development Installation

git clone https://github.com/Red-Hat-AI-Innovation-Team/its_hub.git
cd its_hub
pip install -e ".[dev]"
# or using uv:
uv sync --extra dev

Quick Start

Example 1: Gateway Integration (Core Installation)

Installation required: pip install its_hub (core only, minimal dependencies)

Gateway integration requires implementing two interfaces: AbstractLanguageModel for LM calls and AbstractOrchestrator for managing parallel execution with concurrency control and rate limiting.

import asyncio

from its_hub import AbstractLanguageModel, AbstractOrchestrator, SelfConsistency

# Step 1: Implement AbstractLanguageModel with your gateway's LM client
class MyGatewayLM(AbstractLanguageModel):
    def __init__(self, gateway_client):
        self.client = gateway_client

    async def agenerate_single(self, messages, stop=None, **kwargs):
        response = await self.client.generate(messages, stop=stop, **kwargs)
        return {"role": "assistant", "content": response}

# Step 2: Implement AbstractOrchestrator for concurrency control
# (or use the built-in LMOrchestrator from its_hub[lm])
class MyGatewayOrchestrator(AbstractOrchestrator):
    async def agenerate(self, lm, messages_lst, **kwargs):
        # Manage parallel calls with your gateway's rate limits
        ...

async def main():
    lm = MyGatewayLM(your_gateway_client)  # placeholder: your gateway's async client
    orchestrator = MyGatewayOrchestrator()
    algorithm = SelfConsistency(orchestrator=orchestrator)
    result = await algorithm.ainfer(lm, "What is 2+2?", budget=5)
    print(result)  # {"role": "assistant", "content": "4", ...}

asyncio.run(main())

The AbstractOrchestrator is the central coordination point — it controls how algorithms fan out parallel LM calls, enforces rate limits, and provides structured error handling. See Orchestration for details.
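
One way to fill in the `agenerate` body from Example 1 is to cap in-flight calls with a semaphore. The sketch below is an assumption about what such an orchestrator could look like, not the library's built-in one. It deliberately does not subclass AbstractOrchestrator so that it runs without its_hub installed, and whether the real base class requires other methods is not covered here. `SemaphoreOrchestrator`, `max_concurrency`, and `EchoLM` are hypothetical names.

```python
import asyncio


class SemaphoreOrchestrator:
    """Caps the number of concurrent LM calls (hypothetical helper)."""

    def __init__(self, max_concurrency=4):
        self.max_concurrency = max_concurrency

    async def agenerate(self, lm, messages_lst, **kwargs):
        # Create the semaphore inside the running event loop.
        semaphore = asyncio.Semaphore(self.max_concurrency)

        async def call_one(messages):
            async with semaphore:
                return await lm.agenerate_single(messages, **kwargs)

        # gather preserves input order and propagates the first exception.
        return await asyncio.gather(*(call_one(m) for m in messages_lst))


class EchoLM:
    """Stand-in LM for the demo; a real one would call your gateway."""

    async def agenerate_single(self, messages, stop=None, **kwargs):
        await asyncio.sleep(0)  # yield control, as a network call would
        return {"role": "assistant", "content": messages[-1]["content"]}


batch = [[{"role": "user", "content": f"prompt {i}"}] for i in range(3)]
orchestrator = SemaphoreOrchestrator(max_concurrency=2)
results = asyncio.run(orchestrator.agenerate(EchoLM(), batch))
print([r["content"] for r in results])
```

A semaphore limits concurrency but not request rate. A gateway with per-second quotas would also need a token bucket or similar limiter.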

Example 2: Standalone Use with OpenAI-Compatible LM

Installation required: pip install its_hub[lm]

import asyncio

from its_hub import OpenAICompatibleLanguageModel, SelfConsistency

lm = OpenAICompatibleLanguageModel(
    endpoint="https://api.openai.com/v1",
    api_key="your-api-key",
    model_name="gpt-4o-mini",
)

algorithm = SelfConsistency()
result = algorithm.infer(lm, "What is the capital of France?", budget=3)
print(result)  # Most common answer from 3 generations

# Close lm for resource cleanup
asyncio.run(lm.close())

Example 3: Best-of-N with LLM Judge

Installation required: pip install its_hub[lm]

import asyncio

from its_hub import BestOfN, LLMJudge, OpenAICompatibleLanguageModel

lm = OpenAICompatibleLanguageModel(
    endpoint="https://api.openai.com/v1",
    api_key="your-api-key",
    model_name="gpt-4o-mini",
)

judge = LLMJudge(lm=lm, fallback_score=5.0)
algorithm = BestOfN(orm=judge)
result = algorithm.infer(lm, "Write a sorting function", budget=5)
print(result)  # Best response as judged by LLM

# Close lm for resource cleanup
asyncio.run(lm.close())
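
The selection step behind Best-of-N can be sketched in plain Python: score each candidate and keep the highest. How LLMJudge uses `fallback_score` internally is not described here, so treating it as the score applied when judging a candidate fails is an assumption. `best_of_n` and `toy_score` are hypothetical names.

```python
def best_of_n(candidates, score_fn, fallback_score=5.0):
    """Return the candidate with the highest score.

    If score_fn raises for a candidate, fallback_score is used instead
    (an assumed interpretation of a fallback score).
    """
    if not candidates:
        raise ValueError("need at least one candidate")

    def safe_score(candidate):
        try:
            return float(score_fn(candidate))
        except Exception:
            return fallback_score

    # max returns the first candidate among equal top scores.
    return max(candidates, key=safe_score)


def toy_score(text):
    """Toy stand-in for a judge: prefers shorter answers, fails on empty ones."""
    if not text:
        raise ValueError("empty response")
    return 100.0 - len(text)


candidates = ["sorted(xs)", "xs.sort() then return xs", ""]
print(best_of_n(candidates, toy_score))
```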

Key Features

🔬 Multiple Algorithms: Self-Consistency, Best-of-N, Beam Search (experimental), Particle Filtering (experimental)
🚀 Gateway Integration: Clean abstractions (AbstractLanguageModel, AbstractOrchestrator) for easy integration with AI gateways
🔄 Orchestration: AbstractOrchestrator provides structured concurrency, rate limiting, and error propagation for parallel LM calls — essential for production gateway deployments
🧮 Math-Optimized: Built for mathematical reasoning tasks
⚡ Async-First: ainfer() is the primary method, and infer() is a synchronous wrapper around it. Generations run concurrently, with concurrency limits and error handling
🎯 Minimal Core: Only 2 dependencies (numpy, typing-extensions) for core install
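
The async-first pattern mentioned in the list above, with a synchronous method wrapping an asynchronous one, can be sketched as follows. This is a generic illustration and not its_hub's code. `ToyAlgorithm` is hypothetical, and its method signatures are simplified.

```python
import asyncio


class ToyAlgorithm:
    """Hypothetical algorithm exposing both an async and a sync entry point."""

    async def ainfer(self, prompt, budget=1):
        async def generate(i):
            await asyncio.sleep(0)  # stand-in for one LM call
            return f"{prompt} -> sample {i}"

        # Run `budget` generations concurrently, then pick one (here: the first).
        samples = await asyncio.gather(*(generate(i) for i in range(budget)))
        return samples[0]

    def infer(self, prompt, budget=1):
        # Sync wrapper: start an event loop and run ainfer to completion.
        # asyncio.run raises RuntimeError if a loop is already running in this
        # thread, so async callers should await ainfer directly.
        return asyncio.run(self.ainfer(prompt, budget=budget))


print(ToyAlgorithm().infer("2+2", budget=3))
```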

Coding Agent Plugin

its-hub is available as a plugin for two coding agents, Claude Code and Codex CLI, bringing inference-time scaling directly into your coding workflow.

Claude Code

Via org marketplace (recommended — includes all Red Hat AI plugins):

/plugin marketplace add Red-Hat-AI-Innovation-Team/plugins
/plugin install its-hub@Red-Hat-AI-Innovation-Team/plugins

Via this repo directly:

/plugin marketplace add Red-Hat-AI-Innovation-Team/its_hub
/plugin install its-hub@Red-Hat-AI-Innovation-Team/its_hub

From a local clone:

git clone https://github.com/Red-Hat-AI-Innovation-Team/its_hub.git
/plugin marketplace add /path/to/its_hub

Codex CLI

codex plugin marketplace add Red-Hat-AI-Innovation-Team/plugins

Then install the plugin from the marketplace. See .codex-plugin/INSTALL.md for manual installation.

After Installing

Invoke the setup-guide skill to configure your model endpoint and algorithm.

Skill	Description
`setup-guide`	Guided first-time configuration
`inference-scaling`	Run inference-time scaling on a single prompt
`batch-scaling`	Batch scaling from a JSONL/CSV/TXT file

For detailed documentation, visit: https://ai-innovation.team/its_hub

Release history

Version	Release date
1.1.0 (this version)	Jun 18, 2026
1.0.0	Apr 9, 2026
0.3.5	Nov 26, 2025
0.3.4	Nov 7, 2025
0.3.3	Oct 22, 2025
0.3.2	Oct 21, 2025
0.3.1	Oct 17, 2025
0.3.0	Oct 16, 2025
0.2.5	Oct 15, 2025
0.2.4	Oct 6, 2025
0.2.3a1 (pre-release)	Aug 18, 2025
0.2.2a1 (pre-release)	Jul 17, 2025
0.2.1a2 (pre-release)	Jul 1, 2025
0.2.1a1 (pre-release)	Jul 1, 2025
0.2.0a1 (pre-release)	Jun 13, 2025
0.1.3a3 (pre-release)	Jun 10, 2025
0.1.3a2 (pre-release)	May 16, 2025
0.1.3a1 (pre-release)	May 16, 2025
0.1.2a1 (pre-release)	May 15, 2025
0.1.1a1 (pre-release)	Apr 29, 2025
0.1.0a2 (pre-release)	Apr 29, 2025
0.1.0a1 (pre-release)	Apr 29, 2025

Download files

Source Distribution

its_hub-1.1.0.tar.gz (826.3 kB), uploaded Jun 18, 2026, Source

Built Distribution

its_hub-1.1.0-py3-none-any.whl (50.1 kB), uploaded Jun 18, 2026, Python 3

File details

Details for the file its_hub-1.1.0.tar.gz.

File metadata

Download URL: its_hub-1.1.0.tar.gz
Upload date: Jun 18, 2026
Size: 826.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for its_hub-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`11909f0d0b9a7559425f556dd176b5f63dfb8ff93fa146efba06a15756adb701`
MD5	`ff0bf72b670f2409eed302baa9052e10`
BLAKE2b-256	`ba664adfc1389cb0a92faa83c2fd7105741cf5424410ed1aafeeae3b72a4cb3c`
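
A downloaded archive can be checked against the SHA256 digest in the table above with the standard library. This is a minimal sketch. `sha256_of` and `matches_published_hash` are hypothetical helper names, and the file path in the usage note assumes the archive sits in the current directory.

```python
import hashlib

# SHA256 digest published above for its_hub-1.1.0.tar.gz.
EXPECTED_SHA256 = "11909f0d0b9a7559425f556dd176b5f63dfb8ff93fa146efba06a15756adb701"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex digest."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_published_hash("its_hub-1.1.0.tar.gz")` should return True for an intact download. Alternatively, pip's `--require-hashes` mode performs the same comparison at install time.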

Provenance

The following attestation bundles were made for its_hub-1.1.0.tar.gz:

Publisher: release.yaml on Red-Hat-AI-Innovation-Team/its_hub

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: its_hub-1.1.0.tar.gz
- Subject digest: 11909f0d0b9a7559425f556dd176b5f63dfb8ff93fa146efba06a15756adb701
- Sigstore transparency entry: 1862736016
- Sigstore integration time: Jun 18, 2026
Source repository:
- Permalink: Red-Hat-AI-Innovation-Team/its_hub@8f4f34686f377813a01b2ca70e4ee4657867d37b
- Branch / Tag: refs/tags/v1.1.0
- Owner: https://github.com/Red-Hat-AI-Innovation-Team
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@8f4f34686f377813a01b2ca70e4ee4657867d37b
- Trigger Event: release

File details

Details for the file its_hub-1.1.0-py3-none-any.whl.

File metadata

Download URL: its_hub-1.1.0-py3-none-any.whl
Upload date: Jun 18, 2026
Size: 50.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for its_hub-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d8d65e2309ab0e759b48438839f71006a8ded65db81cc69292a48fdc2c4d1a06`
MD5	`134a6d83622f30aa0c662089228676b0`
BLAKE2b-256	`763327c8ae95613ef7353c1e1dfb1f75a005effbefd51644f70489e853830bea`

Provenance

The following attestation bundles were made for its_hub-1.1.0-py3-none-any.whl:

Publisher: release.yaml on Red-Hat-AI-Innovation-Team/its_hub

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: its_hub-1.1.0-py3-none-any.whl
- Subject digest: d8d65e2309ab0e759b48438839f71006a8ded65db81cc69292a48fdc2c4d1a06
- Sigstore transparency entry: 1862736274
- Sigstore integration time: Jun 18, 2026
Source repository:
- Permalink: Red-Hat-AI-Innovation-Team/its_hub@8f4f34686f377813a01b2ca70e4ee4657867d37b
- Branch / Tag: refs/tags/v1.1.0
- Owner: https://github.com/Red-Hat-AI-Innovation-Team
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@8f4f34686f377813a01b2ca70e4ee4657867d37b
- Trigger Event: release
