A Python library for inference-time scaling of LLMs

its-hub: A Python library for inference-time scaling

its_hub is a Python library for inference-time scaling of LLMs, focusing on mathematical reasoning tasks.

📚 Documentation

For comprehensive documentation, including installation guides, tutorials, and API reference, visit:

https://ai-innovation.team/its_hub

Installation

its_hub provides a minimal core focused on algorithms, with optional language model implementations.

Core Installation (Algorithms Only)

For gateway integration - just algorithms and interfaces, minimal dependencies:

pip install its_hub

This includes:

  • ✓ Self-Consistency and Best-of-N algorithms
  • ✓ Abstract base classes (AbstractLanguageModel, AbstractOutcomeRewardModel)
  • ✓ Only 2 dependencies: numpy, typing-extensions
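At its core, Self-Consistency samples several answers and returns the most common one. A minimal sketch of that idea in plain Python (illustrating the algorithm, not the its_hub API):

```python
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    """Return the most common answer among sampled completions."""
    counts = Counter(answers)
    winner, _ = counts.most_common(1)[0]
    return winner

# Five sampled answers to "What is 2+2?"; the majority wins.
samples = ["4", "4", "5", "4", "3"]
print(majority_vote(samples))  # 4
```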

With Language Model Support

For standalone use - includes OpenAI-compatible language model implementation:

pip install its_hub[lm]

Adds: OpenAICompatibleLanguageModel, LLMJudge, StepGeneration (requires openai, aiohttp, backoff)
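Step-wise generation of the kind StepGeneration supports decodes a long solution one step at a time, stopping at a delimiter and looping until the model signals completion. A toy illustration of that loop with a stub generator standing in for a real LM (not the its_hub API):

```python
def generate_stepwise(generate_step, max_steps: int = 8) -> list[str]:
    """Collect steps until the generator signals it is done."""
    steps = []
    for _ in range(max_steps):
        step, done = generate_step(steps)
        steps.append(step)
        if done:
            break
    return steps

# Stub "LM": emits three reasoning steps, then stops.
script = ["Let x = 2.", "Then x + x = 4.", "Answer: 4"]

def fake_step(history):
    step = script[len(history)]
    return step, step.startswith("Answer")

print(generate_stepwise(fake_step))
```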

With Experimental Algorithms

For experimental features - includes beam search and particle filtering:

pip install its_hub[experimental]

Adds: Process reward models, beam search, particle filtering algorithms
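Particle filtering for inference-time scaling keeps a population of partial solutions and resamples them in proportion to a reward score at each step. A toy version of that resampling step, with made-up scores (a sketch of the general technique, not the its_hub implementation):

```python
import random

def resample(particles: list[str], scores: list[float], k: int, seed: int = 0) -> list[str]:
    """Draw k particles with replacement, weighted by their scores."""
    rng = random.Random(seed)
    return rng.choices(particles, weights=scores, k=k)

particles = ["step A", "step B", "step C"]
scores = [0.1, 0.8, 0.1]  # hypothetical process-reward scores
survivors = resample(particles, scores, k=3)
print(survivors)  # mostly "step B", which carries most of the weight
```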

Development Installation

git clone https://github.com/Red-Hat-AI-Innovation-Team/its_hub.git
cd its_hub
pip install -e ".[dev]"
# or using uv:
uv sync --extra dev

Quick Start

Example 1: Gateway Integration (Core Installation)

Installation required: pip install its_hub (core only, minimal dependencies)

Gateway integration requires implementing two interfaces: AbstractLanguageModel for LM calls and AbstractOrchestrator for managing parallel execution with concurrency control and rate limiting.

import asyncio

from its_hub import AbstractLanguageModel, AbstractOrchestrator, SelfConsistency

# Step 1: Implement AbstractLanguageModel with your gateway's LM client
class MyGatewayLM(AbstractLanguageModel):
    def __init__(self, gateway_client):
        self.client = gateway_client

    async def agenerate_single(self, messages, stop=None, **kwargs):
        response = await self.client.generate(messages, stop=stop, **kwargs)
        return {"role": "assistant", "content": response}

# Step 2: Implement AbstractOrchestrator for concurrency control
# (or use the built-in LMOrchestrator from its_hub[lm])
class MyGatewayOrchestrator(AbstractOrchestrator):
    async def agenerate(self, lm, messages_lst, **kwargs):
        # Manage parallel calls with your gateway's rate limits
        ...

async def main():
    lm = MyGatewayLM(your_gateway_client)
    orchestrator = MyGatewayOrchestrator()
    algorithm = SelfConsistency(orchestrator=orchestrator)
    result = await algorithm.ainfer(lm, "What is 2+2?", budget=5)
    print(result)  # {"role": "assistant", "content": "4", ...}

asyncio.run(main())

The AbstractOrchestrator is the central coordination point — it controls how algorithms fan out parallel LM calls, enforces rate limits, and provides structured error handling. See Orchestration for details.
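One common way to implement the fan-out with a concurrency cap is an asyncio.Semaphore around each call. A self-contained sketch of that pattern with a stub LM (illustrative only; the built-in LMOrchestrator from its_hub[lm] may work differently):

```python
import asyncio

class SemaphoreOrchestrator:
    """Run many LM calls in parallel, with at most `limit` in flight."""
    def __init__(self, limit: int = 4):
        self._sem = asyncio.Semaphore(limit)

    async def agenerate(self, lm, messages_lst, **kwargs):
        async def one(messages):
            async with self._sem:  # block when `limit` calls are in flight
                return await lm.agenerate_single(messages, **kwargs)
        return await asyncio.gather(*(one(m) for m in messages_lst))

# Stub LM for demonstration.
class EchoLM:
    async def agenerate_single(self, messages, **kwargs):
        await asyncio.sleep(0)  # stand-in for network I/O
        return {"role": "assistant", "content": messages[-1]["content"]}

async def demo():
    orch = SemaphoreOrchestrator(limit=2)
    batch = [[{"role": "user", "content": f"q{i}"}] for i in range(5)]
    return await orch.agenerate(EchoLM(), batch)

print(asyncio.run(demo()))
```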

Example 2: Standalone Use with OpenAI-Compatible LM

Installation required: pip install its_hub[lm]

import asyncio

from its_hub import OpenAICompatibleLanguageModel, SelfConsistency

lm = OpenAICompatibleLanguageModel(
    endpoint="https://api.openai.com/v1",
    api_key="your-api-key",
    model_name="gpt-4o-mini",
)

algorithm = SelfConsistency()
result = algorithm.infer(lm, "What is the capital of France?", budget=3)
print(result)  # Most common answer from 3 generations

# Close lm for resource cleanup
asyncio.run(lm.close())

Example 3: Best-of-N with LLM Judge

Installation required: pip install its_hub[lm]

import asyncio

from its_hub import BestOfN, LLMJudge, OpenAICompatibleLanguageModel

lm = OpenAICompatibleLanguageModel(
    endpoint="https://api.openai.com/v1",
    api_key="your-api-key",
    model_name="gpt-4o-mini",
)

judge = LLMJudge(lm=lm, fallback_score=5.0)
algorithm = BestOfN(orm=judge)
result = algorithm.infer(lm, "Write a sorting function", budget=5)
print(result)  # Best response as judged by LLM

# Close lm for resource cleanup
asyncio.run(lm.close())
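Best-of-N reduces to: sample N candidates, score each with an outcome reward model (here the LLMJudge), and return the argmax. The selection step in plain Python, with hypothetical judge scores (not the its_hub API):

```python
def best_of_n(candidates: list[str], score) -> str:
    """Return the candidate with the highest reward score."""
    return max(candidates, key=score)

# Hypothetical judge scores keyed by response text.
scores = {"bubble sort": 4.0, "quicksort": 8.5, "bogosort": 1.0}
best = best_of_n(list(scores), scores.get)
print(best)  # quicksort
```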

Key Features

  • 🔬 Multiple Algorithms: Self-Consistency, Best-of-N, Beam Search (experimental), Particle Filtering (experimental)
  • 🚀 Gateway Integration: Clean abstractions (AbstractLanguageModel, AbstractOrchestrator) for easy integration with AI gateways
  • 🔄 Orchestration: AbstractOrchestrator provides structured concurrency, rate limiting, and error propagation for parallel LM calls — essential for production gateway deployments
  • 🧮 Math-Optimized: Built for mathematical reasoning tasks
  • ⚡ Async-First: ainfer() is the primary method and infer() is a sync wrapper; generations run concurrently, with concurrency limits and error handling
  • 🎯 Minimal Core: Only 2 dependencies (numpy, typing-extensions) for core install
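A sync wrapper like infer() over an async ainfer() is typically just asyncio.run around the coroutine. A minimal sketch of that pattern (illustrative, not the its_hub source):

```python
import asyncio

class Algorithm:
    async def ainfer(self, prompt: str) -> str:
        await asyncio.sleep(0)  # placeholder for concurrent LM calls
        return f"answer to: {prompt}"

    def infer(self, prompt: str) -> str:
        """Synchronous convenience wrapper around ainfer()."""
        return asyncio.run(self.ainfer(prompt))

print(Algorithm().infer("What is 2+2?"))  # answer to: What is 2+2?
```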

For detailed documentation, visit: https://ai-innovation.team/its_hub

Download files

Source Distribution

its_hub-1.0.0.tar.gz (121.9 kB)

Uploaded Source

Built Distribution

its_hub-1.0.0-py3-none-any.whl (48.1 kB)

Uploaded Python 3

File details

Details for the file its_hub-1.0.0.tar.gz.

File metadata

  • Download URL: its_hub-1.0.0.tar.gz
  • Upload date:
  • Size: 121.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for its_hub-1.0.0.tar.gz:

  • SHA256: 32a7c20cdf4853f6959a0c7b48be305dad8a534a55f52fb6e1194e3aa4f1dc3b
  • MD5: 4c161d4335a715e16ff42a2228b370ce
  • BLAKE2b-256: 4a3ba3ef1d7407ee1d324ad217f5249d22652ce8d0b454b83592704e094ca8ab

Provenance

The following attestation bundles were made for its_hub-1.0.0.tar.gz:

Publisher: release.yaml on Red-Hat-AI-Innovation-Team/its_hub

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file its_hub-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: its_hub-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 48.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for its_hub-1.0.0-py3-none-any.whl:

  • SHA256: 1e9e78049508fb07a15f0ebeebd57dbcf66bcf026ca21f283dbca42e021a72f4
  • MD5: ec08fdbe9b2b35e6188608fb3469a415
  • BLAKE2b-256: 82b1b93492dc4671339fb594dfa1641a01a8f0dc102523a40c98f751ff31fee7

Provenance

The following attestation bundles were made for its_hub-1.0.0-py3-none-any.whl:

Publisher: release.yaml on Red-Hat-AI-Innovation-Team/its_hub

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
