
ibm-watsonx-ai-120b

A drop-in replacement for ibm-watsonx-ai that fixes all known issues with IBM's vLLM-hosted openai/gpt-oss-120b and openai/gpt-oss-20b models.

The Problem

IBM hosts OpenAI's gpt-oss models on WatsonX using vLLM, but the deployment has numerous bugs:

  • Tool calling doesn't work - the tool_calls array is always empty
  • JSON schema mode is ignored - the model returns free text instead of JSON
  • Thinking leaks into output - reasoning_content appears without any actual content
  • Streaming breaks with tools - tool calls appear in the wrong fields
  • Harmony tokens leak - special tokens like <|channel|> appear in the output

The Solution

Change one import and everything works:

# Before (broken)
from ibm_watsonx_ai.foundation_models import ModelInference

# After (fixed!)
from ibm_watsonx_ai_120b.foundation_models import ModelInference

# Your code stays exactly the same
model = ModelInference(
    model_id="openai/gpt-oss-120b",
    credentials=credentials,
    project_id=project_id
)

# Tool calling now works!
response = model.chat(
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }]
)

# JSON schema mode now works!
response = model.chat(
    messages=[{"role": "user", "content": "List 3 colors"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "colors",
            "schema": {
                "type": "object",
                "properties": {
                    "colors": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["colors"]
            }
        }
    }
)
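Once JSON schema mode is enforced, the model's reply arrives as a JSON string in the message content. As a minimal sketch, assuming the response is a dict shaped like an OpenAI-style chat completion (check your installed version's actual return type), you can decode it like this:

```python
import json

# Hypothetical response shaped like an OpenAI-style chat completion
# (an assumption for this sketch, not a guaranteed return type).
sample_response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": '{"colors": ["red", "green", "blue"]}',
        }
    }]
}

def parse_schema_output(response: dict) -> dict:
    """Extract and decode the JSON payload from a chat response."""
    content = response["choices"][0]["message"]["content"]
    return json.loads(content)

print(parse_schema_output(sample_response)["colors"])
# ['red', 'green', 'blue']
```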

Installation

pip install ibm-watsonx-ai-120b

When IBM Fixes Their vLLM

Just change your import back:

# Fixed by IBM - just use the original!
from ibm_watsonx_ai.foundation_models import ModelInference

Your code stays exactly the same because the package maintains full API compatibility.
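If you want the swap to happen automatically, one possible pattern is to prefer the patched module and fall back to the upstream one. This is a sketch, not part of the package; the demo stubs the module in sys.modules so the snippet runs even with neither library installed:

```python
import importlib
import sys
import types

def resolve_model_inference():
    """Prefer the patched package while the vLLM bugs persist; fall back
    to the upstream library once you uninstall the shim."""
    for name in ("ibm_watsonx_ai_120b.foundation_models",
                 "ibm_watsonx_ai.foundation_models"):
        try:
            return importlib.import_module(name).ModelInference
        except ImportError:
            continue
    raise ImportError("install ibm-watsonx-ai or ibm-watsonx-ai-120b")

# Demo with stub modules so the snippet runs without either package:
parent = types.ModuleType("ibm_watsonx_ai_120b")
child = types.ModuleType("ibm_watsonx_ai_120b.foundation_models")
child.ModelInference = type("ModelInference", (), {})
sys.modules["ibm_watsonx_ai_120b"] = parent
sys.modules["ibm_watsonx_ai_120b.foundation_models"] = child

ModelInference = resolve_model_inference()
print(ModelInference.__name__)  # ModelInference
```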

What Gets Fixed

| Feature            | Original Behavior             | With This Package         |
|--------------------|-------------------------------|---------------------------|
| Tool Calling       | tool_calls=[] always          | Works correctly           |
| JSON Schema        | Ignored, returns text         | Enforced and validated    |
| Thinking Responses | Empty content, only reasoning | Automatically handled     |
| Streaming + Tools  | Tools in wrong field          | Falls back to sync        |
| Harmony Tokens     | Leak into output              | Stripped automatically    |
| Null Content       | Crashes vLLM                  | Converted to empty string |
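To make "stripped automatically" concrete: Harmony special tokens look like <|channel|> or <|end|>, sometimes followed by a channel label such as "final". A minimal regex sketch of the idea (not the package's actual HarmonyAdapter, whose implementation is not shown here):

```python
import re

def strip_harmony_tokens(text: str) -> str:
    """Remove leaked Harmony markers and the channel labels that
    precede message text (illustrative only)."""
    text = re.sub(r"<\|channel\|>(analysis|final|commentary)", "", text)
    return re.sub(r"<\|[a-z_]+\|>", "", text)

leaked = "<|channel|>final<|message|>Tokyo is sunny today.<|end|>"
print(strip_harmony_tokens(leaked))  # Tokyo is sunny today.
```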

Configuration

from ibm_watsonx_ai_120b import Config

# Adjust retry behavior
Config.max_retries = 5

# Force non-streaming for tools (most reliable)
Config.streaming_tool_strategy = "fallback"

# Enable debug logging
Config.debug = True

Or via environment variables:

export WATSONX_120B_MAX_RETRIES=5
export WATSONX_120B_DISABLE_STREAMING=true
export WATSONX_120B_DEBUG=true
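One plausible way these environment variables could map onto the Config attributes, shown purely as an illustration (the package's real loader, and the defaults used below, are assumptions):

```python
import os

class Config:
    """Hypothetical env-var mapping; defaults here are guesses."""
    max_retries = int(os.environ.get("WATSONX_120B_MAX_RETRIES", "3"))
    streaming_tool_strategy = (
        "fallback"
        if os.environ.get("WATSONX_120B_DISABLE_STREAMING", "").lower() == "true"
        else "stream"
    )
    debug = os.environ.get("WATSONX_120B_DEBUG", "").lower() == "true"
```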

How It Works

The package wraps ibm-watsonx-ai and applies fixes through an adapter pipeline:

  1. MessageAdapter - Fixes null content and tool role issues
  2. ToolAdapter - Emulates tool calling via prompt injection
  3. JSONAdapter - Emulates JSON schema via prompt injection
  4. ThinkingAdapter - Handles reasoning-only responses
  5. HarmonyAdapter - Strips leaked special tokens
  6. StreamAdapter - Handles streaming quirks

Everything else passes through unchanged to the original library.
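The pipeline idea above can be rendered as a toy sketch: each adapter takes a response dict and returns a corrected one, and the pipeline folds them in order. The function bodies below are hypothetical stand-ins for the real adapters, not their actual implementations:

```python
import re

def fix_null_content(resp):  # MessageAdapter's job, simplified
    msg = resp["message"]
    if msg.get("content") is None:
        msg["content"] = ""
    return resp

def strip_harmony(resp):  # HarmonyAdapter's job, simplified
    msg = resp["message"]
    msg["content"] = re.sub(r"<\|[a-z_]+\|>", "", msg["content"])
    return resp

PIPELINE = [fix_null_content, strip_harmony]

def apply_pipeline(resp):
    for adapter in PIPELINE:
        resp = adapter(resp)
    return resp

broken = {"message": {"content": "Hello<|end|>"}}
print(apply_pipeline(broken)["message"]["content"])  # Hello
```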

Requirements

  • Python 3.9+
  • ibm-watsonx-ai >= 1.0.0

Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.

License

MIT License - See LICENSE for details.

Acknowledgments

This package was developed to centralize workarounds that were previously implemented ad hoc in individual projects.
