
An integration package connecting Qwen 3, QwQ and LangChain

langchain-qwq

This package provides seamless integration between LangChain and QwQ models as well as other Qwen series models from Alibaba Cloud BaiLian (via OpenAI-compatible API).

Features

  • QwQ Model Integration: Full support for QwQ models with advanced reasoning capabilities
  • Qwen3 Model Integration: Comprehensive support for Qwen3 series models with hybrid reasoning modes
  • Other Qwen Models: Compatibility with Qwen-Max, Qwen2.5, and other Qwen series models
  • Vision Models: Native support for Qwen-VL series vision models
  • Streaming Support: Synchronous and asynchronous streaming capabilities
  • Tool Calling: Function calling with support for parallel execution
  • Structured Output: JSON mode and function calling for structured response generation
  • Reasoning Access: Direct access to internal model reasoning and thinking content

Installation

To install the package:

pip install -U langchain-qwq

If you want to install additional dependencies for development:

pip install -U langchain-qwq[test]
pip install -U langchain-qwq[codespell]
pip install -U langchain-qwq[lint]
pip install -U langchain-qwq[typing]

Note: The documentation notebooks in docs/ can be viewed directly on GitHub or in VS Code without additional dependencies. To run them interactively, install Jupyter separately: pip install jupyterlab

Environment Variables

Authentication and configuration are managed through the following environment variables:

  • DASHSCOPE_API_KEY: Your DashScope API key (required)
  • DASHSCOPE_API_BASE: Optional API base URL (defaults to "https://dashscope-intl.aliyuncs.com/compatible-mode/v1")

Note: Users in mainland China should set DASHSCOPE_API_BASE to the domestic endpoint, since langchain-qwq defaults to the international Alibaba Cloud endpoint.
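A minimal way to set these variables from Python before constructing a model (the API key is a placeholder, and the domestic URL shown is the commonly documented mainland-China DashScope compatible-mode endpoint):

```python
import os

# Required: your DashScope API key (placeholder value shown).
os.environ["DASHSCOPE_API_KEY"] = "sk-your-api-key"

# Optional: mainland-China endpoint; omit this line to use the
# international default listed above.
os.environ["DASHSCOPE_API_BASE"] = "https://dashscope.aliyuncs.com/compatible-mode/v1"
```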

ChatQwQ

The ChatQwQ class provides access to QwQ chat models with built-in reasoning capabilities.

Basic Usage

from langchain_qwq import ChatQwQ

model = ChatQwQ(model="qwq-plus")
response = model.invoke("Hello, how are you?")
print(response.content)

Accessing Reasoning Content

You can access the internal reasoning content of QwQ models via additional_kwargs:

response = model.invoke("Hello")
content = response.content
reasoning = response.additional_kwargs.get("reasoning_content", "")
print(f"Response: {content}")
print(f"Reasoning: {reasoning}")

Streaming

Sync Streaming

model = ChatQwQ(model="qwq-plus")

is_first = True
is_end = True

for msg in model.stream("Hello"):
    if hasattr(msg, 'additional_kwargs') and "reasoning_content" in msg.additional_kwargs:
        if is_first:
            print("Starting to think...")
            is_first = False
        print(msg.additional_kwargs["reasoning_content"], end="", flush=True)
    elif hasattr(msg, 'content') and msg.content:
        if is_end:
            print("\nThinking ended")
            is_end = False
        print(msg.content, end="", flush=True)

Async Streaming

is_first = True
is_end = True

async for msg in model.astream("Hello"):
    if hasattr(msg, 'additional_kwargs') and "reasoning_content" in msg.additional_kwargs:
        if is_first:
            print("Starting to think...")
            is_first = False
        print(msg.additional_kwargs["reasoning_content"], end="", flush=True)
    elif hasattr(msg, 'content') and msg.content:
        if is_end:
            print("\nThinking ended")
            is_end = False
        print(msg.content, end="", flush=True)

Using Content Blocks

ChatQwQ also supports LangChain v1 content_blocks. For example:

from langchain_qwq import ChatQwQ

model = ChatQwQ(model="qwq-plus")
print(model.invoke("Hello").content_blocks)

Tool Calling

from langchain_core.tools import tool

@tool
def get_weather(city: str) -> str:
    """Get the weather for a city"""
    return f"The weather in {city} is sunny."

bound_model = model.bind_tools([get_weather])
response = bound_model.invoke("What's the weather in New York?")
print(response.tool_calls)

Structured Output

JSON Mode

from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

struct_model = model.with_structured_output(User, method="json_mode")
response = struct_model.invoke("Hello, I'm John and I'm 25 years old")
print(response)  # User(name='John', age=25)

Function Calling Mode

struct_model = model.with_structured_output(User, method="function_calling")
response = struct_model.invoke("My name is Alice and I'm 30")
print(response)  # User(name='Alice', age=30)

Integration with LangChain Agents

from langchain.agents import create_agent
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langchain_qwq import ChatQwQ


@tool
def get_weather(city: str) -> str:
    """Get the weather in a city."""
    return f"The weather in {city} is sunny."


model = ChatQwQ(model="qwq-plus")
agent = create_agent(model, tools=[get_weather])
print(agent.invoke({"messages": [HumanMessage("What is the weather like in New York?")]}))

QvQ Model Example

from langchain_core.messages import HumanMessage

messages = [
    HumanMessage(
        content_blocks=[
            {
                "type": "image",
                "url": "https://www.example.com/image.jpg",
            },
            {"type": "text", "text": "describe the image"},
        ]
    )
]


model = ChatQwQ(model="qvq-plus")
print(model.invoke(messages))

ChatQwen

The ChatQwen class offers enhanced support for Qwen3 and other Qwen series models, including specialized parameters for Qwen3's thinking mode.

Basic Usage

from langchain_qwq import ChatQwen

# Qwen3 model
model = ChatQwen(model="qwen3-235b-a22b-instruct-2507")
response = model.invoke("Hello")
print(response.content)

model = ChatQwen(model="qwen3-235b-a22b-thinking-2507")
response = model.invoke("Hello")
# Access reasoning content (Qwen3 only)
reasoning = response.additional_kwargs.get("reasoning_content", "")
print(f"Reasoning: {reasoning}")

Thinking Control

Note: Thinking control applies only to Qwen3 models, and excludes the latest releases (including but not limited to Qwen3-235b-a22b-thinking-2507, Qwen3-235b-a22b-instruct-2507, Qwen3-Coder-480B-a35b-instruct, and Qwen3-Coder-plus), which ship with a fixed mode and cannot be switched.

Disable Thinking Mode

# Disable thinking for open-source Qwen3 models
model = ChatQwen(model="qwen3-32b", enable_thinking=False)
response = model.invoke("Hello")
print(response.content)  # No reasoning content

Enable Thinking for Proprietary Models

# Enable thinking for proprietary models
model = ChatQwen(model="qwen-plus-latest", enable_thinking=True)
response = model.invoke("Hello")
reasoning = response.additional_kwargs.get("reasoning_content", "")
print(f"Reasoning: {reasoning}")

Control Thinking Length

# Set thinking budget (max thinking tokens)
model = ChatQwen(model="qwen3-32b", thinking_budget=20)
response = model.invoke("Hello")
reasoning = response.additional_kwargs.get("reasoning_content", "")
print(f"Limited reasoning: {reasoning}")

Other Qwen Models

Qwen-Max

model = ChatQwen(model="qwen-max-latest")
print(model.invoke("Hello").content)

# Tool calling
bound_model = model.bind_tools([get_weather])
response = bound_model.invoke("Weather in Shanghai and Beijing?", parallel_tool_calls=True)
print(response.tool_calls)

# Structured output
struct_model = model.with_structured_output(User, method="json_mode")
result = struct_model.invoke("I'm Bob, 28 years old")
print(result)

Qwen2.5-72B

model = ChatQwen(model="qwen2.5-72b-instruct")
print(model.invoke("Hello").content)

# All features work the same as other models
bound_model = model.bind_tools([get_weather])
struct_model = model.with_structured_output(User, method="json_mode")

Using Content Blocks

ChatQwen also supports LangChain v1 content_blocks. For example:

from langchain_qwq import ChatQwen

model = ChatQwen(model="qwen-plus-latest", enable_thinking=True)
print(model.invoke("Hello").content_blocks)

Integration with LangChain Agents

from langchain.agents import create_agent
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langchain_qwq import ChatQwen


@tool
def get_weather(city: str) -> str:
    """Get the weather in a city."""
    return f"The weather in {city} is sunny."


model = ChatQwen(model="qwen-plus-latest")
agent = create_agent(model, tools=[get_weather])
print(agent.invoke({"messages": [HumanMessage("What is the weather like in New York?")]}))

Vision Models

from langchain_core.messages import HumanMessage
from langchain_qwq import ChatQwen


messages = [
    HumanMessage(
        content_blocks=[
            {
                "type": "image",
                "url": "https://www.example.com/image.jpg",
            },
            {"type": "text", "text": "Describe the image content"},
        ]
    )
]


model = ChatQwen(model="qwen3-vl-plus")
print(model.invoke(messages))

Middleware

This library provides a DashScopeContextCacheMiddleware middleware that enables DashScope context caching for agent runs.

Example usage

from langchain.agents import create_agent
from langchain.tools import tool

from langchain_qwq import ChatQwen
from langchain_qwq.middleware import DashScopeContextCacheMiddleware


@tool
def get_weather(city: str) -> str:
    """Get the current weather in a given city."""
    return f"The weather in {city} is 20 degrees Celsius."


model = ChatQwen(model="qwen3-max")

agent = create_agent(
    model, tools=[get_weather], middleware=[DashScopeContextCacheMiddleware()]
)

Model Comparison

Feature            ChatQwQ      ChatQwen
QwQ Models         ✅ Primary    —
QvQ Models         ✅ Primary    —
Qwen3 Models       ✅ Basic      ✅ Enhanced
Other Qwen Models  —            ✅ Full Support
Vision Models      —            ✅ Supported
Thinking Control   —            ✅ (Qwen3 only)
Thinking Budget    —            ✅ (Qwen3 only)

Usage Guidance

  • Use ChatQwQ for QwQ and QvQ models.
  • For Qwen3 series models (available only on the Alibaba Cloud BaiLian platform) with deep thinking mode enabled, all invocations automatically use streaming.
  • For other Qwen series models (including self-deployed or third-party deployed Qwen3 models), use ChatQwen, and streaming will not be automatically enabled.
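For a self-deployed Qwen3 model served behind an OpenAI-compatible endpoint (for example via vLLM), the same environment variables can redirect ChatQwen; the URL and key below are illustrative placeholders, and the model call is commented out because it requires a running server:

```python
import os

# Point the client at a self-hosted OpenAI-compatible endpoint
# (URL and key are placeholders for your own deployment).
os.environ["DASHSCOPE_API_BASE"] = "http://localhost:8000/v1"
os.environ["DASHSCOPE_API_KEY"] = "EMPTY"

# from langchain_qwq import ChatQwen
# model = ChatQwen(model="qwen3-32b")  # streaming is not auto-enabled here
# print(model.invoke("Hello").content)
```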
