strands-sglang
SGLang model provider for Strands Agents SDK with Token-In/Token-Out (TITO) support for agentic RL training.
Features
- SGLang Native API: Uses SGLang's native /generate endpoint for efficient token-level generation
- TITO Support: Tracks complete token trajectories with logprobs for RL training, avoiding retokenization drift (see examples/retokenization_drift/)
- Tool Call Parsing: Customizable tool parsing aligned with model chat templates (Hermes/Qwen format)
- Iteration Limiting: Built-in hook to limit tool iterations with clean trajectory truncation
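To make the TITO idea concrete, here is a minimal sketch of a trajectory that records every segment as token IDs rather than re-encoding decoded text. The class and method names (TokenTrajectory, add_prompt, add_response) are hypothetical and are not the strands-sglang API. The convention that prompt and tool-result tokens get mask 0 and no logprob is also an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TokenTrajectory:
    """Hypothetical illustration of a TITO trajectory (not the library's class)."""

    token_ids: list = field(default_factory=list)
    loss_mask: list = field(default_factory=list)
    logprobs: list = field(default_factory=list)  # Optional[float] per token

    def add_prompt(self, ids):
        # Assumed convention: context tokens are not trained on (mask 0, no logprob).
        self.token_ids.extend(ids)
        self.loss_mask.extend([0] * len(ids))
        self.logprobs.extend([None] * len(ids))

    def add_response(self, ids, logprobs):
        # Model-generated tokens are trained on (mask 1) and keep their sampled logprobs.
        if len(ids) != len(logprobs):
            raise ValueError("each generated token needs exactly one logprob")
        self.token_ids.extend(ids)
        self.loss_mask.extend([1] * len(ids))
        self.logprobs.extend(logprobs)

    def __len__(self):
        return len(self.token_ids)


traj = TokenTrajectory()
traj.add_prompt([101, 102, 103])         # e.g. system + user turn
traj.add_response([7, 8], [-0.1, -0.5])  # first model turn
traj.add_prompt([201])                   # e.g. tool result fed back in
traj.add_response([9], [-0.2])           # second model turn
print(len(traj), sum(traj.loss_mask))    # prints: 7 3
```

Because the IDs are appended exactly as they were sent and received, the three lists stay aligned position by position, which is the property that retokenizing decoded text can break.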
Requirements
- Python 3.10+
- Strands Agents SDK 1.7.0+
- SGLang server running with your model
- HuggingFace tokenizer for the model
Installation
pip install strands-agents strands-sglang
Or install from source with development dependencies:
git clone https://github.com/horizon-rl/strands-sglang.git
cd strands-sglang
pip install -e ".[dev]"
Quick Start
1. Start SGLang Server
python -m sglang.launch_server \
    --model-path Qwen/Qwen3-4B-Thinking-2507 \
    --port 8000 \
    --host 0.0.0.0
Tip: There is no need to load SGLang's tool parser on the server, because this provider is meant for training and parses tool calls itself.
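For orientation, the sketch below builds the kind of JSON body a token-level request to the native /generate endpoint takes. The field names (input_ids, sampling_params, return_logprob) reflect SGLang's native API as commonly documented, but treat them as assumptions and check the SGLang documentation for your server version. The helper name build_generate_payload is hypothetical.

```python
import json


def build_generate_payload(input_ids, max_new_tokens=1024, temperature=0.7):
    """Build a request body for POST {base_url}/generate (hypothetical helper)."""
    return {
        "input_ids": list(input_ids),  # token-in: send IDs, not text
        "sampling_params": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
        },
        "return_logprob": True,        # ask for per-token logprobs
    }


payload = build_generate_payload([1, 2, 3], max_new_tokens=16)
print(json.dumps(payload, indent=2))
```

Posting this body to http://localhost:8000/generate with any HTTP client is a quick way to confirm the server is reachable; the exact response fields depend on the SGLang version.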
2. Basic Agent Usage
import asyncio

from transformers import AutoTokenizer
from strands import Agent
from strands_tools import calculator
from strands_sglang import SGLangModel


async def main():
    # Initialize model with tokenizer
    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Thinking-2507")
    model = SGLangModel(
        tokenizer=tokenizer,
        base_url="http://localhost:8000",
        model_id="Qwen/Qwen3-4B-Thinking-2507",
    )

    # Create agent with tools
    agent = Agent(
        model=model,
        tools=[calculator],
        system_prompt="You are a helpful math assistant. Use the calculator for all arithmetic.",
    )

    # Run episode
    model.reset()  # Reset TITO state for new episode
    result = await agent.invoke_async("What is 25 * 17?")
    print(result)

    # Access TITO data for RL training
    print(f"Trajectory: {len(model.token_manager)} tokens")
    print(f"Output tokens: {sum(model.token_manager.loss_mask)}")


asyncio.run(main())
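As a follow-up to the TITO printout, the sketch below shows one way to pull out only the model-generated tokens and their summed log-probability from the trajectory lists. It assumes token_ids, loss_mask, and logprobs are equal-length lists and that logprobs may hold None at prompt positions; both are assumptions. response_tokens_and_logprob is a hypothetical helper, not part of strands-sglang.

```python
def response_tokens_and_logprob(token_ids, loss_mask, logprobs):
    """Return (generated token IDs, summed logprob over generated tokens)."""
    if not (len(token_ids) == len(loss_mask) == len(logprobs)):
        raise ValueError("token_ids, loss_mask and logprobs must be aligned")
    generated = [t for t, m in zip(token_ids, loss_mask) if m]
    # Skip masked-out positions and any missing logprob entries.
    total = sum(lp for lp, m in zip(logprobs, loss_mask) if m and lp is not None)
    return generated, total


ids, total = response_tokens_and_logprob(
    token_ids=[101, 102, 7, 8],
    loss_mask=[0, 0, 1, 1],
    logprobs=[None, None, -0.25, -0.5],
)
print(ids, total)  # prints: [7, 8] -0.75
```

With a real episode, the same helper would be fed model.token_manager.token_ids, model.token_manager.loss_mask, and model.token_manager.logprobs.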
RL Training with Slime
For RL training with Slime, implement the rollout as an async generate function:
async def generate(args, sample: Sample, sampling_params) -> Sample:
    ...
    # The whole agent loop logic in a few lines
    url = f"http://{args.sglang_router_ip}:{args.sglang_router_port}/generate"
    model = SGLangModel(tokenizer=tokenizer, base_url=url)
    limiter = ToolIterationLimiter(max_iterations=5)  # Optional: cap the number of tool iterations
    agent = Agent(model=model, tools=[calculator], hooks=[limiter], system_prompt="...")
    try:
        await agent.invoke_async(sample.prompt)
        sample.status = Sample.Status.COMPLETED
    except Exception as e:
        # Use exception to determine TRUNCATED or ABORTED
        ...
    # Use model.token_manager to fill in sample's attributes
    sample.tokens = model.token_manager.token_ids
    sample.loss_mask = model.token_manager.loss_mask
    sample.rollout_log_probs = model.token_manager.logprobs
    ...
A concrete example will be added to Slime's repository later.
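The except branch above is left open. One possible shape for it is sketched below, under stated assumptions: the Status enum is a stand-in for Sample.Status, IterationLimitReached is a hypothetical exception name (the error actually raised by the iteration limiter is not shown in this README), and mapping the limit to TRUNCATED and everything else to ABORTED is an assumed policy.

```python
from enum import Enum


class Status(Enum):
    """Stand-in for Sample.Status; the real enum lives in Slime."""

    COMPLETED = "completed"
    TRUNCATED = "truncated"
    ABORTED = "aborted"


class IterationLimitReached(Exception):
    """Hypothetical error raised when the tool-iteration limit is hit."""


def status_from_exception(exc):
    """Map the outcome of an agent run to a rollout status."""
    if exc is None:
        return Status.COMPLETED
    if isinstance(exc, IterationLimitReached):
        # Assumed: hitting the limit still leaves a usable, shortened trajectory.
        return Status.TRUNCATED
    # Assumed: anything else (timeouts, server errors, ...) makes the sample unusable.
    return Status.ABORTED


print(status_from_exception(IterationLimitReached()))
```

Keeping the mapping in one small function makes the policy easy to adjust once the real exception types are known.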
Configuration
SGLangModel Options
model = SGLangModel(
    tokenizer=tokenizer,                      # Required: HuggingFace tokenizer
    base_url="http://localhost:8000",         # SGLang server URL
    model_id="Qwen/Qwen3-4B-Thinking-2507",   # Optional: model identifier
    tool_call_parser=HermesToolCallParser(),  # Tool call format parser
    params={                                  # Sampling parameters
        "max_new_tokens": 1024,
        "temperature": 0.7,
        "top_p": 0.9,
    },
    timeout=300.0,                            # Request timeout in seconds
    return_logprobs=True,                     # Return logprobs (default: True)
)
See SGLang's documentation for more sampling parameter options.
Testing
Unit Tests
pytest tests/unit/ -v
Integration Tests
Requires a running SGLang server:
# Start server first
python -m sglang.launch_server --model-path Qwen/Qwen3-4B-Thinking-2507 --port 8000
# Run tests
pytest tests/integration/ -v \
    --sglang-base-url=http://localhost:8000 \
    --sglang-model-id=Qwen/Qwen3-4B-Thinking-2507
Or configure via environment variables:
export SGLANG_BASE_URL=http://localhost:8000
export SGLANG_MODEL_ID=Qwen/Qwen3-4B-Thinking-2507
pytest tests/integration/ -v
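The two ways of configuring the integration tests suggest a precedence order. The sketch below assumes the command-line flag wins over the environment variable, which in turn wins over a built-in default. That order, the helper name resolve_setting, and the fallback URL are all assumptions for illustration, not the test suite's actual code.

```python
import os


def resolve_setting(cli_value, env_name, default):
    """Pick a test setting: CLI flag first, then environment, then default."""
    if cli_value is not None:
        return cli_value
    return os.environ.get(env_name, default)


os.environ["SGLANG_BASE_URL"] = "http://localhost:8000"
fallback = "http://127.0.0.1:30000"  # hypothetical default

from_env = resolve_setting(None, "SGLANG_BASE_URL", fallback)
from_cli = resolve_setting("http://gpu-box:9000", "SGLANG_BASE_URL", fallback)
print(from_env)
print(from_cli)
```

In a pytest suite, logic like this would typically sit behind a pytest_addoption hook in conftest.py and a fixture that exposes the resolved value.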
Contributing
pip install -e ".[dev]"
pre-commit install
After this, git commit automatically runs the linting and formatting checks.
License
Apache 2.0
Download files
File details
Details for the file strands_sglang-0.0.1.tar.gz.
File metadata
- Download URL: strands_sglang-0.0.1.tar.gz
- Upload date:
- Size: 45.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 32e4fd505375368319ae12ed9e5e792738ae503455da6d1cf1a615be28a82df7 |
| MD5 | e799ee537aeb93c593ae5443574a3ec7 |
| BLAKE2b-256 | 4124622314319d228f2f826193cf554b27e5b6bbc521cd57edd931536e8530ea |
Provenance
The following attestation bundles were made for strands_sglang-0.0.1.tar.gz:
Publisher: publish.yml on horizon-rl/strands-sglang
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: strands_sglang-0.0.1.tar.gz
  - Subject digest: 32e4fd505375368319ae12ed9e5e792738ae503455da6d1cf1a615be28a82df7
- Sigstore transparency entry: 789480067
- Sigstore integration time:
- Permalink: horizon-rl/strands-sglang@b9c06637e1c5596acd29d3c5059f6865dd33c961
- Branch / Tag: refs/tags/v0.0.1
- Owner: https://github.com/horizon-rl
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b9c06637e1c5596acd29d3c5059f6865dd33c961
- Trigger Event: release
File details
Details for the file strands_sglang-0.0.1-py3-none-any.whl.
File metadata
- Download URL: strands_sglang-0.0.1-py3-none-any.whl
- Upload date:
- Size: 20.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4afb471424afc0a918a319e02950760dca9eb26af7cae4136bd1638181578bcb |
| MD5 | 563c516cb844bb8deffd0206dffcfc2b |
| BLAKE2b-256 | 7bbc9d45cb65a860eef5ed348589884efa8d9faac5d6362d1483ab7a1ae20882 |
Provenance
The following attestation bundles were made for strands_sglang-0.0.1-py3-none-any.whl:
Publisher: publish.yml on horizon-rl/strands-sglang
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: strands_sglang-0.0.1-py3-none-any.whl
  - Subject digest: 4afb471424afc0a918a319e02950760dca9eb26af7cae4136bd1638181578bcb
- Sigstore transparency entry: 789480071
- Sigstore integration time:
- Permalink: horizon-rl/strands-sglang@b9c06637e1c5596acd29d3c5059f6865dd33c961
- Branch / Tag: refs/tags/v0.0.1
- Owner: https://github.com/horizon-rl
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b9c06637e1c5596acd29d3c5059f6865dd33c961
- Trigger Event: release