Tool-call parsing and normalization for LLM dialects
Tooletta
Tool-call normalization for LLM dialects.
Tooletta is a small translation layer between model-specific tool-call formats and a canonical ToolCall schema. It is designed to fit the annoying part of model training, distillation, evals, and agent infrastructure: OpenAI, Qwen/Hermes, Mistral, Kimi, DeepSeek, and other model families do not agree on the text they emit for the same semantic tool call.
The center of gravity is Tinker-style training workflows. Tooletta is standalone, but its API is meant to stay easy to slot next to Tinker-compatible renderer, dataset-prep, and distillation code.
teacher trace / model output -> dialect parser -> ToolCall IR -> target renderer -> student format
Why
Tool calls are semantically simple and syntactically annoying. A model may emit <tool_call>{...}</tool_call>, another may use [TOOL_CALLS]..., another may expose OpenAI-compatible JSON objects, and training code still needs one reliable representation.
Tooletta's bet is simple: parse supported dialects at the boundary, keep a tiny canonical IR in the middle, and render only when you need model-specific text again.
That shape is directly inspired by Tinker's cookbook renderers: parse model-specific responses into canonical messages/tool calls, then render the canonical structure into the target model's format. Tooletta starts smaller on purpose: string-level tool-call parsing and rendering first, with tokenizer-aware training helpers considered only after the IR and dialect contract are solid.
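The parse, IR, render shape described above can be sketched in a few lines. This is a minimal illustration and not Tooletta's implementation: `SketchToolCall`, `parse_hermes_sketch`, and `render_kimi_sketch` are hypothetical names, and the sketch assumes well-formed JSON inside each `<tool_call>` block.

```python
import json
import re
from dataclasses import dataclass, field
from typing import Any


# Hypothetical stand-in for a canonical tool-call record; not Tooletta's ToolCall.
@dataclass(frozen=True)
class SketchToolCall:
    name: str
    arguments: dict[str, Any] = field(default_factory=dict)


# Non-greedy match so several blocks in one string are captured separately.
_HERMES_BLOCK = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_hermes_sketch(text: str) -> list[SketchToolCall]:
    """Turn every <tool_call>{...}</tool_call> block into a canonical record."""
    calls = []
    for payload in _HERMES_BLOCK.findall(text):
        obj = json.loads(payload)  # assumption: payload is valid JSON
        calls.append(SketchToolCall(obj["name"], obj.get("arguments", {})))
    return calls


def render_kimi_sketch(calls: list[SketchToolCall]) -> str:
    """Render records as simplified '## Calling: name' blocks with compact JSON."""
    return "\n".join(
        f"## Calling: {c.name}\n{json.dumps(c.arguments, separators=(',', ':'))}"
        for c in calls
    )


text = '<tool_call>{"name":"search","arguments":{"query":"tool calling"}}</tool_call>'
calls = parse_hermes_sketch(text)
rendered = render_kimi_sketch(calls)
```

Keeping the record in the middle free of any dialect detail is what lets each parser and renderer be written and tested independently.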
Tinker Compatibility
Tinker compatibility is a first-class design target. Tooletta should make it boring to normalize tool calls before they enter a Tinker-style renderer or dataset builder, and to translate tool calls from one model-family dialect into the target format a student model expects.
The package does not depend on Tinker today and does not claim full Tinker renderer parity yet. The near-term goal is to keep the canonical ToolCall contract and parse/render APIs aligned with that style of workflow while the richer training helpers earn their tests.
The current boundary is intentionally boring: ToolCall.to_openai() returns the OpenAI/Tinker-style function-call object, and ToolCall.from_openai(...) accepts both plain mappings and object-style calls with .function.name / .function.arguments.
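For readers unfamiliar with the OpenAI-style function-call object, the sketch below shows the general shape such a boundary converts to and from. It is an illustration under assumptions rather than Tooletta's code: `to_openai_sketch` and `from_openai_sketch` are hypothetical helpers, and which optional fields (such as `id`) Tooletta emits is not stated here. The one detail taken from OpenAI's chat format is that `function.arguments` is a JSON-encoded string rather than a nested object.

```python
import json
from collections.abc import Mapping
from types import SimpleNamespace
from typing import Any


def to_openai_sketch(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build an OpenAI-style function tool-call mapping (hypothetical helper)."""
    return {
        "type": "function",
        "function": {"name": name, "arguments": json.dumps(arguments)},
    }


def from_openai_sketch(call: Any) -> tuple[str, dict[str, Any]]:
    """Accept either a plain mapping or an object exposing .function.name/.arguments."""
    fn = call["function"] if isinstance(call, Mapping) else call.function
    if isinstance(fn, Mapping):
        name, raw = fn["name"], fn["arguments"]
    else:
        name, raw = fn.name, fn.arguments
    # Arguments usually arrive as a JSON string; tolerate an already-decoded mapping.
    arguments = json.loads(raw) if isinstance(raw, str) else dict(raw)
    return name, arguments


as_mapping = to_openai_sketch("search", {"query": "python"})
as_object = SimpleNamespace(
    function=SimpleNamespace(name="search", arguments='{"query": "python"}')
)
```

Both inputs decode to the same `(name, arguments)` pair, which is the property a boundary like this needs to guarantee.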
Scope
Tooletta currently handles:
- parsing supported tool-call dialects into ToolCall objects
- rendering ToolCall objects into supported target dialects
- best-effort dialect auto-detection
- custom dialect registration
- a stdin/stdout CLI for normalization and format conversion
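To make "custom dialect registration" concrete, here is one way a name-keyed registry of parse/render pairs could look. Everything in it is hypothetical: `DialectSketch`, `register_dialect_sketch`, `get_dialect_sketch`, and the toy `arrow` dialect are invented for illustration and do not describe Tooletta's actual registration API.

```python
import json
from typing import Callable, NamedTuple


class DialectSketch(NamedTuple):
    """A parse/render pair; calls are plain {"name", "arguments"} dicts here."""
    parse: Callable[[str], list[dict]]
    render: Callable[[list[dict]], str]


_REGISTRY: dict[str, DialectSketch] = {}


def register_dialect_sketch(
    name: str, dialect: DialectSketch, aliases: tuple[str, ...] = ()
) -> None:
    """Register a dialect under its name and aliases, refusing collisions."""
    keys = (name, *aliases)
    for key in keys:
        if key in _REGISTRY:
            raise ValueError(f"dialect name already registered: {key}")
    for key in keys:
        _REGISTRY[key] = dialect


def get_dialect_sketch(name: str) -> DialectSketch:
    return _REGISTRY[name]


# Toy dialect: one call per line, written as `name => {json arguments}`.
def parse_arrow(text: str) -> list[dict]:
    calls = []
    for line in text.splitlines():
        if "=>" in line:
            name, _, payload = line.partition("=>")
            calls.append({"name": name.strip(), "arguments": json.loads(payload)})
    return calls


def render_arrow(calls: list[dict]) -> str:
    return "\n".join(f"{c['name']} => {json.dumps(c['arguments'])}" for c in calls)


register_dialect_sketch("arrow", DialectSketch(parse_arrow, render_arrow), aliases=("arr",))
round_tripped = get_dialect_sketch("arr").parse('search => {"query": "python"}')
```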
Tooletta does not yet handle:
- tokenizer-specific token boundaries
- loss-mask generation
- full chat-template rendering
- streaming parser state
- full training or distillation pipeline orchestration
Install
After the first release:
uv add tooletta
For local development:
uv sync
Quick Start
from tooletta import parse_tool_calls, render_tool_calls
text = '<tool_call>{"name":"search","arguments":{"query":"tool calling"}}</tool_call>'
calls = parse_tool_calls(text, dialect="hermes")
print(calls[0].name)
print(calls[0].arguments)
print(render_tool_calls(calls, dialect="kimi"))
## Calling: search
{"query":"tool calling"}
CLI
Normalize a Qwen/Hermes-style tool call to canonical JSON:
printf '<tool_call>{"name":"search","arguments":{"query":"python"}}</tool_call>' \
| uv run tooletta --from hermes
Render to another dialect:
printf '<tool_call>{"name":"search","arguments":{"query":"python"}}</tool_call>' \
| uv run tooletta --from hermes --to kimi
Built-In Dialects
| Dialect | Aliases | Accepted shape | Rendered shape |
|---|---|---|---|
| `canonical` | `json`, `tooletta` | `{"name": "...", "arguments": {...}}` or a list of those objects | compact JSON list of canonical tool-call objects |
| `openai` | `oai` | OpenAI/Tinker-style `tool_calls` and legacy `function_call`, including object-style `.function.name` / `.function.arguments` calls | compact JSON list of OpenAI-compatible function tool-call objects |
| `hermes` | `qwen`, `nous`, `nous-hermes` | `<tool_call>{...}</tool_call>` blocks | `<tool_call>` blocks with canonical `{name, arguments}` payloads |
| `mistral` | none | `[TOOL_CALLS]` followed by an embedded JSON object or list payload | `[TOOL_CALLS]` plus a compact JSON list |
| `deepseek` | `deepseek-v3` | DeepSeek-style `<\|tool▁calls▁begin\|>...<\|tool▁calls▁end\|>` blocks | the same DeepSeek tool-call block style |
| `kimi` | `moonshot` | simplified `## Calling: name` blocks followed by JSON arguments | `## Calling: name` blocks |
| `kimi-k2` | `kimi_k2`, `moonshot-k2` | Kimi K2 `<\|tool_calls_section_begin\|` … | … |
parse_tool_calls(..., dialect="auto") uses a stable built-in dialect order. Custom dialects are supported for explicit parsing/rendering by name; keep auto-detection for unambiguous built-ins.
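One way to read "stable built-in dialect order" is first-match-wins over a fixed list of detectors. The sketch below illustrates that idea only: the detector functions, the order shown, and `detect_dialect_sketch` are hypothetical and are not taken from Tooletta's source.

```python
from typing import Callable, Optional


# Hypothetical detectors: each returns True when the text looks like its dialect.
def looks_like_hermes(text: str) -> bool:
    return "<tool_call>" in text and "</tool_call>" in text


def looks_like_mistral(text: str) -> bool:
    return "[TOOL_CALLS]" in text


def looks_like_kimi(text: str) -> bool:
    return text.lstrip().startswith("## Calling:")


# A list keeps the order explicit, so the same input always resolves the same way.
DETECTION_ORDER: list[tuple[str, Callable[[str], bool]]] = [
    ("hermes", looks_like_hermes),
    ("mistral", looks_like_mistral),
    ("kimi", looks_like_kimi),
]


def detect_dialect_sketch(text: str) -> Optional[str]:
    """Return the first dialect whose detector matches, or None if none do."""
    for name, matches in DETECTION_ORDER:
        if matches(text):
            return name
    return None
```

With this design, a custom dialect whose markers overlap a built-in would make the result depend on list position, which is a plausible reason to select custom dialects explicitly by name.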
Development
uv run ruff check .
uv run ruff format --check .
uv run mypy src tests
uv run pytest
uv run pre-commit run --all-files
Run the live Tinker smoke test only when you intentionally want to make a real API call:
TOOLETTA_RUN_LIVE_TINKER=1 \
uv run --env-file .env --with 'tinker>=0.9.0' pytest tests/test_live_tinker_smoke.py -q
By default it uses meta-llama/Llama-3.2-1B, rank 1, and a single training datum built from Tooletta-rendered tool-call text.
Status
Pre-alpha. The first goal is a fast, dependency-free runtime core that nails the IR, parser behavior, renderer behavior, and dialect plugin surface before growing a larger model-format zoo or training-specific APIs.