A helper library for LLM/VLM chat templates.

🧩 Chat Bricks

Compose chat templates from typed bricks. Train with labels and action_mask you can trust.

Chat Bricks is a chat-template toolkit for LLM/VLM training and inference, built on two ideas:

  1. A template is a composition of small, typed parts — system/user/assistant blocks, section templates ({tools}, {skills}), policies, formatters, content processors, joiners. Swap any of them without rewriting Jinja.
  2. A template should be verifiable — rendering is checked byte-for-byte against the model's official apply_chat_template output, and chat.tokenize(...) returns per-token labels and action_mask ready to drop into an SFT or RL loss.

A quick taste

Define a template by composing bricks:

from chat_bricks import (
    Chat, Template, ToolPolicy, ToolPlacement, JsonIndentedFormatter,
)

template = Template(
    name="my-agent",
    system_template="<|im_start|>system\n{system_message}{tools}<|im_end|>\n",
    system_message="You are a careful agent.",
    tools_template="\n\n# Tools\n{tools}",
    user_template="<|im_start|>user\n{content}<|im_end|>\n",
    assistant_template="<|im_start|>assistant\n{content}<|im_end|>\n",
    tool_policy=ToolPolicy(
        placement=ToolPlacement.SYSTEM,
        formatter=JsonIndentedFormatter(indent=2, joiner="\n\n"),
    ),
    stop_words=["<|im_end|>"],
)

tools = [{"type": "function", "function": {
    "name": "multiply",
    "description": "Multiply two numbers",
    "parameters": {
        "type": "object",
        "properties": {"x": {"type": "number"}, "y": {"type": "number"}},
        "required": ["x", "y"],
    },
}}]

chat = Chat(template=template,
            messages=[{"role": "user", "content": "What's 3 times 5?"}],
            tools=tools)
print(chat.prompt())

Renders:

<|im_start|>system
You are a careful agent.

# Tools
{
  "type": "function",
  "function": {
    "name": "multiply",
    "description": "Multiply two numbers",
    "parameters": {
      "type": "object",
      "properties": { "x": {"type": "number"}, "y": {"type": "number"} },
      "required": ["x", "y"]
    }
  }
}<|im_end|>
<|im_start|>user
What's 3 times 5?<|im_end|>

Every visible piece of that output — section ordering, the tool-block wrapper, the JSON indent, the role markers — came from a brick you can substitute. Want minified tools instead? Swap the formatter. Want tools after the user turn? Change the placement. Want a different role layout? Change system_template / user_template / assistant_template. Nothing rewrites the template engine.
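To make the "swap the formatter" idea concrete without depending on the library, here is a minimal sketch in plain Python. The names `indented_formatter`, `minified_formatter`, and `render_system` are hypothetical stand-ins for formatter bricks, not the chat_bricks API; the point is only that the template stays fixed while the formatter varies.

```python
import json
from typing import Callable

# Hypothetical stand-ins for formatter bricks; NOT the chat_bricks API.
Formatter = Callable[[list], str]

def indented_formatter(tools: list) -> str:
    """Render each tool as indented JSON, separated by blank lines."""
    return "\n\n".join(json.dumps(t, indent=2) for t in tools)

def minified_formatter(tools: list) -> str:
    """Render each tool as compact single-line JSON."""
    return "\n".join(json.dumps(t, separators=(",", ":")) for t in tools)

def render_system(system_message: str, tools: list, formatter: Formatter) -> str:
    """Fill a fixed system template; only the formatter changes the output."""
    tools_block = "\n\n# Tools\n" + formatter(tools) if tools else ""
    return f"<|im_start|>system\n{system_message}{tools_block}<|im_end|>\n"

tools = [{"type": "function", "function": {"name": "multiply"}}]
pretty = render_system("You are a careful agent.", tools, indented_formatter)
compact = render_system("You are a careful agent.", tools, minified_formatter)
```

Both renders carry the same tool definition; only the serialization differs, which is the property a pluggable formatter is meant to give you.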

Two ways to define a template

Compose your own — typed bricks, as above. Bring your conventions, mix and match.

Or use any HuggingFace model directly:

from chat_bricks import Chat

chat = Chat(template="Qwen/Qwen2.5-3B-Instruct", messages=[...])
# Falls back to the model's tokenizer.chat_template; masking is reconstructed
# from incremental renders so you still get correct labels + action_mask.

Both paths share the same Chat API, the same tokenizer integration, and the same correctness guarantees.

Verified rendering + ready-to-train tensors

from transformers import AutoTokenizer
from chat_bricks import Chat

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
chat = Chat(template="Qwen/Qwen2.5-3B-Instruct", messages=[
    {"role": "user", "content": "What's 3 times 5?"},
    {"role": "assistant", "content": "15."},
    {"role": "user", "content": "Now plus 2?"},
    {"role": "assistant", "content": "17."},
])

inputs = chat.tokenize(tok)
# inputs["input_ids"]      : token IDs
# inputs["labels"]         : -100 everywhere except assistant tokens; ready for an SFT loss
# inputs["action_mask"]    : 1 on assistant tokens, 0 elsewhere
# inputs["attention_mask"] : standard attention mask

The mask is not derived from string offsets. It is reconstructed by aligning incremental renders to token spans, with model-specific overrides for templates that are not append-only (for example, Qwen3 drops earlier thinking blocks). For the conversation above, action_mask flags exactly the tokens that make up "15." and "17." and nothing else.
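The general idea of masking from incremental renders can be sketched with a toy setup. Everything below is an assumption made for illustration: `render` is a simplified append-only ChatML-style renderer, `toy_tokenize` is a character-level stand-in for a real tokenizer, and `build_action_mask` is a hypothetical helper, not the library's implementation. Unlike the behaviour described above, this sketch also marks the end-of-turn marker as part of the assistant span.

```python
from typing import Dict, List

Message = Dict[str, str]

def render(messages: List[Message], add_generation_prompt: bool = False) -> str:
    """Toy append-only ChatML-style renderer (hypothetical, for illustration)."""
    out = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        out += "<|im_start|>assistant\n"
    return out

def toy_tokenize(text: str) -> List[int]:
    """Character-level stand-in for a real tokenizer: one token per character."""
    return [ord(ch) for ch in text]

def build_action_mask(messages: List[Message]) -> Dict[str, List[int]]:
    """Mark assistant spans by diffing incremental renders at the token level."""
    input_ids = toy_tokenize(render(messages))
    action_mask = [0] * len(input_ids)
    for i, msg in enumerate(messages):
        if msg["role"] != "assistant":
            continue
        # Tokens before the assistant starts speaking (history + role header).
        start = len(toy_tokenize(render(messages[:i], add_generation_prompt=True)))
        # Tokens once this assistant turn has been appended.
        end = len(toy_tokenize(render(messages[: i + 1])))
        action_mask[start:end] = [1] * (end - start)
    labels = [tok if m else -100 for tok, m in zip(input_ids, action_mask)]
    return {"input_ids": input_ids, "action_mask": action_mask, "labels": labels}

messages = [
    {"role": "user", "content": "What's 3 times 5?"},
    {"role": "assistant", "content": "15."},
    {"role": "user", "content": "Now plus 2?"},
    {"role": "assistant", "content": "17."},
]
out = build_action_mask(messages)
masked_text = "".join(
    chr(t) for t, m in zip(out["input_ids"], out["action_mask"]) if m
)
```

The sketch only works because the toy renderer is append-only and the toy tokenizer is prefix-stable. Real subword tokenizers and templates that rewrite earlier turns break both assumptions, which is presumably why model-specific overrides are needed.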

Want to see the mask? Use chat.prompt_with_mask() to print the prompt with assistant spans color-highlighted in the terminal.

What you get

Composable template architecture

  • Typed bricks: Template, ToolPolicy, SystemPolicy, SkillPolicy, GlobalPolicy.
  • Pluggable ToolFormatter (Qwen-style, JSON variants, YAML, custom) — swap conventions without touching Jinja.
  • Two-pass section system: {tools} / {skills} placeholders, wrapper templates, per-item templates with joiners. Add a new section type in a few lines.
  • Content processors for per-section transforms (truncate descriptions, filter tools by category, inject env metadata, Llama-3.2-style date stamping).
  • Export to Jinja via template.jinja_template() for HF tokenizer.chat_template compatibility.
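As a rough illustration of what a content processor does, the sketch below models processors as plain functions from a tool list to a tool list, applied in order. The names `truncate_descriptions`, `filter_by_category`, and `apply_processors` are hypothetical, and the flat tool dicts with a `category` field are an assumption made for brevity; the library's actual processor interface may look different.

```python
from typing import Callable, Dict, List

Tool = Dict[str, object]
Processor = Callable[[List[Tool]], List[Tool]]

def truncate_descriptions(max_chars: int) -> Processor:
    """Build a processor that clips each description to max_chars characters."""
    def process(tools: List[Tool]) -> List[Tool]:
        return [
            {**t, "description": str(t.get("description", ""))[:max_chars]}
            for t in tools
        ]
    return process

def filter_by_category(category: str) -> Processor:
    """Build a processor that keeps only tools in the given category."""
    def process(tools: List[Tool]) -> List[Tool]:
        return [t for t in tools if t.get("category") == category]
    return process

def apply_processors(tools: List[Tool], processors: List[Processor]) -> List[Tool]:
    """Run processors left to right; each one returns a new list."""
    for processor in processors:
        tools = processor(tools)
    return tools

tools = [
    {"name": "multiply", "category": "math",
     "description": "Multiply two numbers and return the product."},
    {"name": "search", "category": "web", "description": "Search the web."},
]
processed = apply_processors(
    tools, [filter_by_category("math"), truncate_descriptions(12)]
)
```

Because each processor returns new dicts instead of mutating its input, the original tool list can be reused for a different section or a different template.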

Verifiable training-time correctness

  • Per-token labels and action_mask across multi-turn, tool-call, and skill turns.
  • Byte-identical rendering vs. the official template, checked via compare_hf_template(...) and CI on every push.
  • Chat(template="org/model") works with any HuggingFace repo; for model families whose templates are not append-only, Qwen3Renderer-style overrides act as correctness escape hatches.
  • VLM support: vision-language templates and a registerable vision processor.
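To show how per-token labels feed a loss, here is a minimal sketch of a masked negative log-likelihood in plain Python. It assumes `token_logprobs[i]` is already the log-probability the model assigned to the target at position `i` (so any causal shift has been applied), and `masked_nll` is a hypothetical helper; real training code would normally use a framework's cross-entropy with an ignore index of -100.

```python
import math
from typing import List

IGNORE_INDEX = -100  # label value that excludes a position from the loss

def masked_nll(token_logprobs: List[float], labels: List[int]) -> float:
    """Mean negative log-likelihood over positions whose label is not ignored."""
    kept = [-lp for lp, lab in zip(token_logprobs, labels) if lab != IGNORE_INDEX]
    if not kept:
        raise ValueError("no supervised tokens: every label is IGNORE_INDEX")
    return sum(kept) / len(kept)

# Three positions; the first is a prompt token and must not contribute.
token_logprobs = [math.log(0.5), math.log(0.25), math.log(0.5)]
labels = [IGNORE_INDEX, 7, 9]
loss = masked_nll(token_logprobs, labels)
```

A wrong mask silently changes which positions end up in `kept`, which is why per-token correctness matters more than it may first appear.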

Installation

pip install chat-bricks

More examples

Same base model, different tool conventions

Pick a built-in variant for the convention you want — no Jinja rewrites:

from chat_bricks import Chat

# Tools rendered into the system prompt (Qwen's default)
Chat(template="qwen2.5", messages=..., tools=tools)

# Tools not advertised in the system prompt (describe them yourself)
Chat(template="qwen2.5-no-system-tool", messages=..., tools=tools)

Or roll your own with ToolPolicy + ToolFormatter — see docs/how_to_use/tools.md.

A custom tool formatter, end-to-end

from chat_bricks import ToolFormatter

class XmlToolFormatter(ToolFormatter):
    def format(self, tools):
        out = []
        for t in tools:
            fn = t.get("function", t)  # accept wrapped or bare function specs
            out.append(f'<tool name="{fn["name"]}">{fn.get("description","")}</tool>')
        return "\n".join(out)

    def jinja(self):  # so the same template exports cleanly to HF
        return (
            "{%- for t in tools -%}"
            '<tool name="{{ (t.function if t.function is defined else t).name }}">'
            "{{ (t.function if t.function is defined else t).description }}"
            "</tool>{%- if not loop.last %}\n{% endif %}"
            "{%- endfor -%}"
        )

Drop it into any template via ToolPolicy(formatter=XmlToolFormatter()).
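One caveat with the f-string version above: names and descriptions are interpolated as-is, so a description containing `<` or `&` would yield malformed XML. The sketch below shows one way to escape them using only the standard library. `format_tools_xml` is a hypothetical standalone function so it runs without chat_bricks installed; in practice the same body could live in a formatter's `format` method.

```python
from xml.sax.saxutils import escape, quoteattr

def format_tools_xml(tools: list) -> str:
    """Render tools as <tool> elements with escaped names and descriptions."""
    lines = []
    for t in tools:
        fn = t.get("function", t)  # accept wrapped or bare function specs
        name = quoteattr(str(fn["name"]))  # adds quotes and escapes the value
        desc = escape(str(fn.get("description", "")))  # escapes &, <, >
        lines.append(f"<tool name={name}>{desc}</tool>")
    return "\n".join(lines)

rendered = format_tools_xml([
    {"type": "function",
     "function": {"name": "compare", "description": "Returns a < b & more"}},
    {"name": "noop"},
])
```

Any escaping added on the Python side would need an equivalent in the `jinja()` export, otherwise the two renders would diverge on inputs that contain special characters.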

Skills + tools in the same template

The built-in qwen-skills template advertises a skills catalogue alongside tools:

chat = Chat(
    template="qwen-skills",
    messages=[{"role": "user", "content": "Help me count words."}],
    tools=[{"type": "function", "function": {"name": "load_skill", ...}}],
    skills=[
        {"name": "add-numbers", "description": "Adds two integers."},
        {"name": "word-count",  "description": "Counts words in text."},
    ],
)

The skills block lives at {skills} in system_template, wrapped by skills_template, with each entry formatted by SkillPolicy.single_skill_template. See docs/how_to_use/skills.md.
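The two-pass flow described above (format each entry, join, wrap, then substitute into the system template) can be sketched with `str.format`. The template strings and the `render_section` helper below are hypothetical; they are not the actual contents of the built-in qwen-skills template.

```python
from typing import Dict, List

def render_section(
    items: List[Dict[str, str]],
    item_template: str,
    joiner: str,
    wrapper_template: str,
    key: str,
) -> str:
    """Pass 1: format each item and join. Pass 2: wrap the joined block."""
    if not items:
        return ""  # an empty section leaves no trace in the prompt
    body = joiner.join(item_template.format(**item) for item in items)
    return wrapper_template.format(**{key: body})

# Hypothetical template strings, chosen only for this illustration.
system_template = "<|im_start|>system\n{system_message}{skills}<|im_end|>\n"
skills_template = "\n\n# Skills\n{skills}"
single_skill_template = "- {name}: {description}"

skills = [
    {"name": "add-numbers", "description": "Adds two integers."},
    {"name": "word-count", "description": "Counts words in text."},
]
skills_block = render_section(
    skills, single_skill_template, "\n", skills_template, "skills"
)
system_prompt = system_template.format(
    system_message="You are a careful agent.", skills=skills_block
)
```

Note that `str.format` treats literal braces in a template as placeholders, so a real implementation that embeds JSON in templates would need a different substitution mechanism.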

Train on the last assistant turn only

inputs = chat.tokenize(tok, train_on_last_turn_only=True)
# Only the final assistant turn contributes to the loss.
# Useful for RL rollouts or when earlier turns are demonstrations.
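Conceptually, training on the last turn only amounts to keeping the final run of assistant tokens in the mask and ignoring the rest. The sketch below shows that transformation on a plain list. `last_turn_only` and `labels_from_mask` are hypothetical helpers, and the sketch assumes assistant turns are separated by at least one non-assistant token; it is not how the library implements the flag.

```python
from typing import List

def last_turn_only(action_mask: List[int]) -> List[int]:
    """Keep only the final contiguous run of 1s; zero everything before it."""
    end = len(action_mask)
    while end > 0 and action_mask[end - 1] == 0:
        end -= 1  # skip trailing non-assistant tokens
    start = end
    while start > 0 and action_mask[start - 1] == 1:
        start -= 1  # walk back to the start of the last assistant run
    return [1 if start <= i < end else 0 for i in range(len(action_mask))]

def labels_from_mask(input_ids: List[int], mask: List[int]) -> List[int]:
    """Copy token IDs where the mask is 1, and use -100 elsewhere."""
    return [tok if m else -100 for tok, m in zip(input_ids, mask)]

full_mask = [0, 0, 1, 1, 0, 0, 1, 1, 1, 0]
last_mask = last_turn_only(full_mask)
labels = labels_from_mask(list(range(10, 20)), last_mask)
```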

Verify a template before training

from chat_bricks.utils import compare_hf_template

is_equal, *_ = compare_hf_template(
    tok, "qwen2.5",
    messages=[...], tools=[...], add_generation_prompt=True,
)
assert is_equal, "Built-in render diverges from the model's official template"

compare_hf_template also checks that the exported Jinja round-trips to the same string — so a template you defined in Python will produce identical output when handed to any HF inference server. See docs/how_to_use/verification.md.
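When a byte-for-byte check fails, it helps to know where the two strings first diverge. The helper below is a small, library-independent sketch of that idea. `first_divergence` is a hypothetical name and is not part of `chat_bricks.utils`; it compares any two rendered strings.

```python
from typing import Optional, Tuple

def first_divergence(
    expected: str, actual: str, context: int = 10
) -> Optional[Tuple[int, bytes, bytes]]:
    """Return None if byte-identical, else (offset, expected ctx, actual ctx)."""
    a, b = expected.encode("utf-8"), actual.encode("utf-8")
    if a == b:
        return None
    limit = min(len(a), len(b))
    # First differing byte, or the shorter length if one is a prefix of the other.
    idx = next((i for i in range(limit) if a[i] != b[i]), limit)
    lo = max(0, idx - context)
    return idx, a[lo : idx + context], b[lo : idx + context]

same = first_divergence("<|im_end|>\n", "<|im_end|>\n")
missing_newline = first_divergence("<|im_end|>\n", "<|im_end|>")
case_diff = first_divergence("hello world", "hello World")
```

Trailing-newline and whitespace differences like the second case are easy to miss by eye, which is a good reason to compare bytes and not printed output.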

Documentation

Full docs at docs/index.md, or run mkdocs serve locally.

Download files

Source Distribution

chat_bricks-0.1.4.tar.gz (55.4 kB view details)

Uploaded Source

Built Distribution

chat_bricks-0.1.4-py3-none-any.whl (58.8 kB view details)

Uploaded Python 3

File details

Details for the file chat_bricks-0.1.4.tar.gz.

File metadata

  • Download URL: chat_bricks-0.1.4.tar.gz
  • Upload date:
  • Size: 55.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for chat_bricks-0.1.4.tar.gz
Algorithm Hash digest
SHA256 67336a33cdc8136880034f68f1f2d58fa93e5f8af4f4c97c4446771504097e8b
MD5 578b1356a8a0478a0cae6b7fab3e31f9
BLAKE2b-256 5551f1acecf0a1d5828121853f1f4c6adb9954650d121f466ee546d8314d598c

File details

Details for the file chat_bricks-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: chat_bricks-0.1.4-py3-none-any.whl
  • Upload date:
  • Size: 58.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for chat_bricks-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 c4b0f35f020eb8b341ab673cfcb4a8a70daf435e5588909ecd0360a3ad1f8397
MD5 f3dc1424f7eb7685d8aaeb961c797815
BLAKE2b-256 ab2371ae562108561b95415fdd008d2374c156a5de8ba8c7e6031972ee6e6b00
