Streaming-safe stripping of <think> blocks from reasoning model output

thinkstrip — Streaming Think-Block Stripper for LLM Output

thinkstrip removes <think>...</think> blocks from model output in both batch and streaming mode. It is designed for reasoning models (Qwen3, DeepSeek-R1, and others) that emit internal reasoning before their visible answer.

The streaming case is the reason this package exists: tag boundaries can split across adjacent token yields, so a correct implementation needs a stateful rolling buffer instead of a post-generation regex.


Why this exists

Reasoning models can emit output like:

<think>hidden chain of thought</think>The actual answer.

For fully materialized strings, stripping is easy. For token streams, it is not. Partial tags can arrive across multiple adjacent token yields, for example:

  • <thi then nk>
  • </thi then nk>

A naive .replace() or regex-per-token approach leaks fragments or drops visible output. thinkstrip solves this with a stateful streaming filter.
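To see the failure mode concretely, here is a stdlib-only demonstration (not part of the package): stripping each token in isolation removes nothing, because no single token ever contains a complete tag pair.

```python
import re

tokens = ['<thi', 'nk>', 'hidden', '</thi', 'nk>', 'The answer.']

# Strip think blocks from each token in isolation -- no token contains
# a complete '<think>...</think>' pair, so nothing is removed and the
# tags plus the hidden reasoning all leak into the output.
naive = ''.join(
    re.sub(r'<think>.*?</think>', '', t, flags=re.S) for t in tokens
)
print(naive)  # <think>hidden</think>The answer.
```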


Install

Requirements:

  • Python 3.13+
  • Zero runtime dependencies — installs and runs with nothing beyond the standard library

pip install thinkstrip

Development install:

pip install -e ".[dev]"

Quick start

Streaming

from thinkstrip import ThinkStrip

stripper = ThinkStrip()
chunks   = []

for token in ['<thi', 'nk>', 'hidden', '</thi', 'nk>', 'The answer.']:
    if emitted := stripper.feed(token):
        chunks.append(emitted)

if flushed := stripper.flush():
    chunks.append(flushed)

print(''.join(chunks))
# The answer.

Async streaming

from thinkstrip import AsyncThinkStrip

stripper = AsyncThinkStrip()
chunks   = []

async for token in model_stream:
    if emitted := await stripper.feed(token):
        chunks.append(emitted)

if flushed := await stripper.flush():
    chunks.append(flushed)

Batch

from thinkstrip import strip_think

clean = strip_think('<think>reasoning</think>The actual answer.')
print(clean)
# The actual answer.

Prompt pre-cleaner

Some GGUF chat templates inject <think> at the end of the rendered prompt before the model generates. This breaks the streaming filter because the model never emits its own <think>. Call strip_think_prefill on the rendered prompt to remove it:

from thinkstrip import strip_think_prefill

prompt = strip_think_prefill(prompt)
# trailing '<think>' removed if present; no-op otherwise
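The behavior can be sketched in a few lines of stdlib-only Python (a simplified stand-in, not the package's code), assuming the injected tag sits at the very end of the rendered prompt:

```python
def strip_prefill(prompt: str, open_tag: str = '<think>') -> str:
    # Hedged sketch: drop a single trailing open tag if present,
    # leave the prompt untouched otherwise.
    if prompt.endswith(open_tag):
        return prompt[:-len(open_tag)]
    return prompt

print(strip_prefill('...rendered prompt...<think>'))
# ...rendered prompt...
```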

Public API

from thinkstrip import ThinkStrip, AsyncThinkStrip, strip_think, strip_think_prefill

ThinkStrip

Stateful streaming filter. Create one instance per response stream.

ThinkStrip(
    open_tag:  str  = '<think>',
    close_tag: str  = '</think>',
    capture:   bool = False,
)
Methods and properties:

  • .feed(token: str) -> str — Process one token. Returns the text to emit (empty string when nothing is ready yet).
  • .flush() -> str — Call once at end of stream. Returns any buffered visible text; empty if the stream ended inside a think block.
  • .think_content: str — Accumulated think-block text. Non-empty only when capture=True.
  • .in_think_block: bool — True if the stream ended mid-think-block. Useful for diagnostics.

Constructor parameters:

  • open_tag (str, default '<think>') — opening tag to strip
  • close_tag (str, default '</think>') — closing tag to strip
  • capture (bool, default False) — retain think content in .think_content instead of discarding it

Buffer sizes are derived automatically: len(open_tag) - 1 chars for the opening-tag guard, len(close_tag) - 1 for the closing-tag guard. Custom tags carry no extra cost.
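To illustrate why a guard of len(tag) - 1 characters suffices, here is a stdlib-only sketch of the rolling-buffer idea (a simplified stand-in, not the package's implementation): the filter withholds only the longest buffer suffix that could still be the start of the tag it is looking for.

```python
def stream_strip(tokens, open_tag='<think>', close_tag='</think>'):
    # Minimal rolling-buffer sketch (a stand-in, not the package's code).
    buf, out, in_think = '', [], False
    for tok in tokens:
        buf += tok
        while buf:
            tag = close_tag if in_think else open_tag
            i = buf.find(tag)
            if i != -1:
                if not in_think:
                    out.append(buf[:i])   # visible text before the tag
                buf = buf[i + len(tag):]
                in_think = not in_think
                continue
            # Withhold only the longest suffix that could still be a
            # prefix of the tag: at most len(tag) - 1 characters.
            safe = len(buf)
            for k in range(1, len(tag)):
                if buf.endswith(tag[:k]):
                    safe = len(buf) - k
            if not in_think:
                out.append(buf[:safe])
            buf = buf[safe:]
            break
    if not in_think:      # flush: text inside an unterminated
        out.append(buf)   # think block stays dropped
    return ''.join(out)

print(stream_strip(['<thi', 'nk>', 'hidden', '</thi', 'nk>', 'The answer.']))
# The answer.
```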

AsyncThinkStrip

Async wrapper around ThinkStrip. Delegates to asyncio.to_thread() — no threading primitives required by the caller. Same constructor signature and properties as ThinkStrip.

  • await .feed(token: str) -> str — async variant of ThinkStrip.feed()
  • await .flush() -> str — async variant of ThinkStrip.flush()
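The delegation pattern described above can be sketched in a few lines; SyncFilter and AsyncFilter here are hypothetical stand-ins, not the package's classes.

```python
import asyncio

class SyncFilter:
    # Hypothetical stand-in for a synchronous, stateful filter.
    def feed(self, token: str) -> str:
        return token.upper()  # placeholder for real filtering work

class AsyncFilter:
    # Wrap each synchronous call in asyncio.to_thread so awaiting it
    # never blocks the event loop; the caller needs no threading
    # primitives of its own.
    def __init__(self) -> None:
        self._inner = SyncFilter()

    async def feed(self, token: str) -> str:
        return await asyncio.to_thread(self._inner.feed, token)

print(asyncio.run(AsyncFilter().feed('hi')))  # HI
```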

strip_think

Stateless helper for complete strings. Implemented via ThinkStrip — batch and streaming behavior are identical.

strip_think(
    text:      str,
    open_tag:  str = '<think>',
    close_tag: str = '</think>',
) -> str

strip_think_prefill

Removes a trailing open tag injected by some GGUF chat templates.

strip_think_prefill(
    prompt:   str,
    open_tag: str = '<think>',
) -> str

Capture mode

When capture=True, think content accumulates in .think_content instead of being discarded. Multiple think blocks per response are concatenated. Useful for surfacing the model's reasoning in a separate UI panel or for eval runs.

stripper = ThinkStrip(capture=True)

for token in stream:
    if emitted := stripper.feed(token):
        yield emitted

if flushed := stripper.flush():
    yield flushed

print(stripper.think_content)  # full reasoning text

Limitations

  • Nested tags are not supported. A <think> that arrives while already inside a think block is treated as think content and swallowed. The first </think> closes the block; any subsequent </think> with no matching open tag passes through as visible text. In practice this is rarely an issue: Qwen3 and DeepSeek-R1 emit exactly one think block per response.

Development

git clone https://github.com/informity/thinkstrip.git
cd thinkstrip

python3 -m venv .venv
source .venv/bin/activate

pip install -e ".[dev]"

make lint
make test
make build

Contributing

See CONTRIBUTING.md.


License

MIT — see LICENSE.

Download files

Source Distribution

thinkstrip-0.1.0.tar.gz (11.8 kB)

Uploaded Source

Built Distribution

thinkstrip-0.1.0-py3-none-any.whl (7.1 kB)

Uploaded Python 3

File details

Details for the file thinkstrip-0.1.0.tar.gz.

File metadata

  • Download URL: thinkstrip-0.1.0.tar.gz
  • Upload date:
  • Size: 11.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for thinkstrip-0.1.0.tar.gz:

  • SHA256: b10d6e6accd8592c50c9a7630f0cc35ea70997e0c66b1e2af52bf1ea34940d61
  • MD5: 91fd4bd28e6c9c1cce9a815e36d6b310
  • BLAKE2b-256: 4530802e585ae7cd61b7b9501c58b166c1847d24e4056dbc15d3cc3b39a6119d

Provenance

The following attestation bundles were made for thinkstrip-0.1.0.tar.gz:

Publisher: publish.yml on informity/thinkstrip

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file thinkstrip-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: thinkstrip-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 7.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for thinkstrip-0.1.0-py3-none-any.whl:

  • SHA256: b97d6a8b9ac82af60f7ce9032d81dfb6348b8da85790ab8777b030a97d3b48d9
  • MD5: 5ccc1498ff8ce0cf6094e5489f211500
  • BLAKE2b-256: 31a4b2918d7709c6feb370b1d838dd928f69f5e592039adb1ec10d1bee7c0934

Provenance

The following attestation bundles were made for thinkstrip-0.1.0-py3-none-any.whl:

Publisher: publish.yml on informity/thinkstrip

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
