Normalize LangChain, MCP, and multimodal content blocks into provider-ready text and image payloads.

langchain-content-normalizer

Normalize the messy content shapes produced by LangChain, MCP tools, Anthropic content blocks, and multimodal chat APIs.

The package has no runtime dependencies. It works by duck typing instead of importing LangChain or MCP classes.
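As a rough illustration of the duck-typing idea, the sketch below probes for `.text` and `.content` attributes with `getattr` instead of checking against imported library classes. `FakeTextBlock`, `FakeMessage`, and `text_of` are hypothetical names invented for this example. They are not part of the package's API, and the real implementation may differ.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeTextBlock:
    """Stand-in for an MCP-style object that exposes `.text` (hypothetical)."""
    text: str


@dataclass
class FakeMessage:
    """Stand-in for a message wrapper that exposes `.content` (hypothetical)."""
    content: Any


def text_of(value: Any) -> str:
    """Extract text by probing attributes rather than testing for library classes."""
    if isinstance(value, str):
        return value
    text = getattr(value, "text", None)
    if isinstance(text, str):
        return text
    content = getattr(value, "content", None)
    if content is not None:
        # Wrappers are unwrapped recursively until a string or `.text` is found.
        return text_of(content)
    return ""


print(text_of(FakeMessage(content=FakeTextBlock(text="hello"))))
```

Because nothing here imports LangChain or MCP, any object with the right attributes works, which is what keeps the dependency list empty.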

What it solves

LLM agent stacks often receive content in one of many incompatible shapes:

| Source | Example shape | Output |
| --- | --- | --- |
| Classic chat | `"plain text"` | `"plain text"` |
| Anthropic blocks | `[{"type": "text", "text": "hi"}]` | `"hi"` |
| OpenAI Responses text | `[{"type": "output_text", "text": "hi"}]` | `"hi"` |
| Tool calls | `[{"type": "tool_use", ...}]` | skipped by default |
| MCP tool results | `[{"type": "tool_result", "content": [...]}]` | flattened text |
| MCP objects | objects exposing `.text` | extracted text |
| Message wrappers | objects exposing `.content` | recursively normalized |
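To make the dict-shaped rows of the table concrete, here is a minimal sketch of how such blocks could be flattened. `flatten_blocks` and `TEXT_BLOCK_TYPES` are hypothetical names, the `"\n"` default separator is an assumption, and object-shaped inputs are left out for brevity. This is an illustration of the technique, not the package's own code.

```python
from typing import Any

# Block types whose "text" field is taken verbatim (taken from the table rows).
TEXT_BLOCK_TYPES = {"text", "output_text"}


def flatten_blocks(content: Any, separator: str = "\n") -> str:
    """Flatten strings, block dicts, and lists of blocks into one string."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    if isinstance(content, dict):
        block_type = content.get("type")
        if block_type in TEXT_BLOCK_TYPES:
            return str(content.get("text", ""))
        if block_type == "tool_use":
            return ""  # tool calls are skipped by default
        if block_type == "tool_result":
            # Tool results nest further blocks under "content".
            return flatten_blocks(content.get("content"), separator)
        return str(content)  # unknown shapes are kept rather than dropped
    if isinstance(content, (list, tuple)):
        parts = [flatten_blocks(item, separator) for item in content]
        return separator.join(part for part in parts if part)
    return str(content)


nested = [{"type": "tool_result", "content": [{"type": "text", "text": "a"},
                                              {"type": "text", "text": "b"}]}]
print(flatten_blocks(nested))
```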

Install

uv add langchain-content-normalizer

Text normalization

from lc_content_normalizer import extract_text_content, normalize_tool_output

content = [
    {"type": "text", "text": "Reading logs..."},
    {"type": "tool_use", "name": "tail_logs", "input": {"service": "api"}},
]

assert extract_text_content(content) == "Reading logs..."
assert "tail_logs" in extract_text_content(content, skip_tool_use=False)
assert extract_text_content(content, separator="\n") == "Reading logs..."

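# Placeholder standing in for a large tool result; replace with real tool output.
huge_tool_payload = [{"type": "text", "text": "log line\n" * 20_000}]
safe_output = normalize_tool_output(huge_tool_payload, max_chars=50_000, separator="\n")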
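The `max_chars` argument suggests that oversized tool output is capped before it reaches the model. One plausible way to do that is sketched below; the helper name `truncate_text` and the `[truncated]` marker are assumptions made for illustration, and the package may truncate differently.

```python
def truncate_text(text: str, max_chars: int, marker: str = "\n[truncated]") -> str:
    """Cap text at max_chars, ending with a marker when something was cut."""
    if max_chars < 0:
        raise ValueError("max_chars must be non-negative")
    if len(text) <= max_chars:
        return text
    # Reserve room for the marker so the result never exceeds max_chars.
    keep = max(max_chars - len(marker), 0)
    return (text[:keep] + marker)[:max_chars]


capped = truncate_text("x" * 100, max_chars=20)
print(len(capped))
```

Keeping the marker inside the budget means callers can rely on `max_chars` as a hard upper bound.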

Vision format routing

from lc_content_normalizer import build_human_message_content, detect_vision_format

vision_format = detect_vision_format("anthropic", "claude-3-5-sonnet")
content = build_human_message_content(
    "Explain this alert screenshot",
    images=[{"data_url": "data:image/png;base64,...", "mime_type": "image/png"}],
    vision_format=vision_format,
)
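For orientation, the sketch below builds the two image block shapes that the Anthropic Messages API and OpenAI-compatible chat APIs publicly document. The helper names are hypothetical, and the exact structure that `build_human_message_content` emits is not shown in this README, so treat this as an approximation.

```python
from typing import Any


def split_data_url(data_url: str) -> tuple[str, str]:
    """Split 'data:<mime>;base64,<payload>' into (mime_type, payload)."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, payload


def anthropic_image_block(data_url: str) -> dict[str, Any]:
    """Anthropic-style block: raw base64 data plus an explicit media type."""
    media_type, data = split_data_url(data_url)
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


def openai_image_block(data_url: str) -> dict[str, Any]:
    """OpenAI-compatible block: the data URL is passed through unchanged."""
    return {"type": "image_url", "image_url": {"url": data_url}}


url = "data:image/png;base64,AAAA"
print(anthropic_image_block(url))
print(openai_image_block(url))
```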

`detect_vision_format()` returns:

| Provider/model | Format |
| --- | --- |
| `anthropic` | native Anthropic `image` block with `source.base64` |
| `ollama` + known vision model marker (`llava`, `bakllava`, `moondream`, `minicpm-v`, `qwen2-vl`, `llama3.2-vision`, `vision`) | OpenAI-compatible `image_url` block |
| `ollama` text-only model | `none`, images are dropped |
| OpenAI-compatible providers | OpenAI-compatible `image_url` block |
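The routing rules in the table can be approximated with a few string checks. In this sketch, `guess_vision_format` is a hypothetical name, the return labels `"anthropic"` and `"openai"` are assumed (the table only names `none`), and substring matching on the model name is a guess at how markers are applied.

```python
# Markers copied from the table above.
OLLAMA_VISION_MARKERS = (
    "llava", "bakllava", "moondream", "minicpm-v",
    "qwen2-vl", "llama3.2-vision", "vision",
)


def guess_vision_format(provider: str, model: str) -> str:
    """Pick an image block format from the provider and model names."""
    provider = provider.strip().lower()
    model = model.strip().lower()
    if provider == "anthropic":
        return "anthropic"
    if provider == "ollama":
        # Only models carrying a known vision marker receive images.
        if any(marker in model for marker in OLLAMA_VISION_MARKERS):
            return "openai"
        return "none"
    # Everything else is treated as OpenAI-compatible.
    return "openai"


print(guess_vision_format("ollama", "llava:13b"))
```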

Examples

- `examples/normalize_mcp_output.py` shows how MCP-style tool results are flattened.
- `examples/build_vision_content.py` shows provider-aware image block generation.

Roadmap

- Add provider-specific adapters as content formats evolve.
- Keep runtime dependencies at zero.

Strict mode

By default, unknown non-empty content is preserved via `str(...)` so that tool output is not silently lost. Pass `strict=True` when unknown shapes should fail fast instead:

from lc_content_normalizer import UnknownContentBlockError, extract_text_content

try:
    extract_text_content([{"type": "custom", "payload": "..."}], strict=True)
except UnknownContentBlockError:
    ...
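The difference between the two modes can be sketched in a few lines. `UnknownBlockError` and `block_to_text` are hypothetical stand-ins written for this example. They mirror the behavior described above but are not the package's implementation.

```python
from typing import Any


class UnknownBlockError(ValueError):
    """Raised in strict mode when a block shape is not recognized (hypothetical)."""


def block_to_text(block: Any, strict: bool = False) -> str:
    """Return block text; fall back to str() or raise, depending on strict."""
    if isinstance(block, str):
        return block
    if isinstance(block, dict) and block.get("type") == "text":
        return str(block.get("text", ""))
    if not block:
        return ""  # empty content has nothing worth preserving
    if strict:
        raise UnknownBlockError(f"unrecognized content block: {block!r}")
    return str(block)  # lenient default: keep the data rather than lose it


print(block_to_text({"type": "custom", "payload": "..."}))
```

The lenient default favors keeping data visible in logs and prompts, while strict mode is better suited to tests and pipelines where a new block type should be noticed immediately.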

Development

uv sync --dev
uv run ruff check .
uv run pytest
uv run python scripts/smoke.py
uv build

License

MIT

Project details

Maintainers

benjamin-j

Release history

- 0.1.8 (this version): Jun 3, 2026
- 0.1.7: Jun 2, 2026
- 0.1.6: Jun 1, 2026
- 0.1.5: Jun 1, 2026
- 0.1.4: Jun 1, 2026
- 0.1.3: Jun 1, 2026
- 0.1.2: Jun 1, 2026
- 0.1.0: Jun 1, 2026

Download files

Source Distribution

langchain_content_normalizer-0.1.8.tar.gz (15.7 kB)

Uploaded Jun 3, 2026 Source

Built Distribution

langchain_content_normalizer-0.1.8-py3-none-any.whl (7.1 kB)

Uploaded Jun 3, 2026 Python 3

File details

Details for the file langchain_content_normalizer-0.1.8.tar.gz.

File metadata

Download URL: langchain_content_normalizer-0.1.8.tar.gz
Upload date: Jun 3, 2026
Size: 15.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for langchain_content_normalizer-0.1.8.tar.gz
Algorithm	Hash digest
SHA256	`c767be702678a7f4826ab1d7873c0c41899480dda2c758539d8840adb311764a`
MD5	`bd15d1226445dd502a42bf922869318f`
BLAKE2b-256	`f56df5590f93ee8ac4f1c17a44cbaa3b047bd8cac481554afcefe5acbd072f8e`

Provenance

The following attestation bundles were made for langchain_content_normalizer-0.1.8.tar.gz:

Publisher: publish.yml on BenjaminJornet/langchain-content-normalizer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: langchain_content_normalizer-0.1.8.tar.gz
- Subject digest: c767be702678a7f4826ab1d7873c0c41899480dda2c758539d8840adb311764a
- Sigstore transparency entry: 1710465034
- Sigstore integration time: Jun 3, 2026
Source repository:
- Permalink: BenjaminJornet/langchain-content-normalizer@e7742383dfb275b35046310beda5890f2edeb309
- Branch / Tag: refs/tags/v0.1.8
- Owner: https://github.com/BenjaminJornet
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e7742383dfb275b35046310beda5890f2edeb309
- Trigger Event: release

File details

Details for the file langchain_content_normalizer-0.1.8-py3-none-any.whl.

File metadata

Download URL: langchain_content_normalizer-0.1.8-py3-none-any.whl
Upload date: Jun 3, 2026
Size: 7.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for langchain_content_normalizer-0.1.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ebddbeeea9e9ce67d1da268632531ae10aea02690e7899a3c9957067aa62f096`
MD5	`f5067340bbff4a5f73f06b6f25cc3b84`
BLAKE2b-256	`b03e36ac431e7a26e11c1f315bcf6122284a497a31da7585b0f83146b7dbb74a`

Provenance

The following attestation bundles were made for langchain_content_normalizer-0.1.8-py3-none-any.whl:

Publisher: publish.yml on BenjaminJornet/langchain-content-normalizer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: langchain_content_normalizer-0.1.8-py3-none-any.whl
- Subject digest: ebddbeeea9e9ce67d1da268632531ae10aea02690e7899a3c9957067aa62f096
- Sigstore transparency entry: 1710465051
- Sigstore integration time: Jun 3, 2026
Source repository:
- Permalink: BenjaminJornet/langchain-content-normalizer@e7742383dfb275b35046310beda5890f2edeb309
- Branch / Tag: refs/tags/v0.1.8
- Owner: https://github.com/BenjaminJornet
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@e7742383dfb275b35046310beda5890f2edeb309
- Trigger Event: release
