Compress LLM context by rendering text as optimized images

PixelPrompt

Compress LLM context by rendering text as optimized images. Based on the research paper "Pixels Beat Tokens: Multimodal LLMs See Better With Image Sources for Text-Rich VQA".

Why PixelPrompt?

When working with LLMs, token counts directly affect cost and latency. PixelPrompt converts text content into visually optimized PNG images, typically achieving 3-8x compression compared to raw text tokens depending on content type, while maintaining or improving accuracy.

Key benefits:

  • 🎯 Significant token savings — text rendered as images uses fewer tokens
  • 📊 Flexible formatting — control font size, layout, and visual hierarchy
  • 🔄 Automatic splitting — large content automatically split across multiple images
  • 🎨 Configurable rendering — customize fonts, colors, background
  • 🚀 Easy integration — simple API for any LLM workflow

Installation

uv pip install pixelprompt

Or with pip:

pip install pixelprompt

Quick Start

from pixelprompt import PixelPrompt

# Initialize with default settings
pxl = PixelPrompt()

# Render text as image(s)
text = "Your long context here..."
images = pxl.render(text)

# Use with Claude API
from anthropic import Anthropic

client = Anthropic()
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Analyze this document:"
                },
                *[
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/png",
                            "data": img.base64()
                        }
                    }
                    for img in images
                ],
                {
                    "type": "text",
                    "text": "What are the key points?"
                }
            ]
        }
    ]
)

print(message.content[0].text)
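
The list comprehension in the request above can be factored into a small helper. This is a sketch rather than part of PixelPrompt: `build_image_content` is a hypothetical name, and it accepts plain base64 strings (what `img.base64()` is documented to return) so it runs without the library installed.

```python
def build_image_content(question, png_b64_images, intro="Analyze this document:"):
    """Build the `content` list for one user turn of the Anthropic Messages API.

    png_b64_images: base64-encoded PNG strings, in page order.
    """
    blocks = [{"type": "text", "text": intro}]
    for data in png_b64_images:
        blocks.append({
            "type": "image",
            "source": {"type": "base64", "media_type": "image/png", "data": data},
        })
    # Put the question after the images so it follows the context it refers to.
    blocks.append({"type": "text", "text": question})
    return blocks
```

With the Quick Start objects, this would be called as `build_image_content("What are the key points?", [img.base64() for img in images])` and passed as the `content` of the user message.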

Configuration

from pixelprompt import PixelPrompt, RenderConfig

config = RenderConfig(
    font_size=9,  # Default: 9 (range: 6-20)
    font_family="monospace",  # Default: "monospace"
    width=1568,  # Image width in pixels (default: 1568)
    height=1568,  # Image height in pixels (default: 1568)
    background_color=(255, 255, 255),  # RGB tuple (default: white)
    text_color=(0, 0, 0),  # RGB tuple (default: black)
    padding=20,  # Padding in pixels (default: 20)
    line_spacing=1.2,  # Line height multiplier (default: 1.2)
)

pxl = PixelPrompt(config=config)
images = pxl.render(text)
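
To get a feel for how these settings trade off, the number of characters that fit on one image can be estimated from the config values. This is a back-of-the-envelope sketch, not PixelPrompt's own layout logic: `estimate_capacity` is a hypothetical helper, and it assumes `font_size` is in pixels and that monospace glyphs are about 0.6 times as wide as the font size.

```python
import math

def estimate_capacity(font_size=9, width=1568, height=1568, padding=20,
                      line_spacing=1.2, glyph_aspect=0.6):
    """Return (columns, rows, characters) that fit on one image.

    glyph_aspect is an ASSUMPTION: many monospace fonts have glyphs roughly
    0.6x as wide as the font size; the real value depends on the font.
    """
    usable_w = width - 2 * padding   # padding applies on both sides
    usable_h = height - 2 * padding
    cols = math.floor(usable_w / (font_size * glyph_aspect))
    rows = math.floor(usable_h / (font_size * line_spacing))
    return cols, rows, cols * rows

cols, rows, chars = estimate_capacity()
print(f"{cols} columns x {rows} rows = about {chars} characters per image")
```

Doubling `font_size` roughly quarters the capacity, since both the column and the row count halve.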

Advanced Usage

Analyze compression metrics

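from pixelprompt import PixelPrompt, estimate_tokens

text = "Your long context..."
original_tokens = estimate_tokens(text)

# Image token cost depends on pixel dimensions, not on the text inside.
# Assumption: Anthropic's published approximation of (width * height) / 750
# tokens per image; other providers count differently and may downscale.
images = PixelPrompt().render(text)
compressed_tokens = sum(img.width * img.height / 750 for img in images)

compression_ratio = original_tokens / compressed_tokens
print(f"Compression: {compression_ratio:.1f}x")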
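
The image side of the ratio can be estimated from pixel dimensions alone. The sketch below is standalone and does not use PixelPrompt; `estimate_image_tokens` and `compression_ratio` are hypothetical helpers. Both constants are assumptions based on Anthropic's published guidance for Claude (about `width * height / 750` tokens per image, with larger images downscaled to roughly 1,600 tokens), so check the current documentation for the model you use.

```python
import math

def estimate_image_tokens(width, height, px_per_token=750, max_tokens=1600):
    """Approximate token cost of one image.

    Both defaults are ASSUMPTIONS taken from Anthropic's guidance for Claude;
    they are approximations and may not apply to other models or providers.
    """
    return min(math.ceil(width * height / px_per_token), max_tokens)

def compression_ratio(text_tokens, image_sizes):
    """text_tokens: token count of the original text.
    image_sizes: iterable of (width, height) pairs, one per rendered image.
    """
    image_tokens = sum(estimate_image_tokens(w, h) for w, h in image_sizes)
    return text_tokens / image_tokens
```

For example, `compression_ratio(original_tokens, [(img.width, img.height) for img in images])` would give the ratio for a rendered document under these assumptions.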

Handle large documents

# Automatically splits into multiple images if content exceeds limits
images = pxl.render(long_document)
print(f"Generated {len(images)} images")

# Access individual images
for i, img in enumerate(images):
    img.save(f"page_{i}.png")
    print(f"Image {i}: {img.width}x{img.height}, size: {img.size_bytes} bytes")
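
How PixelPrompt decides where to split is not described here. Purely as an illustration of the general idea, text can be wrapped to a fixed number of columns and then cut into pages of a fixed number of rows. `paginate` is a hypothetical helper; in practice `cols` and `rows` would follow from the font and image size.

```python
import textwrap

def paginate(text, cols, rows):
    """Wrap text to `cols` characters per line and group lines into pages
    of at most `rows` lines. Returns a list of page strings."""
    lines = []
    for raw in text.splitlines():
        # textwrap.wrap returns [] for an empty line; keep it as a blank line.
        lines.extend(textwrap.wrap(raw, width=cols) or [""])
    return ["\n".join(lines[i:i + rows]) for i in range(0, len(lines), rows)]

pages = paginate("first line\nsecond line\nthird line", cols=40, rows=2)
print(f"{len(pages)} pages")
```

Each page string would then be rendered to its own image.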

Custom fonts

config = RenderConfig(
    font_family="serif",  # Options: "monospace", "serif", "sans-serif"
    font_size=10,
)
pxl = PixelPrompt(config=config)

API Reference

PixelPrompt

Main class for rendering text to images.

class PixelPrompt:
    def __init__(self, config: RenderConfig | None = None):
        """Initialize with optional configuration."""

    def render(self, text: str) -> list[RenderedImage]:
        """
        Render text to one or more PNG images.

        Args:
            text: Text content to render

        Returns:
            List of RenderedImage objects
        """

RenderConfig

Configuration dataclass for rendering parameters.

@dataclass
class RenderConfig:
    font_size: int = 9
    font_family: str = "monospace"
    width: int = 1568
    height: int = 1568
    background_color: tuple[int, int, int] = (255, 255, 255)
    text_color: tuple[int, int, int] = (0, 0, 0)
    padding: int = 20
    line_spacing: float = 1.2

RenderedImage

Represents a single rendered image.

class RenderedImage:
    width: int
    height: int
    size_bytes: int

    def png_bytes(self) -> bytes:
        """Get raw PNG bytes."""

    def base64(self) -> str:
        """Get base64-encoded PNG for API integration."""

    def save(self, path: str) -> None:
        """Save to file."""
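
The raw bytes from `png_bytes()` can also be adapted for APIs that expect a data URL instead of a bare base64 string. The following is a sketch for the `image_url` content part used by OpenAI's Chat Completions API; `to_data_url` and `openai_image_part` are hypothetical helpers and are not part of PixelPrompt.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file

def to_data_url(png_bytes):
    """Encode raw PNG bytes as a data URL."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not PNG data")
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"

def openai_image_part(png_bytes):
    """Content part in the shape used by OpenAI's Chat Completions API."""
    return {"type": "image_url", "image_url": {"url": to_data_url(png_bytes)}}
```

A rendered page would be passed as `openai_image_part(img.png_bytes())`.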

Performance

Typical compression ratios, depending on content:

  • Code: 4-6x compression
  • Technical prose: 5-8x compression
  • JSON/Structured data: 3-5x compression
  • Natural language: 4-7x compression

Rendering time: ~100-200ms per image on modern hardware.
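
To translate a compression ratio into cost, the arithmetic is straightforward. The sketch below uses a hypothetical helper, `token_savings`, and the price passed in is a placeholder rather than any provider's actual rate.

```python
def token_savings(text_tokens, ratio, usd_per_million_tokens):
    """Return (image_tokens, tokens_saved, usd_saved) for one request.

    ratio: compression ratio, e.g. 5 for 5x.
    usd_per_million_tokens: input price; supply your provider's real rate.
    """
    image_tokens = text_tokens / ratio
    saved = text_tokens - image_tokens
    return image_tokens, saved, saved * usd_per_million_tokens / 1_000_000

# Placeholder numbers: a 40,000-token document at 5x and $3 per million tokens.
image_tokens, saved, usd = token_savings(40_000, 5, 3.0)
print(f"{image_tokens:.0f} image tokens, {saved:.0f} saved, ${usd:.3f} per request")
```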

Contributing

Contributions welcome! Please open issues or PRs on GitHub.

License

MIT License — see LICENSE file for details.

Citation

If you use PixelPrompt in research, please cite:

@software{pixelprompt,
  author = {Venturi, Gabriele},
  title = {PixelPrompt: Compress LLM Context by Rendering Text as Images},
  year = {2026},
  url = {https://github.com/sinaptik-ai/pixelprompt}
}
