PixelPrompt

Compress LLM context by rendering text as optimized images. Based on the research paper "Pixels Beat Tokens: Multimodal LLMs See Better With Image Sources for Text-Rich VQA".

Why PixelPrompt?

Token counts drive both the cost and the latency of LLM calls. PixelPrompt renders text as visually optimized PNG images that a multimodal model reads directly. Depending on content type, this typically uses 3-8x fewer tokens than the raw text (see Performance), while maintaining or improving accuracy.

Key benefits:

  • 🎯 Significant token savings — text rendered as images uses fewer tokens
  • 📊 Flexible formatting — control font size, layout, and visual hierarchy
  • 🔄 Automatic splitting — large content automatically split across multiple images
  • 🎨 Configurable rendering — customize fonts, colors, background
  • 🚀 Easy integration — simple API for any LLM workflow

Installation

uv pip install pixelprompt

Or with pip:

pip install pixelprompt

Quick Start

from pixelprompt import PixelPrompt

# Initialize with default settings
pxl = PixelPrompt()

# Render text as image(s)
text = "Your long context here..."
images = pxl.render(text)

# Use with Claude API
from anthropic import Anthropic

client = Anthropic()
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Analyze this document:"
                },
                *[
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/png",
                            "data": img.base64()
                        }
                    }
                    for img in images
                ],
                {
                    "type": "text",
                    "text": "What are the key points?"
                }
            ]
        }
    ]
)

print(message.content[0].text)
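
The inline list comprehension above can be factored into a small helper when the same request shape is reused. The following is a minimal sketch, not part of PixelPrompt: the names `image_blocks` and `build_content` are hypothetical, and the two base64 strings are placeholders rather than real PNG data. Only the dictionary layout is taken from the Quick Start example.

```python
from typing import Any


def image_blocks(b64_pngs: list[str]) -> list[dict[str, Any]]:
    """Wrap base64-encoded PNG strings in image content blocks."""
    return [
        {
            "type": "image",
            "source": {"type": "base64", "media_type": "image/png", "data": data},
        }
        for data in b64_pngs
    ]


def build_content(question: str, b64_pngs: list[str]) -> list[dict[str, Any]]:
    """Put the rendered pages first, then the question about them."""
    return [*image_blocks(b64_pngs), {"type": "text", "text": question}]


# Placeholder payloads; with PixelPrompt these would be img.base64() values.
content = build_content("What are the key points?", ["aGVsbG8=", "d29ybGQ="])
print(len(content))
```

With rendered pages, the second argument would be `[img.base64() for img in images]`, and `content` would be passed as the `"content"` value of the user message.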

Configuration

from pixelprompt import PixelPrompt, RenderConfig

config = RenderConfig(
    font_size=9,  # Default: 9 (range: 6-20)
    font_family="monospace",  # Default: "monospace"
    width=1568,  # Image width in pixels (default: 1568)
    height=1568,  # Image height in pixels (default: 1568)
    background_color=(255, 255, 255),  # RGB tuple (default: white)
    text_color=(0, 0, 0),  # RGB tuple (default: black)
    padding=20,  # Padding in pixels (default: 20)
    line_spacing=1.2,  # Line height multiplier (default: 1.2)
)

pxl = PixelPrompt(config=config)
images = pxl.render(text)
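
To get a feel for how these settings trade off against each other, the sketch below estimates how much text fits on one page. It is a back-of-the-envelope model, not PixelPrompt's actual layout code: `PageGeometry`, `chars_per_line`, and `lines_per_page` are hypothetical names, and the glyph width of 0.6x the font size is an assumption that only roughly holds for monospace fonts.

```python
from dataclasses import dataclass


@dataclass
class PageGeometry:
    # Values mirror the RenderConfig defaults shown above.
    font_size: int = 9
    width: int = 1568
    height: int = 1568
    padding: int = 20
    line_spacing: float = 1.2
    # ASSUMPTION: a monospace glyph is about 0.6x the font size wide.
    char_width_ratio: float = 0.6


def chars_per_line(g: PageGeometry) -> int:
    """Characters that fit between the left and right padding."""
    usable_width = g.width - 2 * g.padding
    return int(usable_width // (g.font_size * g.char_width_ratio))


def lines_per_page(g: PageGeometry) -> int:
    """Text lines that fit between the top and bottom padding."""
    usable_height = g.height - 2 * g.padding
    return int(usable_height // (g.font_size * g.line_spacing))


geometry = PageGeometry()
capacity = chars_per_line(geometry) * lines_per_page(geometry)
print(f"Approximate characters per page: {capacity}")
```

Under this model, raising `font_size` reduces both the characters per line and the lines per page, so capacity falls roughly with the square of the font size.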

Advanced Usage

Analyze compression metrics

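from pixelprompt import PixelPrompt, estimate_tokens

text = "Your long context..."
images = PixelPrompt().render(text)

original_tokens = estimate_tokens(text)
# Rough estimate (assumption): Anthropic documents image cost as about
# (width * height) / 750 tokens. The API may downscale large images,
# so treat this figure as an upper bound.
compressed_tokens = sum(img.width * img.height / 750 for img in images)

compression_ratio = original_tokens / compressed_tokens
print(f"Compression: {compression_ratio:.1f}x")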

Handle large documents

# Automatically splits into multiple images if content exceeds limits
images = pxl.render(long_document)
print(f"Generated {len(images)} images")

# Access individual images
for i, img in enumerate(images):
    img.save(f"page_{i}.png")
    print(f"Image {i}: {img.width}x{img.height}, size: {img.size_bytes} bytes")
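
The README does not describe how the split is performed. As an illustration only, one plausible approach is to wrap the text to a fixed line width and then cut the wrapped lines into page-sized groups. The `paginate` function below is a hypothetical sketch of that idea using the standard library, not PixelPrompt's implementation.

```python
import textwrap


def paginate(text: str, chars_per_line: int, lines_per_page: int) -> list[list[str]]:
    """Wrap text to a fixed width, then group the wrapped lines into pages."""
    wrapped: list[str] = []
    for line in text.splitlines():
        # textwrap.wrap returns [] for an empty line; keep it as a blank line.
        wrapped.extend(textwrap.wrap(line, width=chars_per_line) or [""])
    return [
        wrapped[i : i + lines_per_page]
        for i in range(0, len(wrapped), lines_per_page)
    ]


pages = paginate("word " * 100, chars_per_line=20, lines_per_page=10)
print(f"{len(pages)} pages, last page has {len(pages[-1])} lines")
```

Each inner list would then be drawn onto its own image, which is why a longer document yields more entries in the list returned by `render`.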

Custom fonts

config = RenderConfig(
    font_family="serif",  # Options: "monospace", "serif", "sans-serif"
    font_size=10,
)
pxl = PixelPrompt(config=config)

API Reference

PixelPrompt

Main class for rendering text to images.

class PixelPrompt:
    def __init__(self, config: RenderConfig | None = None):
        """Initialize with optional configuration."""

    def render(self, text: str) -> list[RenderedImage]:
        """
        Render text to one or more PNG images.

        Args:
            text: Text content to render

        Returns:
            List of RenderedImage objects
        """

RenderConfig

Configuration dataclass for rendering parameters.

@dataclass
class RenderConfig:
    font_size: int = 9
    font_family: str = "monospace"
    width: int = 1568
    height: int = 1568
    background_color: tuple[int, int, int] = (255, 255, 255)
    text_color: tuple[int, int, int] = (0, 0, 0)
    padding: int = 20
    line_spacing: float = 1.2

RenderedImage

Represents a single rendered image.

class RenderedImage:
    width: int
    height: int
    size_bytes: int

    def png_bytes(self) -> bytes:
        """Get raw PNG bytes."""

    def base64(self) -> str:
        """Get base64-encoded PNG for API integration."""

    def save(self, path: str) -> None:
        """Save to file."""
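
Because `png_bytes()` returns raw bytes, a rendered page can be handed to any code that accepts PNG data. The sketch below builds a data URL for previewing a page in a browser. It is an illustration under assumptions: `HasPngBytes`, `to_data_url`, and `FakeImage` are hypothetical names, and `FakeImage` stands in for `RenderedImage` so the example runs without PixelPrompt installed.

```python
import base64
from typing import Protocol


class HasPngBytes(Protocol):
    """Anything exposing png_bytes(), such as the RenderedImage described above."""

    def png_bytes(self) -> bytes: ...


PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file


def to_data_url(image: HasPngBytes) -> str:
    """Encode an image's PNG bytes as a data URL, e.g. for an HTML preview."""
    raw = image.png_bytes()
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("payload does not start with the PNG signature")
    return "data:image/png;base64," + base64.b64encode(raw).decode("ascii")


class FakeImage:
    """Stand-in for RenderedImage; the payload is not a complete PNG."""

    def png_bytes(self) -> bytes:
        return PNG_SIGNATURE + b"payload"


url = to_data_url(FakeImage())
print(url[:32])
```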

Performance

Typical compression ratios, which vary with content:

  • Code: 4-6x compression
  • Technical prose: 5-8x compression
  • JSON/Structured data: 3-5x compression
  • Natural language: 4-7x compression

Rendering time: ~100-200ms per image on modern hardware.
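
To translate a compression ratio into token savings: at a ratio of r, the rendered context costs 1/r of the original tokens, so the fraction saved is 1 - 1/r. The short sketch below works this through for a few ratios from the list above. The function names and the 100,000-token context are illustrative, not measurements.

```python
def tokens_after(original_tokens: int, ratio: float) -> float:
    """Tokens consumed once the context is compressed by the given ratio."""
    return original_tokens / ratio


def saved_fraction(ratio: float) -> float:
    """Share of the original tokens that is no longer sent."""
    return 1 - 1 / ratio


for ratio in (3, 4, 8):
    remaining = tokens_after(100_000, ratio)
    print(f"{ratio}x: {remaining:,.0f} tokens left, {saved_fraction(ratio):.1%} saved")
```

The savings flatten quickly: going from 4x to 8x removes only a further 12.5% of the original tokens.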

Contributing

Contributions welcome! Please open issues or PRs on GitHub.

License

MIT License — see LICENSE file for details.

Citation

If you use PixelPrompt in research, please cite:

@software{pixelprompt,
  author = {Venturi, Gabriele},
  title = {PixelPrompt: Compress LLM Context by Rendering Text as Images},
  year = {2026},
  url = {https://github.com/sinaptik-ai/pixelprompt}
}
