Provide rendering functions for project PIXAR.
Pixar-Render
A Python library for rendering text into visual representations as pixel tensors. This project provides rendering functions for the PIXAR project, converting text strings into images with configurable fonts, colors, and patch-based representations suitable for vision-language models.
Features
- Convert text to pixel-based tensor representations
- Configurable font rendering with PangoCairo backend
- Support for batch processing
- Attention mask generation for sequence models
- Patch-based encoding with customizable patch sizes
- Image export capabilities (PIL and file output)
- Configuration save/load functionality
- Text encoding slicing and insertion operations
- White space reduction for compact representations
Installation
Install from PyPI:
pip install Pixar-Render
Or install from source:
git clone https://github.com/TYTTYTTYT/Pixar-Render.git
cd Pixar-Render
pip install -e .
Quick Start
Basic Usage
from pixar_render import PixarProcessor
# Initialize the processor with default settings
processor = PixarProcessor()
# Render a single text string
text = "Hello, World!"
encoding = processor.render(text)
# Access the pixel values and attention mask
print(encoding.pixel_values.shape) # torch.Tensor: [batch_size, channels, height, width]
print(encoding.attention_mask.shape) # torch.Tensor: [batch_size, seq_length]
print(encoding.num_text_patches) # List of patch counts per text
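As a rough mental model of the attention mask, and only as an assumption about its layout, positions that hold rendered text are 1 and trailing padding patches are 0. The helper below is hypothetical and pure Python (it is not part of pixar_render), and it ignores any separator patches the library may also mark.

```python
def sketch_attention_mask(num_text_patches, seq_length):
    """Build one 0/1 mask row per sample: 1 for text patches, 0 for padding.

    Assumption: text patches occupy the start of the sequence and the rest
    is padding. The real library layout may differ.
    """
    mask = []
    for n in num_text_patches:
        n = min(n, seq_length)  # text longer than the sequence is truncated
        mask.append([1] * n + [0] * (seq_length - n))
    return mask


print(sketch_attention_mask([3, 5], seq_length=6))
# [[1, 1, 1, 0, 0, 0], [1, 1, 1, 1, 1, 0]]
```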
Batch Processing
from pixar_render import PixarProcessor
processor = PixarProcessor()
# Render multiple texts at once
texts = [
"First sentence.",
"Second sentence with more text.",
"Third one."
]
encoding = processor.render(texts)
print(encoding.pixel_values.shape) # [3, 3, 24, 12696]
print(encoding.num_text_patches) # Number of text patches for each input
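The width of 12696 in the shape above is consistent with the documented defaults: 529 patches of 24 pixels each, laid side by side (529 x 24 = 12696). A minimal sketch of that arithmetic follows, assuming the render is a single horizontal strip of square patches padded to `max_seq_length`; `expected_strip_shape` is a hypothetical helper, not a library function.

```python
def expected_strip_shape(batch_size, pixels_per_patch=24, max_seq_length=529, channels=3):
    """Return the (batch, channels, height, width) shape of a horizontal patch strip.

    Assumption: every patch is a pixels_per_patch x pixels_per_patch square and
    all max_seq_length patches are concatenated along the width axis.
    """
    height = pixels_per_patch
    width = pixels_per_patch * max_seq_length
    return (batch_size, channels, height, width)


print(expected_strip_shape(3))  # (3, 3, 24, 12696)
```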
Custom Configuration
from pixar_render import PixarProcessor
# Initialize with custom settings
processor = PixarProcessor(
font_size=12, # Larger font size
font_color="blue", # Blue text
background_color="lightyellow", # Light yellow background
pixels_per_patch=32, # 32 pixels per patch instead of 24
max_seq_length=1024, # Maximum number of patches
dpi=240 # Higher DPI for better quality
)
text = "Custom styled text"
encoding = processor.render(text)
Converting to PIL Images
from pixar_render import PixarProcessor
processor = PixarProcessor()
text = "Visualize this text"
encoding = processor.render(text)
# Convert to PIL images (returns a list of PIL.Image objects)
images = processor.convert_to_pil(encoding, square=True, contour=False)
# Display or save the first image
images[0].show()
images[0].save("output.png")
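To see what `square=True` implies for image size, note that the default sequence length of 529 is 23 x 23, so a strip of 24-pixel patches would fold into a 552 x 552 pixel image. The sketch below works through that arithmetic; the row-major ordering of patches and both helper names are assumptions for illustration, not the library's documented behavior.

```python
import math


def square_layout(num_patches, pixels_per_patch):
    """Return (patches per side, image side in pixels) for a square patch grid."""
    side = math.isqrt(num_patches)
    if side * side != num_patches:
        raise ValueError("num_patches must be a perfect square")
    return side, side * pixels_per_patch


def patch_position(index, side):
    """Map a strip index to (row, col) in the grid, assuming row-major order."""
    return divmod(index, side)


side, image_px = square_layout(529, 24)
print(side, image_px)            # 23 552
print(patch_position(25, side))  # (1, 2)
```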
Saving Images to Directory
from pixar_render import PixarProcessor
processor = PixarProcessor()
texts = ["First text", "Second text", "Third text"]
encoding = processor.render(texts)
# Save all rendered images to a directory
processor.save_as_images(
encoding,
dir_path="./output_images",
square=True, # Reshape to square format
contour=False # Don't add contours
)
# This creates: output_images/0.png, output_images/1.png, output_images/2.png
Adding Contours
from pixar_render import PixarProcessor
# Initialize with contour settings
processor = PixarProcessor(
contour_r=1.0, # Red channel
contour_g=0.0, # Green channel
contour_b=0.0, # Blue channel (red contours)
contour_alpha=0.7, # Contour transparency
contour_width=2, # Contour line width
patch_len=1 # Patches per contour cell
)
text = "Text with contours"
encoding = processor.render(text)
# Convert to image with contours
images = processor.convert_to_pil(encoding, square=True, contour=True)
images[0].save("contoured_output.png")
Working with Multi-turn Conversations
from pixar_render import PixarProcessor
processor = PixarProcessor()
# Render conversation turns as tuples
conversation = [
("User: Hello!", "Assistant: Hi there!"),
("User: How are you?", "Assistant: I'm doing well!")
]
encoding = processor.render(conversation)
print(encoding.sep_patches) # Shows separator patch positions
Slicing Encodings
from pixar_render import PixarProcessor
processor = PixarProcessor()
text = "This is a long piece of text"
encoding = processor.render(text)
# Extract patches from index 5 to 15
sliced_encoding = processor.slice(encoding, start=5, end=15)
print(sliced_encoding.pixel_values.shape)
print(sliced_encoding.num_text_patches)
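Because patches sit side by side along the width axis, a patch range corresponds to a contiguous block of pixel columns. The sketch below shows that mapping under two assumptions: the range is half-open (`end` is excluded) and each patch is `pixels_per_patch` columns wide. `patch_range_to_columns` is a hypothetical helper.

```python
def patch_range_to_columns(start, end, pixels_per_patch=24):
    """Convert a half-open patch range [start, end) to a pixel column range."""
    if not 0 <= start <= end:
        raise ValueError("expected 0 <= start <= end")
    return start * pixels_per_patch, end * pixels_per_patch


print(patch_range_to_columns(5, 15))  # (120, 360)
```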
Inserting Encodings
from pixar_render import PixarProcessor
processor = PixarProcessor()
# Create base encoding
base_text = "Hello ___ World"
base_encoding = processor.render(base_text)
# Create text to insert
insert_text = "Beautiful"
insert_encoding = processor.render(insert_text)
# Insert at specific patch positions (e.g., patches 6-10)
combined = processor.insert(base_encoding, start=6, end=10, inserted=insert_encoding)
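Conceptually, insertion can be pictured as a list splice over patches. The sketch below replaces the half-open range `[start, end)` of a base sequence with the inserted patches; whether the library replaces or shifts the existing patches, and whether `end` is inclusive, are assumptions here, and `splice_patches` is a hypothetical helper working on plain lists.

```python
def splice_patches(base, start, end, inserted):
    """Replace base[start:end] with inserted; the two lengths may differ."""
    return base[:start] + inserted + base[end:]


base = ["H", "e", "l", "_", "_", "W"]
print(splice_patches(base, 3, 5, ["X", "Y", "Z"]))
# ['H', 'e', 'l', 'X', 'Y', 'Z', 'W']
```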
Reducing White Space
from pixar_render import PixarProcessor
processor = PixarProcessor()
text = "Text with lots of spaces"
encoding = processor.render(text)
# Reduce consecutive white pixels to maximum of 5
compact_encoding = processor.reduce_white_space(encoding, max_white_space=5)
# Display the first image (in a notebook, the bare expression without .show() also renders it)
processor.convert_to_pil(compact_encoding)[0].show()
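One way to picture white space reduction is as capping runs of blank pixel columns. The sketch below is not the library's implementation: it assumes an image is a list of columns, that a column is blank when every pixel equals a `white` value of 1.0, and that blank columns beyond `max_white_space` in a row are dropped. `cap_blank_runs` is a hypothetical helper.

```python
def cap_blank_runs(columns, max_white_space, white=1.0):
    """Keep at most max_white_space consecutive blank columns."""
    kept = []
    run = 0  # length of the current run of blank columns
    for col in columns:
        is_blank = all(pixel == white for pixel in col)
        run = run + 1 if is_blank else 0
        if not is_blank or run <= max_white_space:
            kept.append(col)
    return kept


ink, blank = [0.0, 1.0], [1.0, 1.0]
columns = [ink, blank, blank, blank, blank, ink]
print(len(cap_blank_runs(columns, max_white_space=2)))  # 4
```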
Saving and Loading Configuration
from pixar_render import PixarProcessor
# Create processor with custom settings
processor = PixarProcessor(
font_size=10,
dpi=200,
pixels_per_patch=28,
max_seq_length=1024
)
# Save configuration
processor.save_conf("./config")
# Creates: ./config/pixar_processor_conf.json
# Later, load the same configuration
loaded_processor = PixarProcessor.load_conf("./config")
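The save/load cycle amounts to a JSON round trip of the constructor settings. The sketch below reproduces that idea with the standard library only; the file name comes from the example above, while the flat key/value layout of the file and the two helper functions are assumptions, not the library's actual format.

```python
import json
import tempfile
from pathlib import Path

CONF_NAME = "pixar_processor_conf.json"  # file name taken from the example above


def save_settings(settings, dir_path):
    """Write settings as JSON into dir_path, creating the directory if needed."""
    path = Path(dir_path)
    path.mkdir(parents=True, exist_ok=True)
    (path / CONF_NAME).write_text(json.dumps(settings, indent=2))


def load_settings(dir_path):
    """Read the settings dictionary back from dir_path."""
    return json.loads((Path(dir_path) / CONF_NAME).read_text())


settings = {"font_size": 10, "dpi": 200, "pixels_per_patch": 28, "max_seq_length": 1024}
with tempfile.TemporaryDirectory() as tmp:
    save_settings(settings, tmp)
    restored = load_settings(tmp)

print(restored == settings)  # True
```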
Using with PyTorch Models
import torch
from pixar_render import PixarProcessor
processor = PixarProcessor(device='cuda:0')
# Render text
texts = ["Training sample 1", "Training sample 2"]
encoding = processor.render(texts)
# Move to device
encoding = encoding.to('cuda:0')
# Use in your model (your_vision_model is a placeholder for your own torch.nn.Module)
# pixel_values: [batch_size, 3, height, width]
# attention_mask: [batch_size, seq_length]
output = your_vision_model(
pixel_values=encoding.pixel_values,
attention_mask=encoding.attention_mask
)
Binary Mode
from pixar_render import PixarProcessor
# Render in binary mode (black and white only)
processor = PixarProcessor(binary=True)
text = "Binary rendered text"
encoding = processor.render(text)
# Pixel values will be 0 or 1
images = processor.convert_to_pil(encoding)
images[0].save("binary_output.png")
API Reference
PixarProcessor
__init__ parameters:
- font_file (str): Font file name (default: 'GoNotoCurrent.ttf')
- font_size (int): Font size in points (default: 8)
- font_color (str): Text color (default: "black")
- background_color (str): Background color (default: "white")
- binary (bool): Binarize output (default: False)
- rgb (bool): Use RGB mode (default: True)
- dpi (int): Dots per inch (default: 180)
- pad_size (int): Padding size (default: 3)
- pixels_per_patch (int): Pixels per patch (default: 24)
- max_seq_length (int): Maximum sequence length (default: 529)
- fallback_fonts_dir (str | None): Directory for fallback fonts
- patch_len (int): Patch length (default: 1)
- contour_r (float): Red component of contour (default: 0.0)
- contour_g (float): Green component of contour (default: 0.0)
- contour_b (float): Blue component of contour (default: 0.0)
- contour_alpha (float): Contour transparency (default: 0.7)
- contour_width (int): Contour line width (default: 1)
- device (str | int): Processing device (default: 'cpu')
Methods:
- render(text): Render text to PixarEncoding
- convert_to_pil(encoding, square, contour): Convert to PIL Images
- save_as_images(encoding, dir_path, square, contour): Save images to directory
- slice(encoding, start, end): Extract patch range
- insert(encoding, start, end, inserted): Insert encoding into another
- reduce_white_space(encoding, max_white_space): Reduce white space
- save_conf(dir_path): Save configuration to JSON
- load_conf(dir_path): Load configuration from JSON (classmethod)
PixarEncoding
Dataclass containing:
- pixel_values (torch.Tensor): Rendered pixel values [batch, channels, height, width]
- attention_mask (torch.Tensor): Attention mask [batch, seq_length]
- num_text_patches (List[int]): Number of text patches per sample
- sep_patches (List[List[int]]): Separator patch positions per sample
Methods:
- to(device): Move tensors to device
- clone(): Create a deep copy
Requirements
- Python = 3.11
- numpy
- torch
- torchvision
- pillow
- PangoCairo (for text rendering)
License
Apache License 2.0
Links
- Homepage: https://github.com/TYTTYTTYT/Pixar-Render
- Bug Tracker: https://github.com/TYTTYTTYT/Pixar-Render/issues
- PyPI: https://pypi.org/project/Pixar-Render/
Author
Yintao Tai (tai.yintao@gmail.com)