AI Data Infrastructure: Declarative, Multimodal, and Incremental

Quick Start | Documentation | API Reference | Starter Kit | AI Coding Skill | Pixeltable Cloud | Discord

Pixeltable is the unified multimodal backend for AI applications. One pip install, one Python API, one place to store, transform, index, retrieve, serve, version, observe, and debug.

Every multimodal AI app needs the same five things: store media, run models, index embeddings, serve endpoints, version everything. Most teams glue together 5-8 services and spend more time on infrastructure than on the product. Pixeltable is a single system that handles all five.

What you need | Without Pixeltable | With Pixeltable
Store video, images, docs | S3 + Postgres + glue code | pxt.create_table() with media types
Run AI on every insert | Airflow DAGs + retry logic | add_computed_column(), automatic
Vector search | Pinecone + ETL pipelines | add_embedding_index(), always in sync
HTTP endpoints | Hand-written + Pydantic | FastAPIRouter or pxt serve
Versioning & rollback | Custom scripts | Built-in history(), revert()

Transaction integrity, async execution, parallelization, caching, retries, and observability are built in. Schema changes are one line. Model upgrades are zero-downtime. Extensible via @pxt.udf, @pxt.uda, @pxt.query.

Deployment options: Pixeltable can serve as your full backend (managing media locally or syncing with S3/GCS/Azure, plus built-in vector search and orchestration) or as an orchestration layer alongside your existing infrastructure.


Installation

pip install pixeltable

Pixeltable bundles its own transactional database, orchestration engine, and local dashboard. No Docker, no external services; pip install is all you need. All data is managed in ~/.pixeltable and accessed through the Python SDK. See Working with External Files and Storage Architecture for details.

AI Agent Skill

Teach AI coding assistants (Cursor, Claude Code, Windsurf, Copilot, etc.) to write correct Pixeltable code:

npx skills add pixeltable/pixeltable-skill

Covers 25+ providers, multimodal pipelines, tool-calling agents, RAG, and production patterns.

Quick Start

Define your data processing and AI workflow declaratively using computed columns on tables. Focus on your logic, not the data plumbing.

pip install pixeltable google-genai torch transformers scenedetect

Set your API keys via environment variables or ~/.pixeltable/config.toml. See Configuration for all provider keys and options.

import pixeltable as pxt
from pixeltable.functions import gemini, huggingface

videos = pxt.create_table('video_search', {'video': pxt.Video, 'title': pxt.String})

videos.add_computed_column(scenes=videos.video.scene_detect_adaptive())

videos.add_computed_column(
    response=gemini.generate_content(
        [videos.video, 'Describe this video in detail.'], model='gemini-3-flash-preview'
    )
)

videos.add_computed_column(
    description=videos.response.candidates[0].content.parts[0].text.astype(pxt.String)
)

videos.add_embedding_index('description', embedding=gemini.embed_content.using(model='gemini-embedding-2-preview'))

base_url = 'https://raw.githubusercontent.com/pixeltable/pixeltable/release/docs/resources'
videos.insert([
    {'video': f'{base_url}/bangkok.mp4', 'title': 'Bangkok Street Tour'},
    {'video': f'{base_url}/The-Pursuit-of-Happiness-Video-Extract.mp4', 'title': 'The Pursuit of Happiness'},
])

videos.select(
    videos.title,
    videos.description,
    detections=huggingface.detr_for_object_detection(
        videos.video.extract_frame(timestamp=2.0),
        model_id='facebook/detr-resnet-50',
    ),
).collect()

sim = videos.description.similarity(string='street food')
videos.order_by(sim, asc=False).limit(5).select(videos.title, sim).collect()

Wrap any query as an HTTP endpoint and serve it:

@pxt.query
def search_videos(query_text: str, limit: int = 5):
    sim = videos.description.similarity(string=query_text)
    return videos.order_by(sim, asc=False).limit(limit).select(videos.title, videos.description, sim)

# service.toml
[[service.routes]]
type = "query"
path = "/search"
query = "video_search_app.search_videos"

[[service.routes]]
type = "insert"
table = "video_search"
path = "/ingest"
inputs = ["video", "title"]
outputs = ["title", "description"]

pxt serve my-service --config service.toml
# curl -X POST localhost:8000/search -d '{"query_text": "street food"}'

Storage, orchestration, retrieval, and serving in one system. See HTTP Serving for the full guide.
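
For clients that prefer Python over curl, the request in the curl comment above can be sketched with the standard library alone. This is a minimal sketch, assuming the service listens on localhost:8000 and accepts a JSON body as that comment suggests. The helper name build_search_request and the Content-Type header are assumptions for illustration, not part of Pixeltable.

```python
import json
import urllib.request


def build_search_request(query_text: str, limit: int = 5,
                         base_url: str = 'http://localhost:8000') -> urllib.request.Request:
    """Build (but do not send) a POST request for the /search route."""
    body = json.dumps({'query_text': query_text, 'limit': limit}).encode('utf-8')
    return urllib.request.Request(
        f'{base_url}/search',
        data=body,
        headers={'Content-Type': 'application/json'},  # assumed; not stated in the source
        method='POST',
    )


req = build_search_request('street food')
# To send it once `pxt serve` is running:
#     with urllib.request.urlopen(req) as resp:
#         rows = json.load(resp)
```

Building the request separately from sending it keeps the payload easy to inspect before the service is up.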

What Pixeltable Does

You Write | Pixeltable Does
pxt.Image, pxt.Video, pxt.Document columns | Stores media, handles formats, caches from URLs
add_computed_column(fn(...)) | Runs incrementally, caches results, retries failures
add_embedding_index(column) | Manages vector storage, keeps index in sync
@pxt.udf / @pxt.query | Creates reusable functions with dependency tracking
table.insert(...) | Triggers all dependent computations automatically
t.sample(5).select(t.text, summary=udf(t.text)) | Experiment on a sample; nothing stored, calls parallelized and cached
table.select(...).collect() | Returns structured + unstructured data together
(nothing; it's automatic) | Versions all data and schema changes for time-travel

Pixeltable ships with built-in functions for media processing (FFmpeg, Pillow, spaCy), embeddings (sentence-transformers, CLIP), and 30+ AI providers (OpenAI, Anthropic, Gemini, Ollama, and more). For anything domain-specific, wrap your own logic with @pxt.udf. You still write the application layer (FastAPI, React, Docker).

Demo

See Pixeltable in action: table creation, computed columns, multimodal processing, and querying in a single workflow.

https://github.com/user-attachments/assets/b50fd6df-5169-4881-9dbe-1b6e5d06cede

Core Capabilities

Store: Unified Multimodal Interface

pxt.Image, pxt.Video, pxt.Audio, pxt.Document, pxt.Json – manage diverse data consistently.

t = pxt.create_table(
    'media',
    {
        'img': pxt.Image,
        'video': pxt.Video,
        'audio': pxt.Audio,
        'document': pxt.Document,
        'metadata': pxt.Json,
    },
)

Type System · Tables & Data

Orchestrate: Declarative Computed Columns

Define processing steps once; they run automatically on new/updated data. Supports API calls (OpenAI, Anthropic, Gemini), local inference (Hugging Face, YOLOX, Whisper), vision models, and any Python logic.

from pixeltable.functions import huggingface, openai

# LLM API call
t.add_computed_column(
    summary=openai.chat_completions(
        messages=[{'role': 'user', 'content': t.text}], model='gpt-4o-mini'
    )
)

# Local model inference
t.add_computed_column(
    classification=huggingface.vit_for_image_classification(t.image)
)

# Vision analysis (multimodal)
t.add_computed_column(
    description=openai.chat_completions(
        messages=[{'role': 'user', 'content': [
            {'type': 'text', 'text': 'Describe this image'},
            {'type': 'image_url', 'image_url': t.image},
        ]}],
        model='gpt-4o-mini'
    )
)
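
The "define once, run on new data" behavior can be illustrated with a toy model in plain Python. This is a conceptual sketch only, not Pixeltable's implementation: ToyTable and word_count are hypothetical names, and the real system also handles persistence, caching, retries, and parallel execution.

```python
from typing import Any, Callable


class ToyTable:
    """Plain-Python stand-in for a table with computed columns (illustration only)."""

    def __init__(self) -> None:
        self.rows: list[dict[str, Any]] = []
        self.computed: dict[str, Callable[[dict[str, Any]], Any]] = {}

    def add_computed_column(self, name: str, fn: Callable[[dict[str, Any]], Any]) -> None:
        # Register the column, then backfill rows that already exist.
        self.computed[name] = fn
        for row in self.rows:
            row[name] = fn(row)

    def insert(self, new_rows: list[dict[str, Any]]) -> None:
        # Only the newly inserted rows are computed; existing rows are untouched.
        for row in new_rows:
            row = dict(row)
            for name, fn in self.computed.items():
                row[name] = fn(row)
            self.rows.append(row)


calls = []


def word_count(row):
    calls.append(row['text'])  # record each invocation to show what was computed
    return len(row['text'].split())


toy = ToyTable()
toy.insert([{'text': 'hello world'}])
toy.add_computed_column('n_words', word_count)  # backfills the existing row
toy.insert([{'text': 'one two three'}])         # computes only the new row
```

After the second insert, word_count has run exactly once per row: the first row was not recomputed.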

Computed Columns · AI Integrations · Sample App: Prompt Studio

Iterate: Explode & Process Media

Create views with iterators to explode one row into many (video→frames, doc→chunks, audio→segments).

from pixeltable.functions.video import frame_iterator
from pixeltable.functions.document import document_splitter

# Document chunking with overlap & metadata
chunks = pxt.create_view(
    'chunks', docs,
    iterator=document_splitter(
        document=docs.doc,
        separators='sentence,token_limit',
        overlap=50, limit=500
    )
)

# Video frame extraction
frames = pxt.create_view(
    'frames', videos,
    iterator=frame_iterator(video=videos.video, fps=0.5)
)
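
To make "one row into many" concrete, here is a plain-Python sketch of chunking with overlap. It splits on whitespace tokens purely for illustration. It is not the algorithm behind document_splitter, and how Pixeltable interprets overlap and limit is not specified here. The function name split_with_overlap is hypothetical.

```python
def split_with_overlap(tokens: list[str], limit: int, overlap: int) -> list[list[str]]:
    """Split tokens into chunks of at most `limit`, each sharing `overlap` tokens with the previous one."""
    if limit <= 0 or not 0 <= overlap < limit:
        raise ValueError('need limit > 0 and 0 <= overlap < limit')
    step = limit - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + limit])
        if start + limit >= len(tokens):
            break  # the last chunk already reaches the end of the input
    return chunks


tokens = 'a b c d e f g h i j'.split()
chunks = split_with_overlap(tokens, limit=4, overlap=1)
```

Each chunk would become one row of the view, with the shared tokens preserving context across chunk boundaries.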

Views · Iterators · RAG Pipeline

Index: Built-in Vector Search

Add embedding indexes and perform similarity searches directly on tables/views.

from pixeltable.functions.huggingface import clip

t.add_embedding_index(
    'img',
    embedding=clip.using(model_id='openai/clip-vit-base-patch32')
)

sim = t.img.similarity(string='cat playing with yarn')
results = t.order_by(sim, asc=False).limit(10).collect()
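
Conceptually, a similarity query scores every indexed vector against the query vector and returns the best matches. The sketch below shows that ranking step in plain Python with cosine similarity. The metric, the three-dimensional vectors, and the names cosine_similarity and top_k are assumptions for illustration; a real index avoids scanning every row.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, items, k):
    """Return up to k (label, score) pairs, highest similarity first."""
    scored = [(label, cosine_similarity(query_vec, vec)) for label, vec in items.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical 3-dimensional embeddings; real CLIP vectors have hundreds of dimensions.
items = {
    'cat': [1.0, 0.0, 0.0],
    'kitten': [0.8, 0.6, 0.0],
    'truck': [0.0, 0.0, 1.0],
}
ranked = top_k([1.0, 0.0, 0.0], items, k=2)
```

This mirrors the order_by(sim, asc=False).limit(k) pattern above: sort by score descending, then keep the first k.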

Embedding Indexes · Semantic Search · Image Search App

Extend: Bring Your Own Code

Extend Pixeltable with UDFs, reusable queries, batch processing, and custom aggregators.

@pxt.udf
def format_prompt(context: list, question: str) -> str:
    return f'Context: {context}\nQuestion: {question}'

@pxt.query
def search_by_topic(topic: str):
    return t.where(t.category == topic).select(t.title, t.summary)

UDFs Guide · Custom Aggregates

Agents & Tools: Tool Calling & MCP Integration

Register @pxt.udf, @pxt.query functions, or MCP servers as callable tools. LLMs decide which tool to invoke; Pixeltable executes and stores results.

# Load tools from MCP server, UDFs, and query functions
mcp_tools = pxt.mcp_udfs('http://localhost:8000/mcp')
tools = pxt.tools(get_weather_udf, search_context_query, *mcp_tools)

# LLM decides which tool to call; Pixeltable executes it
t.add_computed_column(
    tool_output=invoke_tools(tools, t.llm_tool_choice)
)
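
The execution half of tool calling amounts to looking up each requested tool by name and calling it with the model-supplied arguments. The following plain-Python sketch shows that dispatch step under stated assumptions: the tool functions, the TOOLS registry, and the shape of llm_tool_calls are all hypothetical and do not reflect Pixeltable's internal format.

```python
import json


def get_weather(city: str) -> str:
    # Hypothetical tool; a real one would call a weather API.
    return f'Sunny in {city}'


def add(a: int, b: int) -> int:
    return a + b


TOOLS = {'get_weather': get_weather, 'add': add}


def invoke(tool_calls: list[dict]) -> dict:
    """Run each requested tool and collect results keyed by tool name."""
    results = {}
    for call in tool_calls:
        fn = TOOLS.get(call['name'])
        if fn is None:
            results.setdefault(call['name'], []).append({'error': 'unknown tool'})
            continue
        args = json.loads(call['arguments'])  # arguments arrive as a JSON string
        results.setdefault(call['name'], []).append(fn(**args))
    return results


# Shape of the model output is assumed for this sketch.
llm_tool_calls = [
    {'name': 'add', 'arguments': '{"a": 2, "b": 3}'},
    {'name': 'get_weather', 'arguments': '{"city": "Bangkok"}'},
]
outputs = invoke(llm_tool_calls)
```

Unknown tool names are recorded as errors rather than raised, so one bad call does not discard the results of the others.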

Tool Calling Cookbook · Agents & MCP · Pixelbot · Pixelagent

Serve: Expose Tables & Queries as HTTP Endpoints

Expose any table or @pxt.query as an HTTP endpoint with a TOML config or a single Python call. FastAPIRouter is a drop-in subclass of FastAPI's APIRouter, so declarative and hand-written routes coexist on the same router.

# service.toml
[[service.routes]]
type = "insert"
table = "myapp/docs"
path = "/ingest"
inputs = ["document"]
outputs = ["document", "summary"]

pxt serve my-service --config service.toml

from pixeltable.serving import FastAPIRouter

router = FastAPIRouter(prefix="/api", tags=["data"])
router.add_query_route(path="/search", query=search_documents)
router.add_insert_route(table, path="/upload", uploadfile_inputs=["image"])

HTTP Serving Guide · Migrating from Hand-Written Endpoints · Deployment Overview

Query & Experiment: The Best Path from Prototype to Production

Unlike pandas/polars, Pixeltable persists everything, parallelizes API calls automatically, caches results, and turns your experiment into production with a one-line change. There is no separate notebook → pipeline handoff:

# Explore: filter, sample, apply UDFs ephemerally
results = (
    t.where(t.score > 0.8)
    .order_by(t.timestamp)
    .select(t.image, score=t.score)
    .limit(10)
    .collect()
)

# Sample 5 rows and test a UDF (nothing stored, calls parallelized and cached)
t.sample(5).select(t.text, summary=summarize(t.text)).collect()

# Happy? One line to commit; runs on full dataset, skips already-cached rows
t.add_computed_column(summary=summarize(t.text))

Queries & Expressions · Iterative Workflow · Version Control

Version: Data Persistence & Time Travel

All data is automatically stored and versioned. Query any prior version.

t = pxt.get_table('my_table')  # Get a handle to an existing table
t.revert()  # Undo the last modification

t.history()  # Display all prior versions
old_version = pxt.get_table('my_table:472')  # Query a specific version
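
The idea behind history, revert, and querying an older version can be shown with a toy store that keeps one snapshot per modification. This is a conceptual sketch in plain Python: VersionedRows is a hypothetical class, and full snapshots are a simplification, since how Pixeltable stores versions is not described here.

```python
import copy


class VersionedRows:
    """Keeps a snapshot per modification so earlier states stay queryable (illustration only)."""

    def __init__(self) -> None:
        self._versions: list[list[dict]] = [[]]  # version 0 is the empty table

    @property
    def version(self) -> int:
        return len(self._versions) - 1

    def insert(self, rows: list[dict]) -> None:
        snapshot = copy.deepcopy(self._versions[-1]) + copy.deepcopy(rows)
        self._versions.append(snapshot)

    def revert(self) -> None:
        if self.version == 0:
            raise RuntimeError('nothing to revert')
        self._versions.pop()  # drop the most recent modification

    def at(self, version: int) -> list[dict]:
        return copy.deepcopy(self._versions[version])


store = VersionedRows()
store.insert([{'id': 1}])
store.insert([{'id': 2}])
before_revert = store.at(1)  # read an older version while a newer one exists
store.revert()               # undo the second insert
```

Reading version 1 before and after the revert returns the same rows, which is the property that makes time travel safe.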

Version Control · Data Sharing

Inspect: Local Dashboard

Pixeltable ships with a built-in local dashboard that launches automatically when you start a session. Browse tables, inspect schemas, view media with lightbox navigation, visualize your full data pipeline as a DAG, and track computation errors, all from your browser.

import pixeltable as pxt

# Dashboard launches automatically at http://localhost:22089
pxt.init()

# Disable if needed
pxt.init(config_overrides={'start_dashboard': False})
# Or set environment variable: PIXELTABLE_START_DASHBOARD=false

Highlights: Table browser with sorting & filtering · Media preview (images, video, audio) · Column lineage visualization · Pipeline graph · Per-column error tracking · CSV export · Auto-refresh

No extra dependencies. No setup. It's just there.

Import/Export: I/O & Integration

Import from any source and export to ML formats.

# Import from files, URLs, S3, Hugging Face
t.insert(pxt.io.import_csv('data.csv'))
t.insert(pxt.io.import_huggingface_dataset(dataset))

# Export to analytics/ML formats
pxt.io.export_parquet(table, 'data.parquet')
pytorch_ds = table.to_pytorch_dataset('pt')  # → PyTorch DataLoader ready
coco_path = table.to_coco_dataset()          # → COCO annotations

# ML tool integrations
pxt.create_label_studio_project(table, label_config)  # Annotation
pxt.export_images_as_fo_dataset(table, table.image)   # FiftyOne

Data Import · PyTorch Export · Label Studio · Data Wrangling for ML

Tutorials & Cookbooks

Fundamentals | Cookbooks | Providers | Sample Apps
Colab | Colab | OpenAI | GitHub
Colab | Colab | Anthropic | GitHub
Colab | Colab | Gemini | GitHub
Colab | Colab | Ollama | GitHub
Colab | Colab | DeepSeek | Discord
All → | All → | All providers → | All →

External Storage and Pixeltable Cloud

S3 · GCS · Azure · R2 · B2 · Tigris

Store computed media using the destination parameter on columns, or set defaults globally via PIXELTABLE_OUTPUT_MEDIA_DEST and PIXELTABLE_INPUT_MEDIA_DEST. See Configuration.
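
As a minimal sketch of the global defaults, the two environment variables named above can be set from Python before Pixeltable starts. The bucket URIs are hypothetical, and setting the variables before the import is an assumed ordering; see Configuration for the authoritative behavior.

```python
import os

# Hypothetical bucket URIs; replace with your own storage locations.
os.environ.setdefault('PIXELTABLE_OUTPUT_MEDIA_DEST', 's3://my-bucket/computed/')
os.environ.setdefault('PIXELTABLE_INPUT_MEDIA_DEST', 's3://my-bucket/inputs/')

# Import Pixeltable only after the environment is prepared (assumed ordering):
#     import pixeltable as pxt
```

Using setdefault leaves any value already exported in the shell untouched.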

Data Sharing: Publish datasets to Pixeltable Cloud for team collaboration or public sharing. Replicate public datasets instantly; no account needed for replication.

import pixeltable as pxt

# Replicate a public dataset (no account required)
coco = pxt.replicate(
    remote_uri='pxt://pixeltable:fiftyone/coco_mini_2017',
    local_path='coco-copy'
)

# Publish your own dataset (requires free account)
pxt.publish(source='my-table', destination_uri='pxt://myorg/my-dataset')

# Store computed media in external cloud storage
t.add_computed_column(
    thumbnail=t.image.resize((256, 256)),
    destination='s3://my-bucket/thumbnails/'
)

Data Sharing Guide | Cloud Storage | Public Datasets

Built with Pixeltable

Project | Description
Starter Kit | Production-ready FastAPI + React app with deployment configs for Docker, Helm, Terraform (EKS/GKE/AKS), and AWS CDK
Pixelbot | Multimodal AI agent: an interactive data studio with on-demand ML inference, media generation, and a database explorer
Pixelagent | Lightweight agent framework with built-in memory and tool orchestration
Pixelmemory | Persistent memory layer for AI applications
Skill | AI coding skill for Cursor, Claude Code, Copilot, Windsurf, and other AI IDEs; reduces hallucination and generates accurate Pixeltable code
MCP Server | Model Context Protocol server for Claude, Cursor, and other AI IDEs

Contributing

We love contributions! Whether it's reporting bugs, suggesting features, improving documentation, or submitting code changes, please check out our Contributing Guide and join the Discussions or our Discord Server.

License

Pixeltable is licensed under the Apache 2.0 License.
