Pixeltable: The Multimodal AI Data Plane

Pixeltable

Unifying Data, Models, and Orchestration for AI Products

Installation | Documentation | API Reference | Code Samples | Examples

Pixeltable is a Python library that lets AI engineers and data scientists focus on exploration, modeling, and app development without dealing with the customary data plumbing.

What problems does Pixeltable solve?

Today’s solutions for AI app development require extensive custom coding and infrastructure plumbing. Tracking lineage and versions across data transformations, models, and deployments is cumbersome. With Pixeltable you can store, transform, index, and iterate on your data within the same table interface, whether it's text, images, embeddings, or even video. Built-in lineage and versioning ensure transparency and reproducibility, while the development-to-production mirror streamlines deployment.

💾 Installation

%pip install pixeltable

To verify that it's working:

import pixeltable as pxt
pxt.init()
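
If the import above fails, it can help to first confirm that the package is visible to the interpreter you are running. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and is not part of Pixeltable.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; not part of Pixeltable.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("pixeltable")
    if version is None:
        print("pixeltable is not installed in this environment")
    else:
        print(f"pixeltable {version} is installed")
```

In a notebook, a missing package here usually means the install ran against a different kernel or virtual environment than the one executing the import.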

Note: Check out the Pixeltable Basics tutorial for a tour of its most important features.

💡 Get Started

Learn how to create tables, populate them with data, and enhance them with built-in or user-defined transformations and AI operations.

Topic | Notebook | API
Get Started | Open In Colab | API
User-Defined Functions (UDFs) | Open In Colab | API
Comparing Object Detection Models | Open In Colab | API
Experimenting with Chunking (RAG) | Open In Colab | API
Working with External Files | Open In Colab | API

❓ FAQ

What does Pixeltable provide me with?

Pixeltable provides:

  • Data storage and versioning
  • Combined Data and Model Lineage
  • Indexing (e.g. embedding vectors) and Data Retrieval
  • Orchestration of multimodal workloads
  • Incremental updates
  • Automatically production-ready code

Why should you use Pixeltable?

  • It gives you transparency and reproducibility
    • All generated data is automatically recorded and versioned
    • You will never need to re-run a workload because you lost track of the input data
  • It saves you money
    • All data changes are automatically incremental
    • You never need to re-run pipelines from scratch because you’re adding data
  • It integrates with any existing Python code or libraries
    • Bring your ever-changing code and workloads
    • You choose the models, tools, and AI practices (e.g., your embedding model for a vector index); Pixeltable orchestrates the data
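
To make the idea of incremental updates concrete, here is a toy, pure-Python sketch of a table with computed columns. It is an illustration of the concept only, not Pixeltable's API or implementation: the class `ToyTable` and its methods are hypothetical names invented for this example. The point it shows is that inserting new rows evaluates the computed column for those rows alone, rather than re-running the function over the whole table.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


class ToyTable:
    """A toy in-memory table with computed columns (illustration only)."""

    def __init__(self) -> None:
        self.rows: List[Row] = []
        self.computed: Dict[str, Callable[[Row], Any]] = {}
        self.calls = 0  # how many times any computed-column function has run

    def add_computed_column(self, name: str, fn: Callable[[Row], Any]) -> None:
        """Register a computed column and backfill it for the existing rows."""
        self.computed[name] = fn
        for row in self.rows:
            row[name] = self._evaluate(fn, row)

    def insert(self, new_rows: List[Row]) -> None:
        """Append rows, evaluating computed columns for the new rows only."""
        for values in new_rows:
            row = dict(values)
            for name, fn in self.computed.items():
                row[name] = self._evaluate(fn, row)
            self.rows.append(row)

    def _evaluate(self, fn: Callable[[Row], Any], row: Row) -> Any:
        self.calls += 1
        return fn(row)


table = ToyTable()
table.insert([{"text": "a cat"}, {"text": "two dogs"}])

# Adding the column backfills the two existing rows (two function calls).
table.add_computed_column("n_words", lambda r: len(r["text"].split()))
calls_after_backfill = table.calls

# Inserting one more row triggers exactly one additional call.
table.insert([{"text": "three small birds"}])
calls_after_insert = table.calls
```

In this sketch the cost of adding data grows with the number of new rows, not with the size of the table, which is the property the bullets above describe.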

What does Pixeltable not provide?

  • Pixeltable is not a low-code, prescriptive AI solution. We empower you to use the best frameworks and techniques for your specific needs.
  • We do not aim to replace your existing AI toolkit, but rather enhance it by streamlining the underlying data infrastructure and orchestration.

Tip: Check out the Integrations section, and feel free to submit a request for additional ones.

📙 Example Use Cases

  • Interact with video data at the frame level without having to think about frame extraction, intermediate file storage, or storage space explosion.
  • Augment your data incrementally and interactively with built-in functions and UDFs, such as image transformations, model inference, and visualizations, without having to think about data pipelines, incremental updates, or capturing function output.
  • Interact with all the data relevant to your AI application (video, images, documents, audio, structured data, JSON) through a simple dataframe-style API directly in Python. This includes:
    • similarity search on embeddings, supported by high-dimensional vector indexing;
    • path expressions and transformations on JSON data;
    • PIL and OpenCV image operations;
    • assembling frames into videos.
  • Perform keyword and image similarity search at the video frame level without having to worry about frame storage.
  • Access all Pixeltable-resident data directly as a PyTorch dataset in your training scripts.
  • Understand the compute and storage costs of your data at the granularity of individual augmentations and get cost projections before adding new data and new augmentations.
  • Rely on Pixeltable's automatic versioning and snapshot functionality to protect against regressions and to ensure reproducibility.
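
As a rough illustration of what similarity search on embeddings means, the sketch below ranks stored vectors by cosine similarity to a query vector. It is a brute-force scan in plain Python, not the high-dimensional vector index mentioned above and not Pixeltable's API; the function names and the tiny two-dimensional vectors are assumptions chosen for readability.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    embeddings: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, float]]:
    """Return (row index, similarity) for the k vectors most similar to `query`."""
    scored = [(i, cosine_similarity(query, e)) for i, e in enumerate(embeddings)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Three stored embeddings; the query points mostly along the first axis.
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
results = top_k([1.0, 0.1], embeddings, k=2)
```

A scan like this touches every stored vector on each query, which is why a vector index becomes important once the collection grows large.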

🐛 Contributions & Feedback

Are you experiencing issues or bugs with Pixeltable? File an Issue.
Do you want to contribute? Feel free to open a PR.

🏛️ License

This library is licensed under the Apache 2.0 License.
