Lightweight ColPali-based retrieval for cloud

LitePali

Lightweight Document Retrieval with Vision Language Models

LitePali is a lightweight document retrieval system I created, inspired by the ColPali model and optimized for cloud deployment. It's designed to efficiently process and search through document images using state-of-the-art vision-language models.

🚀 Features

  • 📦 Minimal dependencies
  • 🖼️ Direct image processing without complex PDF parsing
  • 🔄 Deterministic file processing
  • ⚡ Batch processing for multiple files
  • ☁️ Optimized for cloud environments

🧠 Model

LitePali is built on the ColPali architecture, which uses Vision Language Models (VLMs) for efficient document retrieval.

Key features include:

  1. Late Interaction Mechanism: Query and page embeddings are computed independently and only compared at search time, which keeps matching efficient while preserving token-level context.
  2. Multi-Vector Representations: Each page and each query is represented by many vectors rather than one, giving fine-grained representations of text and images.
  3. Visual and Textual Understanding: Processes document images directly, understanding both content and layout.
  4. Efficient Indexing: Indexes a corpus faster than traditional PDF parsing pipelines.

Figure: ColPali architecture

This approach allows LitePali to perform efficient retrieval while capturing complex document structures and content.
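
The late-interaction scoring over multi-vector representations listed above can be sketched in a few lines. This is a minimal pure-Python illustration of the general MaxSim idea used by ColPali-style models, not LitePali's own code: the function names (`maxsim_score`, `rank_pages`) and the toy 2-dimensional embeddings are assumptions made up for this example, and a real system would use tensors on a GPU.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], page_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: for each query vector, take its best-matching
    page vector, then sum those maxima over the whole query."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(
    query_vecs: Sequence[Vector],
    pages: Dict[str, Sequence[Vector]],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Return the top-k (page_name, score) pairs, highest score first."""
    scored = [(name, maxsim_score(query_vecs, vecs)) for name, vecs in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Toy embeddings with made-up numbers, for illustration only.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[1.0, 0.0], [0.0, 1.0]],
    "page_b": [[0.5, 0.0], [0.0, 0.5], [0.5, 0.5]],
}
ranking = rank_pages(query, pages, k=2)
print(ranking)
```

Because page vectors do not depend on the query, they can be computed once at indexing time; only the cheap max-and-sum step runs per search.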

Inspiration

This library is inspired by byaldi, but with several key differences:

  • Focus on images: LitePali works exclusively with images, so PDF-to-image conversion can be handled separately in CPU-only environments.
  • Simplified dependencies: No need for Poppler or other PDF-related dependencies.
  • Updated engine: Utilizes colpali-engine >=0.3.0 for improved performance.
  • Deterministic processing: Implements deterministic file processing for consistent results.
  • Efficient batch processing: Employs batch processing when adding multiple files, enhancing performance.
  • Customized functionality: Tailored for specific needs while building upon the excellent foundation laid by byaldi.

These differences make LitePali a more streamlined and focused tool for image-based document retrieval, offering flexibility in deployment and integration with existing PDF processing pipelines.
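
To make the deterministic and batch processing points above concrete, here is a small caller-side sketch. It does not show how LitePali implements these internally (the source does not say); `deterministic_order` and `batched` are hypothetical helpers, and sorting by path is just one assumed way to get a stable order.

```python
from pathlib import Path
from typing import Iterable, Iterator, List


def deterministic_order(paths: Iterable[str]) -> List[Path]:
    """Sort paths by their POSIX string form so the order does not depend on
    how the filesystem or the caller happened to list them."""
    return sorted((Path(p) for p in paths), key=lambda p: p.as_posix())


def batched(items: List[Path], batch_size: int) -> Iterator[List[Path]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Two listings of the same (hypothetical) files in different orders.
listing_one = ["scans/p3.png", "scans/p1.png", "scans/p2.png"]
listing_two = ["scans/p2.png", "scans/p3.png", "scans/p1.png"]

ordered = deterministic_order(listing_one)
batches = list(batched(ordered, batch_size=2))
for batch in batches:
    print([p.name for p in batch])
```

With a stable order, the same set of input files always yields the same batches, which makes index builds reproducible across runs.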

Installation

Install LitePali using pip:

pip install litepali

Usage

Here's a simple example of how to use LitePali:

from litepali import LitePali, ImageFile

# Initialize LitePali
litepali = LitePali()

# Add some images with metadata and page information
litepali.add(ImageFile(
    path="path/to/image1.jpg",
    document_id=1,
    page_id=1,
    metadata={"title": "Introduction", "author": "John Doe"}
))
litepali.add(ImageFile(
    path="path/to/image2.png",
    document_id=1,
    page_id=2,
    metadata={"title": "Results", "author": "John Doe"}
))
litepali.add(ImageFile(
    path="path/to/image3.jpg",
    document_id=2,
    page_id=1,
    metadata={"title": "Abstract", "author": "Jane Smith"}
))

# Process the added images
litepali.process()

# Perform a search
results = litepali.search("Your query here", k=5)

# Print results
for result in results:
    print(f"Image: {result['image'].path}, Score: {result['score']}")

# Save the index
litepali.save_index("path/to/save/index")

# Later, load the index
new_litepali = LitePali()
new_litepali.load_index("path/to/save/index")

This example demonstrates how to add images, process them, perform a search, and save/load the index.
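
Search results often contain several pages from the same document. The sketch below shows one way to keep only the best hit per document. It relies only on the `result['image']` and `result['score']` keys shown above; reading `document_id` and `page_id` back as attributes is an assumption, so the example uses a stand-in `PageStub` class and hand-written scores rather than the real library.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class PageStub:
    """Stand-in for an indexed page; NOT the real litepali.ImageFile."""
    path: str
    document_id: int
    page_id: int


def best_hit_per_document(results: List[Dict[str, Any]]) -> Dict[int, Dict[str, Any]]:
    """Keep only the highest-scoring hit for each document_id."""
    best: Dict[int, Dict[str, Any]] = {}
    for hit in results:
        doc_id = hit["image"].document_id
        if doc_id not in best or hit["score"] > best[doc_id]["score"]:
            best[doc_id] = hit
    return best


# Hand-written hits shaped like the search results above; scores are made up.
hits = [
    {"image": PageStub("path/to/image1.jpg", 1, 1), "score": 0.42},
    {"image": PageStub("path/to/image2.png", 1, 2), "score": 0.87},
    {"image": PageStub("path/to/image3.jpg", 2, 1), "score": 0.55},
]

top = best_hit_per_document(hits)
for doc_id, hit in sorted(top.items()):
    print(f"Document {doc_id}: page {hit['image'].page_id}, score {hit['score']}")
```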

Why LitePali?

I created LitePali to address the need for a lightweight, efficient document retrieval system that could work directly with images. By leveraging the power of vision-language models like ColPali, LitePali can understand both textual and visual elements in documents, making it ideal for complex document retrieval tasks.

LitePali is designed to be easy to use and deploy in cloud environments, making it a great choice for researchers and developers working on document retrieval systems.

Contributing

Contributions are welcome! Feel free to submit issues or pull requests if you have any improvements or bug fixes.

TODO

Future improvements and features planned for LitePali:

  • Enhanced index storage

    • Implement storage of base64-encoded versions of images within the index.
    • This will allow for quick retrieval and display of images without needing to access the original files.
  • Performance optimizations

    • Test flash-attention integration.
    • Flash attention is expected to significantly speed up processing, especially for large batches of images.
  • Quantization support

    • Add support for lower precision (e.g., int8, int4) to reduce memory footprint and increase inference speed.
  • API enhancements

    • Develop a more comprehensive API for advanced querying and filtering options.
  • Documentation expansion

    • Create more detailed documentation, including advanced usage examples and best practices.
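
As an illustration of the planned base64 image storage, the following sketch round-trips image bytes through a JSON-serializable entry using only the standard library. The entry layout and the `image_b64` field name are hypothetical; the actual index format is not specified here.

```python
import base64
import json


def encode_image_bytes(data: bytes) -> str:
    """Return the bytes as ASCII base64 text, safe to embed in JSON."""
    return base64.b64encode(data).decode("ascii")


def decode_image_bytes(text: str) -> bytes:
    """Invert encode_image_bytes."""
    return base64.b64decode(text.encode("ascii"))


# Placeholder bytes standing in for a real image file's contents.
fake_image = b"\x89PNG\r\n\x1a\n" + bytes(range(16))

# Hypothetical index entry layout; the field names are illustrative only.
entry = {"document_id": 1, "page_id": 1, "image_b64": encode_image_bytes(fake_image)}
serialized = json.dumps(entry)

# Later: recover the original bytes without touching the source file.
restored = decode_image_bytes(json.loads(serialized)["image_b64"])
print(restored == fake_image)
```

Base64 text is roughly one third larger than the raw bytes, so this trades index size for the convenience of self-contained results.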
