A lightweight, vision-based document question-answering system

These details have not been verified by PyPI

Project description

DocPixie

A lightweight multimodal RAG (Retrieval-Augmented Generation) library that uses vision AI instead of traditional embeddings or vector databases. DocPixie processes documents as images and uses vision language models for both document understanding and intelligent page selection.

🌟 Features

Vision-First Approach: Documents processed as images using PyMuPDF, preserving visual information and formatting
No Vector Database Required: Eliminates the complexity of embeddings and vector storage
Adaptive RAG Agent: Single intelligent agent that dynamically plans tasks and selects relevant pages
Multi-Provider Support: Works with OpenAI GPT-4V, Anthropic Claude, and OpenRouter
Modern CLI Interface: Beautiful terminal UI built with Textual
Conversation Aware: Maintains context across multiple queries
Pluggable Storage: Local filesystem or in-memory storage backends
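To make the pluggable storage idea concrete, here is a minimal sketch of how two interchangeable backends might look. The `PageStore` interface, its method names, and the `page_NNNN.jpg` file layout are all assumptions made for illustration; they are not taken from DocPixie's actual API.

```python
from pathlib import Path
from typing import Dict, List, Optional, Protocol


class PageStore(Protocol):
    """Hypothetical storage interface; not DocPixie's real API."""

    def save_page(self, doc_id: str, page_number: int, image: bytes) -> None: ...
    def load_page(self, doc_id: str, page_number: int) -> Optional[bytes]: ...
    def list_pages(self, doc_id: str) -> List[int]: ...


class InMemoryPageStore:
    """Keeps page images in a dict; contents are lost when the process exits."""

    def __init__(self) -> None:
        self._pages: Dict[str, Dict[int, bytes]] = {}

    def save_page(self, doc_id: str, page_number: int, image: bytes) -> None:
        self._pages.setdefault(doc_id, {})[page_number] = image

    def load_page(self, doc_id: str, page_number: int) -> Optional[bytes]:
        return self._pages.get(doc_id, {}).get(page_number)

    def list_pages(self, doc_id: str) -> List[int]:
        return sorted(self._pages.get(doc_id, {}))


class LocalPageStore:
    """Writes each page image to <root>/<doc_id>/page_NNNN.jpg (assumed layout)."""

    def __init__(self, root: Path) -> None:
        self._root = Path(root)

    def _path(self, doc_id: str, page_number: int) -> Path:
        return self._root / doc_id / f"page_{page_number:04d}.jpg"

    def save_page(self, doc_id: str, page_number: int, image: bytes) -> None:
        path = self._path(doc_id, page_number)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(image)

    def load_page(self, doc_id: str, page_number: int) -> Optional[bytes]:
        path = self._path(doc_id, page_number)
        return path.read_bytes() if path.exists() else None

    def list_pages(self, doc_id: str) -> List[int]:
        folder = self._root / doc_id
        if not folder.is_dir():
            return []
        # File stems look like "page_0003"; recover the integer page number.
        return sorted(int(p.stem.split("_")[1]) for p in folder.glob("page_*.jpg"))
```

Because both classes satisfy the same protocol, calling code can switch between them without other changes.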

🚀 Quick Start

Installation

# Clone the repository and enter it
git clone https://github.com/qnguyen3/docpixie.git
cd docpixie

# Install dependencies
pip install -r requirements.txt

# Or use uv (recommended)
uv pip install -r requirements.txt

Basic Usage

import asyncio
from docpixie import DocPixie

async def main():
    # Initialize; the API key is read from the environment (see Configuration)
    docpixie = DocPixie()

    # Add a document
    document = await docpixie.add_document("path/to/your/document.pdf")
    print(f"Added document: {document.name}")

    # Query the document
    result = await docpixie.query("What are the key findings?")
    print(f"Answer: {result.answer}")
    print(f"Pages used: {result.page_numbers}")

# Run the example
asyncio.run(main())

Using the CLI

Start the interactive terminal interface:

python -m docpixie.cli

The CLI provides:

Interactive document chat
Document management
Conversation history
Model configuration
Command palette with shortcuts

🛠️ Configuration

DocPixie uses environment variables for API key configuration:

# For OpenAI (default)
export OPENAI_API_KEY="your-openai-key"

# For Anthropic Claude
export ANTHROPIC_API_KEY="your-anthropic-key"

# For OpenRouter (supports many models)
export OPENROUTER_API_KEY="your-openrouter-key"

You can also specify the provider and models in code:

from docpixie import DocPixie, DocPixieConfig

config = DocPixieConfig(
    provider="anthropic",  # or "openai", "openrouter"
    model="claude-3-opus-20240229",
    vision_model="claude-3-opus-20240229"
)

docpixie = DocPixie(config=config)
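Before constructing a client, it can help to check which provider key is actually present. The helper below is a hypothetical pre-flight check, not part of DocPixie. The variable names come from the list above, while the lookup order and the function name `detect_provider` are assumptions.

```python
import os
from typing import Mapping, Optional

# Variable names are from the configuration section; the order is an assumption.
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def detect_provider(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first provider whose API key is set and non-empty, else None."""
    for provider, variable in PROVIDER_KEYS.items():
        if env.get(variable, "").strip():
            return provider
    return None


if __name__ == "__main__":
    found = detect_provider()
    print(found or "No provider API key found in the environment")
```

The returned name could then be passed as `provider=` to `DocPixieConfig`, as in the snippet above.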

📚 Supported File Types

PDF files (.pdf) - Full multipage support
More file types coming soon

🏗️ Architecture

DocPixie uses a clean, modular architecture:

📁 Core Components
├── 🧠 Adaptive RAG Agent - Dynamic task planning and execution
├── 👁️  Vision Processing - Document-to-image conversion via PyMuPDF
├── 🔌 Provider System - Unified interface for AI providers
├── 💾 Storage Backends - Local filesystem or in-memory storage
└── 🖥️  CLI Interface - Modern terminal UI with Textual

📁 Processing Flow
1. Document → Images (PyMuPDF)
2. Vision-based summarization
3. Adaptive query processing
4. Intelligent page selection
5. Response synthesis
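
The processing flow above can be sketched as a small pipeline. This is a toy illustration only: the vision-model calls are replaced by stub functions so that it runs offline, steps 3 and 4 are collapsed into a single `select` callable, and every name here (`Page`, `answer_query`, the `fake_*` stubs) is hypothetical rather than DocPixie's real code.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Page:
    number: int
    image: bytes        # step 1 output: the rendered page image
    summary: str = ""   # step 2 output: filled in by the (stubbed) vision model


def answer_query(
    pages: List[Page],
    query: str,
    summarize: Callable[[bytes], str],
    select: Callable[[str, Sequence[Tuple[int, str]]], List[int]],
    synthesize: Callable[[str, List[bytes]], str],
) -> Tuple[str, List[int]]:
    # Step 2: summarize each page once and cache the result on the page.
    for page in pages:
        if not page.summary:
            page.summary = summarize(page.image)
    # Steps 3-4: choose page numbers from the summaries, with no embeddings involved.
    chosen = set(select(query, [(p.number, p.summary) for p in pages]))
    selected = [p for p in pages if p.number in chosen]
    # Step 5: answer using only the selected page images.
    answer = synthesize(query, [p.image for p in selected])
    return answer, [p.number for p in selected]


# Offline stand-ins for the model calls (assumptions, for demonstration only).
def fake_summarize(image: bytes) -> str:
    return image.decode()


def fake_select(query: str, summaries: Sequence[Tuple[int, str]]) -> List[int]:
    words = set(query.lower().split())
    return [n for n, s in summaries if words & set(s.lower().split())]


def fake_synthesize(query: str, images: List[bytes]) -> str:
    return f"answered from {len(images)} page(s)"


pages = [Page(1, b"revenue table"), Page(2, b"team photo"), Page(3, b"revenue chart")]
answer, used = answer_query(pages, "revenue growth", fake_summarize, fake_select, fake_synthesize)
print(answer, used)  # answered from 2 page(s) [1, 3]
```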

Key Design Principles

Provider-Agnostic: Generic model configuration works across all providers
Image-Based Processing: All documents converted to images, preserving visual context
Business Logic Separation: Raw API operations separate from workflow logic
Adaptive Intelligence: Single agent mode that dynamically adjusts based on findings
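
One way to picture the provider-agnostic and business-logic-separation principles is a thin provider interface with workflow functions layered on top. The sketch below assumes a hypothetical `VisionProvider` base class and an offline `EchoProvider`. DocPixie's real provider classes may look quite different.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class VisionProvider(ABC):
    """Hypothetical raw-API layer: one call, no workflow logic."""

    def __init__(self, model: str) -> None:
        self.model = model

    @abstractmethod
    def complete(self, prompt: str, images: List[bytes]) -> str:
        """Send a prompt plus page images to the model and return its text reply."""


class EchoProvider(VisionProvider):
    """Offline stand-in for a real provider client."""

    def complete(self, prompt: str, images: List[bytes]) -> str:
        return f"[{self.model}] {prompt} ({len(images)} image(s))"


# A real registry would map "openai", "anthropic", and "openrouter" to client classes.
REGISTRY: Dict[str, Type[VisionProvider]] = {"echo": EchoProvider}


def create_provider(name: str, model: str) -> VisionProvider:
    """Look up a provider class by name and instantiate it with a model string."""
    if name not in REGISTRY:
        raise ValueError(f"unknown provider {name!r}")
    return REGISTRY[name](model)


def describe_page(provider: VisionProvider, image: bytes) -> str:
    """Workflow logic: depends only on the VisionProvider interface."""
    return provider.complete("Summarize this page.", [image])


print(describe_page(create_provider("echo", "demo-model"), b"fake-image-bytes"))
```

Keeping `describe_page` unaware of which provider it talks to is what lets the same model configuration shape work across backends.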

🎯 Use Cases

Research & Analysis: Query academic papers, reports, and research documents
Document Q&A: Interactive questioning of PDFs, contracts, and manuals
Content Discovery: Find specific information across large document collections
Visual Document Processing: Handle documents with charts, diagrams, and complex layouts

🔧 Development

Setup Development Environment

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # or `.venv\Scripts\activate` on Windows

# Install dependencies
pip install -r requirements.txt

# Run tests
python -m pytest tests/ -v

🌍 Environment Variables

Variable	Description	Default
`OPENAI_API_KEY`	OpenAI API key	None
`ANTHROPIC_API_KEY`	Anthropic API key	None
`OPENROUTER_API_KEY`	OpenRouter API key	None
`DOCPIXIE_PROVIDER`	AI provider	`openai`
`DOCPIXIE_STORAGE_PATH`	Storage directory	`./docpixie_data`
`DOCPIXIE_JPEG_QUALITY`	Image quality (1-100)	`90`
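
As an illustration of how the `DOCPIXIE_*` variables and their defaults might be resolved, here is a small hypothetical loader. The `EnvSettings` dataclass and `load_settings` function are assumptions for this sketch and are not part of DocPixie. Only the variable names, defaults, and the 1-100 quality range come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping

KNOWN_PROVIDERS = {"openai", "anthropic", "openrouter"}


@dataclass(frozen=True)
class EnvSettings:
    provider: str
    storage_path: str
    jpeg_quality: int


def load_settings(env: Mapping[str, str] = os.environ) -> EnvSettings:
    """Read DOCPIXIE_* variables, applying the documented defaults."""
    provider = env.get("DOCPIXIE_PROVIDER", "openai").lower()
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"DOCPIXIE_PROVIDER must be one of {sorted(KNOWN_PROVIDERS)}")

    raw_quality = env.get("DOCPIXIE_JPEG_QUALITY", "90")
    try:
        quality = int(raw_quality)
    except ValueError:
        raise ValueError(
            f"DOCPIXIE_JPEG_QUALITY must be an integer, got {raw_quality!r}"
        ) from None
    if not 1 <= quality <= 100:
        raise ValueError("DOCPIXIE_JPEG_QUALITY must be between 1 and 100")

    storage_path = env.get("DOCPIXIE_STORAGE_PATH", "./docpixie_data")
    return EnvSettings(provider, storage_path, quality)


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests.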

📖 Documentation

Getting Started Guide - Detailed examples and tutorials
CLI Tool Guide - Complete CLI documentation

🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Built with PyMuPDF for PDF processing
CLI powered by Textual
Supports OpenAI, Anthropic, and OpenRouter APIs

Project details

This version

0.1.0

Sep 3, 2025

Download files

Source Distribution

docpixie-0.1.0.tar.gz (88.0 kB)

Uploaded Sep 3, 2025 Source

Built Distribution

docpixie-0.1.0-py3-none-any.whl (106.1 kB)

Uploaded Sep 3, 2025 Python 3

File details

Details for the file docpixie-0.1.0.tar.gz.

File metadata

Download URL: docpixie-0.1.0.tar.gz
Upload date: Sep 3, 2025
Size: 88.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.17

File hashes

Hashes for docpixie-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`e8305e00c7590e613a117d525a2777a60b72e700898922e8a7811b6abbea42e4`
MD5	`cebf7a66712bf487118dbf4cc45e4e90`
BLAKE2b-256	`9cbe364cf20b3b2ea7309bb12733a68bc816b4f15ffab018fec53035308c6ba6`

File details

Details for the file docpixie-0.1.0-py3-none-any.whl.

File metadata

Download URL: docpixie-0.1.0-py3-none-any.whl
Upload date: Sep 3, 2025
Size: 106.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.17

File hashes

Hashes for docpixie-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`371fa471169a2e5703b64e52bdebb0bee0af63f9f8c865987e54f955862a1be2`
MD5	`e590e74d0dd7f6617e13cbfaf4207781`
BLAKE2b-256	`8be1599cb3d0dc7fcd2231f62eccd107d521975d6e1da66abea87f5d95adbbf4`
