Official Python SDK for Cerevox - The Data Layer for AI Agents: data parsing (Lexa) and data search (Hippo)

Cerevox - The Data Layer for AI Agents 🧠 ⚡

Data Parsing (Lexa) • Data Search (Hippo) • Enterprise-grade • Built for AI
AI-powered • Highest Accuracy • Vector DB ready

Official Python SDK for:

  • Lexa - Parse documents into structured data
    • 🎯 Perfect for: RAG applications, document analysis, data extraction, and vector database preparation

  • Hippo - Search and query your document collections
    • 🎯 Perfect for: AI-powered Q&A, semantic search, and drawing insights from document collections

  • Account - Enterprise user management and authentication
    • 🎯 Perfect for: User authentication, account management, and usage tracking

📦 Installation

pip install cerevox

📋 Requirements

🚀 Lexa Quick Start

Basic Usage

from cerevox import Lexa

# Parse a document
client = Lexa(api_key="your-api-key")
documents = client.parse(["document.pdf"])

print(f"Extracted {len(documents[0].content)} characters")
print(f"Found {len(documents[0].tables)} tables")

Async Processing (Recommended)

import asyncio
from cerevox import AsyncLexa

async def main():
    async with AsyncLexa(api_key="your-api-key") as client:
        documents = await client.parse(["document.pdf", "report.docx"])
        
        # Get chunks optimized for vector databases
        chunks = documents.get_all_text_chunks(target_size=500)
        print(f"Ready for embedding: {len(chunks)} chunks")

asyncio.run(main())
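
The chunks from the async example are typically grouped into batches before they are sent to an embedding model. Below is a minimal sketch of that step, assuming get_all_text_chunks() returns a list of plain strings (the return type is not stated here). The helpers batch_chunks and build_records are hypothetical names for illustration and are not part of the Cerevox SDK.

```python
from typing import Dict, Iterator, List


def batch_chunks(chunks: List[str], batch_size: int = 64) -> Iterator[List[str]]:
    """Yield successive fixed-size batches of text chunks (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(chunks), batch_size):
        yield chunks[start:start + batch_size]


def build_records(chunks: List[str], source: str) -> List[Dict[str, str]]:
    """Attach a stable id and a source label to each chunk (hypothetical helper)."""
    return [
        {"id": f"{source}-{index}", "text": text, "source": source}
        for index, text in enumerate(chunks)
    ]


# Stand-in data; in practice this list would come from the parsed documents.
sample_chunks = ["First passage.", "Second passage.", "Third passage."]

records = build_records(sample_chunks, source="document.pdf")
batches = list(batch_chunks([record["text"] for record in records], batch_size=2))

print(f"{len(records)} records in {len(batches)} batches")
```

Keeping an id and source on every record makes it easier to trace a search hit back to the document it came from once the vectors are stored.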

🚀 Hippo Getting Started

  • Create Folder and Upload Files
  • Start Chat to Ask Questions on the Folder Data

See the guide: hippo-getting-started.md

✨ Features

🚀 Performance & Scale

  • 10x Faster than traditional solutions
  • Native Async Support with concurrent processing
  • Enterprise-grade reliability with automatic retries
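
To illustrate what concurrent processing with automatic retries means in general, here is a small standard-library sketch. It is not the SDK's internal implementation: fake_parse stands in for a network call, and with_retries and process_all are hypothetical names chosen for this example.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def with_retries(
    task: Callable[[], Awaitable[str]],
    attempts: int = 3,
    base_delay: float = 0.01,
) -> str:
    """Await task(), retrying with exponential backoff when it raises."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Exception = RuntimeError("task was never run")
    for attempt in range(attempts):
        try:
            return await task()
        except Exception as error:
            last_error = error
            if attempt < attempts - 1:
                # Delay doubles after each failed attempt.
                await asyncio.sleep(base_delay * (2 ** attempt))
    raise last_error


async def process_all(names: List[str], limit: int = 2) -> List[str]:
    """Process all names concurrently, with at most `limit` jobs in flight."""
    semaphore = asyncio.Semaphore(limit)
    # Each simulated job fails exactly once before succeeding.
    remaining_failures: Dict[str, int] = {name: 1 for name in names}

    async def fake_parse(name: str) -> str:
        if remaining_failures[name] > 0:
            remaining_failures[name] -= 1
            raise ConnectionError(f"transient failure for {name}")
        return f"parsed:{name}"

    async def guarded(name: str) -> str:
        async with semaphore:
            return await with_retries(lambda: fake_parse(name))

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(guarded(name) for name in names))


results = asyncio.run(process_all(["a.pdf", "b.docx", "c.html"]))
print(results)
```

The semaphore caps how many jobs run at once, while the backoff spaces out repeated attempts so a briefly unavailable service is not hit again immediately.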

🧠 AI-Powered Extraction

  • SOTA Accuracy with cutting-edge ML models
  • Advanced Table Extraction preserving structure and formatting
  • 12+ File Formats including PDF, DOCX, PPTX, HTML, and more

🔗 Integration Ready

  • Vector Database Optimized chunks for RAG applications
  • 7+ Cloud Storage integrations (S3, SharePoint, Google Drive, etc.)
  • Framework Agnostic: works with Django, Flask, and FastAPI
  • Rich Metadata extraction including images, formatting, and structure

📋 Examples

Explore comprehensive examples in the examples/ directory:

Lexa

Example                      Description
lexa_examples.py             Complete SDK functionality demonstration
lexa_async_examples.py       Advanced async processing techniques
lexa_cloud_integrations.py   Cloud storage service integrations

Document

Example                             Description
document_examples.py                Document analysis and manipulation features
document_vector_db_preparation.py   Vector database chunking and integration patterns

🚀 Run Examples

# Clone and explore
git clone https://github.com/CerevoxAI/cerevox-python.git
cd cerevox-python

export CEREVOX_API_KEY="your-api-key"

# Run demos
python examples/lexa_examples.py            # Basic usage
python examples/lexa_async_examples.py      # Async features
python examples/lexa_cloud_integrations.py  # Cloud Integrations Coming Soon!

python examples/document_examples.py               # Document analysis
python examples/document_vector_db_preparation.py  # Vector DB integration
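
In your own scripts, the CEREVOX_API_KEY variable exported above can be read explicitly and passed to the client. The sketch below assumes nothing about whether the SDK reads this variable on its own; load_api_key is a hypothetical helper, not part of the Cerevox package.

```python
import os


def load_api_key(env_var: str = "CEREVOX_API_KEY") -> str:
    """Return the API key from the environment, failing early if it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return key


# Typical use, following the Basic Usage example:
#   from cerevox import Lexa
#   client = Lexa(api_key=load_api_key())
```

Failing early with a clear message avoids a less obvious authentication error later in the run.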

📚 Documentation

📖 API References

📖 Guides & Tutorials

🔗 External Resources

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support & Community

📖 Resources

💬 Get Help

🐛 Issues


⭐ Star us on GitHub if Cerevox helped your project!
Made with ❤️ by the Cerevox team
Happy Building! 🔍 🦛 ✨
