Official Python SDK for Cerevox - The Data Layer for AI Agents: data parsing (Lexa) and data search (Hippo)
Cerevox - The Data Layer for AI Agents 🧠 ⚡
Data Parsing (Lexa) • Data Search (Hippo) • Enterprise-grade • Built for AI
AI-powered • Highest Accuracy • Vector DB ready
Official Python SDK for:
- Lexa - Parse documents into structured data
  🎯 Perfect for: RAG applications, document analysis, data extraction, and vector database preparation
- Hippo - Search and query your document collections
  🎯 Perfect for: AI-powered Q&A, semantic search, and drawing insights from document collections
- Account - Enterprise user management and authentication
  🎯 Perfect for: User authentication, account management, and usage tracking
📦 Installation
pip install cerevox
📋 Requirements
- Python 3.9+
- API key from Cerevox
🚀 Lexa Quick Start
Basic Usage
from cerevox import Lexa
# Parse a document
client = Lexa(api_key="your-api-key")
documents = client.parse(["document.pdf"])
print(f"Extracted {len(documents[0].content)} characters")
print(f"Found {len(documents[0].tables)} tables")
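The Run Examples section exports `CEREVOX_API_KEY`, so a script can read the key from the environment instead of hard-coding it. A minimal sketch, assuming the key is passed to the client explicitly; `load_api_key` is a hypothetical helper and not part of the SDK:

```python
import os


def load_api_key(env_var: str = "CEREVOX_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing.

    Hypothetical helper for illustration; the SDK does not ship it.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before creating a client")
    return key


# Usage with the constructor shown in Basic Usage:
#   from cerevox import Lexa
#   client = Lexa(api_key=load_api_key())
```

Failing early with a clear message tends to be easier to debug than an authentication error returned by the first API call.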
Async Processing (Recommended)
import asyncio
from cerevox import AsyncLexa
async def main():
    async with AsyncLexa(api_key="your-api-key") as client:
        documents = await client.parse(["document.pdf", "report.docx"])
        # Get chunks optimized for vector databases
        chunks = documents.get_all_text_chunks(target_size=500)
        print(f"Ready for embedding: {len(chunks)} chunks")

asyncio.run(main())
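When many files are parsed from application code, it can help to cap how many requests are in flight at once. The sketch below uses only the standard `asyncio` library. How `AsyncLexa` schedules work internally is not described here, so `run_bounded` and `fake_parse` are hypothetical stand-ins, not SDK functions:

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_bounded(
    paths: List[str],
    worker: Callable[[str], Awaitable[str]],
    limit: int = 4,
) -> List[str]:
    """Run `worker` over `paths` with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(path: str) -> str:
        async with semaphore:
            return await worker(path)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(guarded(p) for p in paths))


async def fake_parse(path: str) -> str:
    """Stand-in for a real parse call: sleeps briefly and echoes the path."""
    await asyncio.sleep(0.01)
    return f"parsed:{path}"


results = asyncio.run(
    run_bounded(["a.pdf", "b.docx", "c.html"], fake_parse, limit=2)
)
print(results)
```

In real use, `worker` would wrap a call such as `client.parse([path])` inside the `async with AsyncLexa(...)` block shown above.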
🚀 Hippo Getting Started
- Create a folder and upload files
- Start a chat to ask questions about the folder's data
See the guide in hippo-getting-started.md
✨ Features
🚀 Performance & Scale
- 10x Faster than traditional solutions
- Native Async Support with concurrent processing
- Enterprise-grade reliability with automatic retries
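For readers unfamiliar with the term, "automatic retries" usually means repeating a failed call after a growing delay. The SDK's actual retry policy is not documented in this README; the following is a generic sketch of exponential backoff, and `retry` is a hypothetical helper:

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call` up to `attempts` times, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


# Demo: a call that fails twice, then succeeds. Delays are recorded, not slept.
delays: List[float] = []
state = {"calls": 0}


def flaky() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


outcome = retry(flaky, attempts=3, base_delay=0.5, sleep=delays.append)
print(outcome, delays)
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.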
🧠 AI-Powered Extraction
- SOTA Accuracy with cutting-edge ML models
- Advanced Table Extraction preserving structure and formatting
- 12+ File Formats including PDF, DOCX, PPTX, HTML, and more
🔗 Integration Ready
- Vector Database Optimized chunks for RAG applications
- 7+ Cloud Storage integrations (S3, SharePoint, Google Drive, etc.)
- Framework Agnostic: works with Django, Flask, and FastAPI
- Rich Metadata extraction including images, formatting, and structure
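To make "vector database optimized chunks" concrete, the sketch below shows one common chunking strategy: packing paragraphs greedily up to a target size. This is an assumption for illustration only. The README does not say how `get_all_text_chunks(target_size=...)` splits text, and `chunk_text` is a hypothetical function, not the SDK's implementation:

```python
from typing import List


def chunk_text(text: str, target_size: int = 500) -> List[str]:
    """Greedily pack paragraphs into chunks of at most `target_size` characters.

    A single paragraph longer than `target_size` is kept whole, not split.
    """
    chunks: List[str] = []
    current = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= target_size or not current:
            current = candidate
        else:
            # Adding this paragraph would overflow: close the current chunk
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
chunks = chunk_text(sample, target_size=40)
print(chunks)
```

Splitting on paragraph boundaries keeps each chunk readable on its own, which generally helps retrieval quality in RAG pipelines.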
📋 Examples
Explore comprehensive examples in the examples/ directory:
Lexa
| Example | Description |
|---|---|
| lexa_examples.py | Complete SDK functionality demonstration |
| lexa_async_examples.py | Advanced async processing techniques |
| lexa_cloud_integrations.py | Cloud storage service integrations |
Document
| Example | Description |
|---|---|
| document_examples.py | Document analysis and manipulation features |
| document_vector_db_preparation.py | Vector database chunking and integration patterns |
🚀 Run Examples
# Clone and explore
git clone https://github.com/CerevoxAI/cerevox-python.git
cd cerevox-python
export CEREVOX_API_KEY="your-api-key"
# Run demos
python examples/lexa_examples.py # Basic usage
python examples/lexa_async_examples.py # Async features
python examples/lexa_cloud_integrations.py # Cloud Integrations Coming Soon!
python examples/document_examples.py # Document analysis
python examples/document_vector_db_preparation.py # Vector DB integration
📚 Documentation
📖 API References
- API Reference - Complete API documentation
📖 Guides & Tutorials
- Vector Database Integration - RAG and vector DB setup
- Advanced Examples - Real-world usage patterns
- Migration Guide - Migrate from other tools
🔗 External Resources
- Full Documentation - Comprehensive guides
- Interactive API Docs - Try the API
- Discord Community - Get help and discuss
🤝 Contributing
We welcome contributions! Please see our Contributing Guide for details.
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🆘 Support & Community
| 📖 Resources | 💬 Get Help | 🐛 Issues |
⭐ Star us on GitHub if Cerevox helped your project!
Made with ❤️ by the Cerevox team
Happy Building! 🔍 🦛 ✨
File details
Details for the file cerevox-0.2.0.tar.gz.
File metadata
- Download URL: cerevox-0.2.0.tar.gz
- Upload date:
- Size: 98.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ce92c0ed3d6d59a68a97c9ecc2fac0e4af8114bcfff0daccfafea576b97c6d50 |
| MD5 | f56bb9c5141eac243bd7a3aa7fc40c0d |
| BLAKE2b-256 | 93656186b3628998b4c53ba9939dcfdaa67994f54fa034423efab111fbf9e19e |
File details
Details for the file cerevox-0.2.0-py3-none-any.whl.
File metadata
- Download URL: cerevox-0.2.0-py3-none-any.whl
- Upload date:
- Size: 110.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9a1896ac4a2ebbb520599bdd7d5dd287ed7ad72a3b58077c2f59db7d8f4f8d8f |
| MD5 | c266f631e51f09f9bd085ffdda742553 |
| BLAKE2b-256 | 15733ee27df3a15707e8116480deffc7903c02457c55faee2794d5667019daee |
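The published digests can be checked against a downloaded file with Python's standard `hashlib` module. A minimal sketch, assuming the file has already been downloaded locally; `file_digests` and `matches` are hypothetical helper names:

```python
import hashlib
from pathlib import Path
from typing import Dict


def file_digests(path: str) -> Dict[str, str]:
    """Return SHA256, MD5, and BLAKE2b-256 hex digests for the file at `path`."""
    data = Path(path).read_bytes()
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "md5": hashlib.md5(data).hexdigest(),
        # BLAKE2b-256 is BLAKE2b with a 32-byte (256-bit) digest
        "blake2b_256": hashlib.blake2b(data, digest_size=32).hexdigest(),
    }


def matches(path: str, expected_sha256: str) -> bool:
    """Compare the file's SHA256 digest with a published value."""
    return file_digests(path)["sha256"] == expected_sha256.strip().lower()


# Usage, with the SHA256 value listed for the wheel above:
#   matches(
#       "cerevox-0.2.0-py3-none-any.whl",
#       "9a1896ac4a2ebbb520599bdd7d5dd287ed7ad72a3b58077c2f59db7d8f4f8d8f",
#   )
```

SHA256 is the digest to rely on for integrity checks; MD5 is listed for legacy tooling and is not collision resistant.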