AI-powered PDF parsing and retrieval with multiple subscription tiers for Python
ServifAI 🚀
TO BE RELEASED SOON
AI-Powered PDF Parsing and Retrieval with Multiple Subscription Tiers
ServifAI is a powerful Python library that transforms PDF documents into intelligent, searchable knowledge bases using specialized AI engines optimized for different use cases.
✨ Features
- 🚀 Three Subscription Tiers:
  - QUICKEST: Fastest processing
  - BALANCED: Optimal speed/accuracy balance
  - SECURED: Enterprise security
- 🔍 Advanced PDF Processing: Extracts text, images, tables with AI
- 🧠 Intelligent Search: Semantic search with citations and assets
- 🤖 LLM-Ready: Formatted outputs for seamless AI integration
- ☁️ Cloud-Powered: API-based architecture for scalability
- 🔒 Enterprise Security: SOC2 compliant with audit logs
🚀 Quick Start
Installation
# Install with uv (recommended)
uv add servifai
# Or with pip
pip install servifai
Setup
Create a .env file:
SERVIFAI_API_KEY=sai_your_api_key_here
Get your API key from: https://servifai.syntheialabs.ai
Basic Usage
from servifai import ServifAI
# Initialize client
client = ServifAI()
# Create a session (optional but recommended for multiple documents)
session_id = client.create_session()
# Process PDFs
result = client.process_pdfs(
    ["document.pdf", "report.pdf"],
    session_id=session_id
)
print(f"Processed {result.total_pages} pages")
print(f"Extracted {result.total_images} images, {result.total_tables} tables")
# Search with AI
search_result = client.search(
    "financial metrics",
    session_id=session_id,
    top_k=5,
    include_assets=True
)
# Get LLM-ready context
context = client.get_context_for_llm(search_result)
citations = client.get_citations(search_result)
# Use with any LLM
llm_prompt = f"""
Based on this context: {context}
Question: What are the key financial metrics?
Answer with citations: {citations}
"""
🎯 Subscription Tiers
Tier | Best For | Speed | Accuracy | Security |
---|---|---|---|---|
QUICKEST | Rapid prototyping | ⚡⚡⚡ | ⭐⭐⭐ | ⭐ |
BALANCED | Production apps | ⚡⚡ | ⭐⭐⭐ | ⭐⭐ |
SECURED | Enterprise | ⚡ | ⭐⭐ | ⭐⭐⭐ |
🤖 LLM Integration Examples
OpenAI Integration
from openai import OpenAI
from servifai import ServifAI
client = ServifAI()
# Process document(s)
session_id = client.create_session()
client.process_pdfs(["document.pdf"], session_id=session_id)
# Search and get context
search_result = client.search("key insights", session_id=session_id)
context = client.get_context_for_llm(search_result)
# Use with OpenAI (openai>=1.0 client; reads OPENAI_API_KEY from the environment)
openai_client = OpenAI()
response = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": f"Context: {context}\n\nQuestion: What are the key insights?"}]
)
print(response.choices[0].message.content)
Anthropic Claude
import anthropic
from servifai import ServifAI
client = ServifAI()
# Process document and search
session_id = client.create_session()
client.process_pdfs(["doc.pdf"], session_id=session_id)
search_result = client.search("analysis", session_id=session_id)
context = client.get_context_for_llm(search_result)
# Use with Anthropic Claude (max_tokens is required by the Messages API)
claude = anthropic.Anthropic()
response = claude.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=1024,
    messages=[{"role": "user", "content": f"Analyze: {context}"}]
)
Local LLMs (Ollama)
import ollama
from servifai import ServifAI
client = ServifAI()
# Create session for document processing
session_id = client.create_session()
client.process_pdfs(["doc.pdf"], session_id=session_id)
# Search with limited context for local LLMs
search_result = client.search("summary", session_id=session_id, top_k=3)
context = client.get_context_for_llm(search_result, max_length=4000)
# Use with Ollama
response = ollama.chat(
    model='llama2',
    messages=[{'role': 'user', 'content': f"Summarize: {context}"}]
)
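The integration examples above pass only the context to the model. If you also want the answer to point back at its sources, one option is to number the citation strings and ask the model to cite by number. The helper below is a minimal sketch, not part of ServifAI: `build_prompt` is a hypothetical name, and it assumes `citations` is the list of strings returned by `get_citations`.

```python
def build_prompt(question, context, citations):
    """Assemble a prompt with numbered sources so the model can cite them as [n].

    Hypothetical helper: `context` is a string and `citations` is a list of
    strings, as described for get_context_for_llm and get_citations.
    """
    sources = "\n".join(f"[{i}] {text}" for i, text in enumerate(citations, start=1))
    return (
        "Answer the question using only the context below.\n"
        "Cite sources by their bracketed number.\n\n"
        f"Context:\n{context}\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\n"
    )
```

The returned string can be used as the `content` of the user message in any of the three examples.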
📊 Advanced Features
Session Management
# Create persistent session
session_id = client.create_session("my-analysis-2024")
# Process multiple batches
batch1 = client.process_pdfs(["q1-report.pdf"], session_id=session_id)
batch2 = client.process_pdfs(["q2-report.pdf"], session_id=session_id)
# Search across all documents in session
results = client.search("revenue growth", session_id=session_id)
# When done with the session, clean it up (optional)
# client.cleanup_session(session_id)
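For a larger set of files, the same pattern can be wrapped in a loop that submits fixed-size batches to one session. This is a sketch under assumptions: `batched` and `process_in_batches` are hypothetical helpers, the batch size of 10 is arbitrary, and the only library call used is `process_pdfs(pdf_files, session_id=...)` as documented in the API reference.

```python
def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_in_batches(client, pdf_paths, session_id, batch_size=10):
    """Send PDFs to one session in batches and collect each batch result.

    `client` is any object exposing process_pdfs(pdf_files, session_id=...).
    """
    results = []
    for batch in batched(list(pdf_paths), batch_size):
        results.append(client.process_pdfs(batch, session_id=session_id))
    return results
```

Because every batch shares the same `session_id`, a later `client.search(...)` call with that ID covers all of them.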
Subscription Info
# Get information about your subscription
subscription_info = client.get_subscription_info()
print(f"Current tier: {subscription_info.tier}")
print(f"Document limit: {subscription_info.document_limit}")
print(f"Expiration: {subscription_info.expires_at}")
Asset Access
search_result = client.search("charts and graphs", include_assets=True)
for citation in search_result.citations:
    for asset in citation.assets:
        print(f"Found {asset.asset_type}: {asset.url}")
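If you need the assets organized rather than printed, the same nested loop can collect URLs per asset type. This is a small sketch that relies only on the attributes shown above (`citations`, `assets`, `asset_type`, `url`); `group_asset_urls` is a hypothetical helper, and the actual `asset_type` values depend on what the service returns.

```python
from collections import defaultdict


def group_asset_urls(search_result):
    """Return a dict mapping each asset_type to the list of its asset URLs."""
    grouped = defaultdict(list)
    for citation in search_result.citations:
        for asset in citation.assets:
            grouped[asset.asset_type].append(asset.url)
    return dict(grouped)
```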
📚 API Reference
ServifAI Class
__init__(config=None, config_file=".env")
Initialize ServifAI client.
process_pdfs(pdf_files, session_id=None, show_progress=True)
Process PDF files with AI parsing.
- pdf_files: File path(s) to process
- session_id: Optional session ID to associate documents with
- show_progress: Whether to show processing progress
- Returns: ProcessingResult
search(query, session_id=None, top_k=5, include_assets=True)
Search documents with AI retrieval.
- query: Search query string
- session_id: Session ID to search within
- top_k: Number of results to return
- include_assets: Whether to include images and tables in results
- Returns: SearchResult with citations
get_context_for_llm(search_result, max_length=8000)
Format search results for LLM consumption.
- Returns: Formatted context string
get_citations(search_result)
Format citations as readable strings.
- Returns: List of citation strings
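The reference above does not say whether `max_length` counts characters or tokens, or how the library shortens long results. If you need a hard character budget on your side regardless, a client-side trim is one option. The function below is a hypothetical sketch, not the library's behavior: it cuts at the last paragraph break that fits and falls back to a plain cut.

```python
def trim_context(context, max_chars):
    """Shorten `context` to at most `max_chars` characters.

    Prefers cutting at the last blank-line paragraph break inside the budget;
    if there is none, cuts at exactly `max_chars`.
    """
    if max_chars < 0:
        raise ValueError("max_chars must be non-negative")
    if len(context) <= max_chars:
        return context
    cut = context.rfind("\n\n", 0, max_chars)
    if cut <= 0:
        cut = max_chars
    return context[:cut].rstrip()
```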
🔧 Configuration
Environment Variable | Description | Default |
---|---|---|
SERVIFAI_API_KEY | Your ServifAI API key | Required |
SERVIFAI_API_URL | API base URL | https://servifai.syntheialabs.ai/api/ |
SERVIFAI_TIMEOUT | Request timeout (seconds) | 300 |
SERVIFAI_MAX_RETRIES | Max retry attempts | 3 |
SERVIFAI_LOG_LEVEL | Logging level | INFO |
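To check which values would apply in a given environment, you can resolve the variables yourself with the documented defaults. This is an illustrative sketch only: `Settings` and `load_settings` are hypothetical names, not part of the ServifAI API, and the library may resolve its configuration differently.

```python
import os
from dataclasses import dataclass

# Defaults copied from the configuration table.
DEFAULT_API_URL = "https://servifai.syntheialabs.ai/api/"
DEFAULT_TIMEOUT = 300
DEFAULT_MAX_RETRIES = 3
DEFAULT_LOG_LEVEL = "INFO"


@dataclass(frozen=True)
class Settings:
    api_key: str
    api_url: str
    timeout: int
    max_retries: int
    log_level: str


def load_settings(env=None):
    """Resolve settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("SERVIFAI_API_KEY")
    if not api_key:
        raise ValueError("SERVIFAI_API_KEY is required")
    return Settings(
        api_key=api_key,
        api_url=env.get("SERVIFAI_API_URL", DEFAULT_API_URL),
        timeout=int(env.get("SERVIFAI_TIMEOUT", DEFAULT_TIMEOUT)),
        max_retries=int(env.get("SERVIFAI_MAX_RETRIES", DEFAULT_MAX_RETRIES)),
        log_level=env.get("SERVIFAI_LOG_LEVEL", DEFAULT_LOG_LEVEL),
    )
```

Passing a plain dict instead of relying on `os.environ` makes the resolution easy to test.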
💡 Use Cases
- 📈 Financial Analysis: Extract data from annual reports
- 📋 Legal Document Review: Parse contracts and agreements
- 🔬 Research Papers: Analyze academic publications
- 📊 Business Intelligence: Process market research reports
- 🏥 Medical Records: Extract patient information (HIPAA compliant)
- 📚 Educational Content: Create Q&A from textbooks
🛡️ Security & Compliance
- 🔒 SOC 2 Type II Certified
- 🛡️ GDPR & CCPA Compliant
- 🔐 End-to-end Encryption
- 📋 Audit Logs (Secured tier)
- 🏢 Enterprise SSO (Secured tier)
📈 Performance
Tier | Pages/Min | Accuracy | Use Case |
---|---|---|---|
QUICKEST | ~50 | ⭐⭐⭐ | Rapid prototyping, bulk processing |
BALANCED | ~25 | ⭐⭐⭐ | Production applications |
SECURED | ~15 | ⭐⭐ | Enterprise, compliance-critical |
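The throughput column can be turned into a rough planning estimate by dividing the page count by pages per minute. The figures are approximate (note the "~"), so treat the result as an order-of-magnitude guide. `estimate_minutes` is a hypothetical helper written for this example.

```python
# Approximate pages per minute, copied from the performance table.
PAGES_PER_MINUTE = {"QUICKEST": 50, "BALANCED": 25, "SECURED": 15}


def estimate_minutes(total_pages, tier):
    """Estimate processing time in minutes for `total_pages` on a given tier."""
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    return total_pages / PAGES_PER_MINUTE[tier.upper()]
```

For example, a 300-page batch works out to roughly 6 minutes on QUICKEST, 12 on BALANCED, and 20 on SECURED.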
🎯 Getting Started
- Sign up: https://servifai.syntheialabs.ai
- Get API key: Copy from dashboard
- Install library: uv add servifai
- Create .env: Add your API key
- Start coding: Process your first PDF!
🔗 Links
- Dashboard - Manage your account
- Documentation - Full API docs
- Examples - Code samples
- Support - Get help
📄 License
MIT License - see LICENSE file for details.
Ready to transform your PDFs into intelligent knowledge? Get started with ServifAI today! 🚀