Skip to main content

A production-ready, plug-and-play Python SDK for building intelligent RAG systems

Project description

Lexora Agentic RAG SDK

Production-ready Agentic RAG SDK with minimal configuration

Python 3.8+ License: MIT Tests

Quick StartDocumentationExamplesAPI Reference


🚀 What is Lexora?

Lexora is a production-ready Agentic RAG (Retrieval-Augmented Generation) SDK that makes it easy to build intelligent applications with semantic search and AI-powered reasoning. With just a few lines of code, you can:

  • 📚 Create and manage document corpora
  • 🔍 Perform semantic search across your documents
  • 🤖 Build AI agents that reason over your data
  • 🛠️ Extend functionality with custom tools
  • 🎯 Deploy to production with confidence

✨ Key Features

  • Zero-Config Setup: Get started in minutes with sensible defaults
  • Multiple Vector Databases: Support for FAISS, Pinecone, and Chroma
  • Flexible Embeddings: OpenAI, HuggingFace, Gemini, or custom providers
  • Flexible LLM Integration: Works with any LLM via LiteLLM
  • Built-in RAG Tools: 10+ pre-built tools for document management
  • Custom Tool Support: Easily add your own tools
  • Production-Ready: Comprehensive error handling, logging, and testing
  • Type-Safe: Full type hints and Pydantic validation
  • Cost-Effective: Free embedding options available

📦 Installation

Prerequisites

  • Python 3.8 or higher
  • pip or conda package manager

Install from Source

# Clone the repository
git clone https://github.com/yourusername/lexora.git
cd lexora

# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install in development mode
pip install -e .

Install from PyPI (Coming Soon)

pip install lexora

🎯 Quick Start

Basic Usage

from lexora import RAGAgent

# Initialize the agent with defaults
agent = RAGAgent()

# Create a document corpus
await agent.tool_registry.get_tool("create_corpus").run(
    corpus_name="my_docs",
    description="My document collection"
)

# Add documents
documents = [
    {"content": "Python is a programming language.", "metadata": {"topic": "python"}},
    {"content": "Machine learning is a subset of AI.", "metadata": {"topic": "ml"}}
]

await agent.tool_registry.get_tool("add_data").run(
    corpus_name="my_docs",
    documents=documents
)

# Query your documents
result = await agent.tool_registry.get_tool("rag_query").run(
    corpus_name="my_docs",
    query="What is Python?",
    top_k=5
)

print(result.data["results"])

Using the Agent for Reasoning

# Ask questions and get AI-powered answers
response = await agent.query("Explain machine learning in simple terms")

print(f"Answer: {response.answer}")
print(f"Confidence: {response.confidence}")
print(f"Sources: {len(response.sources)}")

📖 Documentation

Table of Contents

  1. Installation Guide
  2. Configuration
  3. Core Concepts
  4. RAG Tools
  5. Custom Tools
  6. Vector Databases
  7. LLM Integration
  8. Error Handling
  9. Best Practices
  10. API Reference

⚙️ Configuration

Lexora supports multiple configuration methods:

1. Default Configuration (Easiest)

from lexora import RAGAgent

# Uses mock LLM and FAISS vector database
agent = RAGAgent()

2. YAML Configuration

# config.yaml
llm:
  provider: "openai"
  model: "gpt-4"
  api_key: "${OPENAI_API_KEY}"
  temperature: 0.7

vector_db:
  provider: "faiss"
  embedding_model: "text-embedding-ada-002"
  dimension: 1536
  connection_params:
    storage_path: "./vector_storage"

agent:
  max_iterations: 5
  enable_reasoning: true
  log_level: "INFO"
from lexora import RAGAgent

agent = RAGAgent.from_yaml("config.yaml")

3. Environment Variables

# .env file
LEXORA_LLM_PROVIDER=openai
LEXORA_LLM_MODEL=gpt-4
LEXORA_LLM_API_KEY=your-api-key
LEXORA_VECTORDB_PROVIDER=faiss
LEXORA_VECTORDB_EMBEDDING_MODEL=text-embedding-ada-002
from lexora import RAGAgent

agent = RAGAgent.from_env()

4. Programmatic Configuration

from lexora import RAGAgent, LLMConfig, VectorDBConfig, AgentConfig

agent = RAGAgent(
    llm_config=LLMConfig(
        provider="openai",
        model="gpt-4",
        api_key="your-api-key"
    ),
    vector_db_config=VectorDBConfig(
        provider="faiss",
        embedding_model="text-embedding-ada-002",
        dimension=1536
    ),
    agent_config=AgentConfig(
        max_iterations=5,
        enable_reasoning=True
    )
)

🎨 Embedding Options

Important: You are NOT limited to OpenAI embeddings! Lexora supports multiple embedding providers.

Available Options

Provider Cost Quality Privacy Best For
HuggingFace Free High ✅ Local Production (recommended)
OpenAI Paid Highest ❌ Cloud Enterprise
Gemini Free tier High ❌ Cloud Gemini users
Mock Free Low ✅ Local Testing

Quick Examples

1. Free Local Embeddings (Recommended)

# Install sentence-transformers
# pip install sentence-transformers

from lexora import RAGAgent
from lexora.models.config import VectorDBConfig

agent = RAGAgent(
    vector_db_config=VectorDBConfig(
        provider="faiss",
        dimension=384,  # all-MiniLM-L6-v2 dimension
        connection_params={
            "index_type": "Flat",
            "persist_directory": "./vector_db"
        }
    )
)

2. OpenAI Embeddings

from lexora import RAGAgent
from lexora.models.config import VectorDBConfig

agent = RAGAgent(
    vector_db_config=VectorDBConfig(
        provider="faiss",
        dimension=1536,  # OpenAI dimension
        connection_params={
            "embedding_model": "text-embedding-ada-002",
            "openai_api_key": "your-key"
        }
    )
)

3. Custom Embedding Provider

from lexora.utils.embeddings import BaseEmbeddingProvider
from sentence_transformers import SentenceTransformer

class HuggingFaceProvider(BaseEmbeddingProvider):
    def __init__(self, model_name="all-MiniLM-L6-v2"):
        self.model = SentenceTransformer(model_name)
    
    async def generate_embedding(self, text: str):
        return self.model.encode(text).tolist()
    
    def get_dimension(self) -> int:
        return self.model.get_sentence_embedding_dimension()

📚 Full Embedding Guide - Detailed documentation on all embedding options


🧩 Core Concepts

Document Corpus

A corpus is a collection of documents that can be searched semantically.

# Create a corpus
await agent.tool_registry.get_tool("create_corpus").run(
    corpus_name="knowledge_base",
    description="Company knowledge base",
    metadata={"department": "engineering"}
)

Documents

Documents are the basic unit of information in Lexora.

document = {
    "content": "Your document text here",
    "metadata": {
        "source": "documentation",
        "author": "John Doe",
        "date": "2024-01-01"
    }
}

Semantic Search

Search documents by meaning, not just keywords.

results = await agent.tool_registry.get_tool("rag_query").run(
    corpus_name="knowledge_base",
    query="How do I deploy to production?",
    top_k=5,
    min_score=0.7
)

🛠️ RAG Tools

Lexora comes with 10+ built-in tools:

Core Tools

Tool Description
create_corpus Create a new document corpus
add_data Add documents to a corpus
rag_query Search documents semantically
list_corpora List all available corpora
get_corpus_info Get detailed corpus information
delete_corpus Delete a corpus
delete_document Delete a specific document
update_document Update an existing document
bulk_add_data Add large batches of documents
health_check Check system health

Tool Usage Examples

See examples/ directory for detailed examples of each tool.


🔧 Custom Tools

Extend Lexora with your own tools:

from lexora import BaseTool, ToolParameter

class WeatherTool(BaseTool):
    @property
    def name(self) -> str:
        return "get_weather"
    
    @property
    def description(self) -> str:
        return "Get current weather for a location"
    
    @property
    def version(self) -> str:
        return "1.0.0"
    
    def _setup_parameters(self) -> None:
        self._parameters = [
            ToolParameter(
                name="location",
                type="string",
                description="City name",
                required=True
            )
        ]
    
    async def _execute(self, location: str, **kwargs):
        # Your implementation here
        return {"temperature": 72, "condition": "sunny"}

# Register the tool
agent.add_tool(WeatherTool())

💾 Vector Databases

FAISS (Default)

from lexora import RAGAgent, VectorDBConfig

agent = RAGAgent(
    vector_db_config=VectorDBConfig(
        provider="faiss",
        embedding_model="text-embedding-ada-002",
        dimension=1536,
        connection_params={"storage_path": "./faiss_storage"}
    )
)

Pinecone

agent = RAGAgent(
    vector_db_config=VectorDBConfig(
        provider="pinecone",
        embedding_model="text-embedding-ada-002",
        dimension=1536,
        connection_params={
            "api_key": "your-pinecone-key",
            "environment": "us-west1-gcp"
        }
    )
)

Chroma

agent = RAGAgent(
    vector_db_config=VectorDBConfig(
        provider="chroma",
        embedding_model="text-embedding-ada-002",
        dimension=1536,
        connection_params={"persist_directory": "./chroma_storage"}
    )
)

🤖 LLM Integration

Lexora uses LiteLLM for universal LLM support:

OpenAI

from lexora import LLMConfig

llm_config = LLMConfig(
    provider="openai",
    model="gpt-4",
    api_key="your-api-key",
    temperature=0.7
)

Anthropic Claude

llm_config = LLMConfig(
    provider="anthropic",
    model="claude-3-opus-20240229",
    api_key="your-api-key"
)

Azure OpenAI

llm_config = LLMConfig(
    provider="azure",
    model="gpt-4",
    api_key="your-api-key",
    api_base="https://your-resource.openai.azure.com/"
)

🚨 Error Handling

Lexora provides structured error handling:

result = await agent.tool_registry.get_tool("rag_query").run(
    corpus_name="nonexistent",
    query="test"
)

if result.status == "error":
    print(f"Error: {result.error}")
    # Error includes context and suggestions

All errors include:

  • Error code
  • Descriptive message
  • Context information
  • Helpful suggestions

📚 Examples

Check out the examples/ directory for complete examples:

  • 01_quick_start.py - Basic usage
  • 02_custom_configuration.py - Configuration options
  • 03_corpus_management.py - Managing corpora
  • 04_custom_tools.py - Creating custom tools
  • rag_tools_demo.py - All RAG tools
  • rag_agent_with_real_embeddings.py - Production setup

🧪 Testing

Run the test suite:

# Run all tests
python run_tests.py

# Run specific test file
python tests/test_error_handling.py

# Run with pytest
pytest tests/ -v

📊 Performance

  • Query Speed: < 1ms for small corpora
  • Batch Processing: 12,000+ documents/second
  • Concurrent Queries: 10 queries in 5ms
  • Memory Efficient: Handles 200+ documents in batches

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🆘 Support


🗺️ Roadmap

  • PyPI package distribution
  • Additional vector database support
  • Streaming responses
  • Multi-modal support (images, audio)
  • Advanced caching strategies
  • Distributed deployment support

🙏 Acknowledgments

Built with:


Made with ❤️ by the Lexora Team

⭐ Star us on GitHub📖 Read the Docs🐦 Follow us on Twitter

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

lexora-0.1.0.tar.gz (32.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

lexora-0.1.0-py3-none-any.whl (6.5 kB view details)

Uploaded Python 3

File details

Details for the file lexora-0.1.0.tar.gz.

File metadata

  • Download URL: lexora-0.1.0.tar.gz
  • Upload date:
  • Size: 32.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for lexora-0.1.0.tar.gz
Algorithm Hash digest
SHA256 9765e9ab4355bb8b029b30145e7818c69663e256b91a70dd97cc7cc30de4e5bb
MD5 ab919d11ea6e74d3458e107f7c5ecf95
BLAKE2b-256 d89f52c70c6aa389948e4fe199a3f7886410a26d4f5cf440104f01b8a094bf86

See more details on using hashes here.

File details

Details for the file lexora-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: lexora-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 6.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.0

File hashes

Hashes for lexora-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e08f19786331acd53e8e91e7a34630b79d5e9e5494ff0baee5bf44aa1f8a4812
MD5 ab6d2c2655744a07e453c0c90bef7ee1
BLAKE2b-256 974b58dbab76f1f22b2f4eb2ccac27912647c4a3a8d530348816d957da73f739

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page