Production-grade Retrieval-Augmented Generation package

Socratic RAG

License: MIT | Python 3.8+

Why Socratic RAG?

Building reliable retrieval systems is complex. Socratic RAG is a production-grade Retrieval-Augmented Generation (RAG) package for Python that handles the moving parts for you: multiple vector stores, flexible embeddings, smart chunking, document processing, and full async support. The Features list below covers each in detail.

Features

  • Multiple Vector Databases: ChromaDB, Qdrant, FAISS, Pinecone (with extensible provider pattern)
  • Flexible Embedding Providers: Sentence Transformers, OpenAI (via Socrates Nexus)
  • Smart Chunking: Fixed-size, semantic, and recursive chunking strategies
  • Document Processing: Text, PDF, Markdown, and code files
  • Framework Integrations: Openclaw skills and LangChain components
  • Async Support: Full async/await interface for non-blocking operations
  • Production Ready: 70%+ test coverage, type hints, comprehensive documentation
  • Part of Socrates Ecosystem: Built on Socrates Nexus for LLM integration

Part of the Socrates Ecosystem

Socratic RAG is a core component of the Socrates Ecosystem - a collection of production-grade AI packages that work together.

How It Uses Socrates Nexus

  • Document embedding uses Socrates Nexus embedding providers
  • Answer generation uses Socrates Nexus for LLM calls (multi-provider support)
  • Works with any Socrates Nexus provider (Claude, GPT-4, Gemini, Ollama)

Related Packages in the Ecosystem

👉 Full ecosystem guide: See Socrates Nexus ECOSYSTEM.md

📊 Track development: View the Socrates Ecosystem Roadmap to see progress across all packages

Installation

Basic Installation

pip install socratic-rag

With Optional Dependencies

# All features
pip install socratic-rag[all]

# Specific vector stores
pip install socratic-rag[chromadb,qdrant,faiss]

# Document processing
pip install socratic-rag[pdf,markdown]

# Integrations
pip install socratic-rag[langchain,openclaw,nexus]

# Development
pip install socratic-rag[dev]

Quick Start

Basic Usage

from socratic_rag import RAGClient

# Initialize client
client = RAGClient()

# Add documents
doc_id = client.add_document(
    content="Python is a programming language created by Guido van Rossum.",
    source="python_facts.txt"
)

# Search
results = client.search("What is Python?", top_k=5)
for result in results:
    print(f"Score: {result.score:.2f}")
    print(f"Text: {result.chunk.text}\n")

# Retrieve formatted context for LLM
context = client.retrieve_context("What is Python?")
print(context)

Custom Configuration

from socratic_rag import RAGClient, RAGConfig

config = RAGConfig(
    vector_store="chromadb",
    embedder="sentence-transformers",
    chunking_strategy="fixed",
    chunk_size=512,
    chunk_overlap=50,
    top_k=5,
)

client = RAGClient(config)
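To illustrate what chunk_size and chunk_overlap control, here is a minimal standalone sketch of fixed-size character chunking. This is an illustration of the concept, not Socratic RAG's internal chunker:

```python
def fixed_size_chunks(text: str, chunk_size: int = 512, chunk_overlap: int = 50):
    """Split text into fixed-size character chunks, each overlapping the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap  # each chunk starts this many chars after the last
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]

chunks = fixed_size_chunks("a" * 1000, chunk_size=512, chunk_overlap=50)
print(len(chunks))  # → 3 chunks; the last 50 chars of one repeat as the first 50 of the next
```

The overlap preserves context that would otherwise be cut at a chunk boundary, which is why the Performance Tips below suggest keeping it around 10-15% of the chunk size.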

Async Usage

import asyncio
from socratic_rag import AsyncRAGClient

async def main():
    client = AsyncRAGClient()

    # Add documents asynchronously
    doc_id = await client.add_document(
        content="Document content",
        source="source.txt"
    )

    # Search asynchronously
    results = await client.search("query")

    # Retrieve context asynchronously
    context = await client.retrieve_context("query")

asyncio.run(main())

Architecture

Provider Pattern

Socratic RAG uses an extensible provider pattern for easy integration of new components:

RAGClient
├── Embedder Provider (sentence-transformers, OpenAI, etc.)
├── Chunker Provider (fixed, semantic, recursive)
└── Vector Store Provider (ChromaDB, Qdrant, FAISS, Pinecone)

Core Models

  • Document: Raw document with metadata
  • Chunk: Text chunk with position and metadata
  • SearchResult: Search result with relevance score
  • RAGConfig: Configuration object

Support Development

If you find this package useful, please consider supporting its development; your support helps fund the entire Socratic ecosystem.


Testing

Run the test suite:

pytest tests/ -v --cov=socratic_rag

Specific test categories:

# Unit tests only
pytest tests/ -m unit

# Integration tests only
pytest tests/ -m integration

# Exclude slow tests
pytest tests/ -m "not slow"

Vector Store Providers

ChromaDB (Default)

from socratic_rag import RAGClient, RAGConfig

config = RAGConfig(vector_store="chromadb")
client = RAGClient(config)

Qdrant

config = RAGConfig(vector_store="qdrant")
client = RAGClient(config)

FAISS

config = RAGConfig(vector_store="faiss")
client = RAGClient(config)

Embedding Providers

Sentence Transformers (Default)

config = RAGConfig(embedder="sentence-transformers")
client = RAGClient(config)

OpenAI (via Socrates Nexus)

config = RAGConfig(embedder="openai")
client = RAGClient(config)

Document Processing

Text Files

from socratic_rag.processors import TextProcessor

processor = TextProcessor()
documents = processor.process("path/to/file.txt")

PDF Files

from socratic_rag.processors import PDFProcessor

processor = PDFProcessor()
documents = processor.process("path/to/file.pdf")

Markdown Files

from socratic_rag.processors import MarkdownProcessor

processor = MarkdownProcessor()
documents = processor.process("path/to/file.md")

Framework Integrations

Openclaw

from socratic_rag.integrations.openclaw import SocraticRAGSkill

skill = SocraticRAGSkill(vector_store="chromadb")
skill.add_document("content", "source.txt")
results = skill.search("query")

LangChain

from socratic_rag.integrations.langchain import SocraticRAGRetriever
from langchain.chat_models import ChatAnthropic
from langchain.chains import RetrievalQA

retriever = SocraticRAGRetriever(client=rag_client, top_k=5)
llm = ChatAnthropic(model="claude-sonnet")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

answer = qa.run("What is Python?")

Documentation

  • Caching Strategy Guide - Complete guide to performance optimization with embedding caching and search result caching, including configuration, memory management, and best practices
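Conceptually, embedding caching with a time-to-live works like the standalone sketch below; this is an illustration of the idea, not the package's implementation, with cache_ttl from RAGConfig playing the role of ttl:

```python
import time

class TTLCache:
    """Tiny embedding cache: entries expire after ttl seconds."""
    def __init__(self, ttl: float = 3600.0):
        self.ttl = ttl
        self._store: dict[str, tuple[float, list]] = {}

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        stamp, value = entry
        if time.monotonic() - stamp > self.ttl:
            del self._store[key]  # expired: drop and report a miss
            return None
        return value

    def put(self, key: str, value):
        self._store[key] = (time.monotonic(), value)

cache = TTLCache(ttl=3600)
cache.put("What is Python?", [0.1, 0.2])
hit = cache.get("What is Python?")  # returns the cached vector until the TTL elapses
```

A cache like this avoids re-embedding repeated queries while the TTL bounds staleness and memory growth.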

API Reference

RAGClient

Methods

  • add_document(content, source, metadata=None) -> str - Add document to knowledge base
  • search(query, top_k=None, filters=None) -> List[SearchResult] - Search for relevant documents
  • retrieve_context(query, top_k=None) -> str - Retrieve formatted context for LLM
  • clear() -> bool - Clear all documents

AsyncRAGClient

Same methods as RAGClient but with async/await.

Configuration Options

RAGConfig(
    vector_store="chromadb",           # Vector store provider
    embedder="sentence-transformers",  # Embedding provider
    chunking_strategy="fixed",         # Chunking strategy
    chunk_size=512,                    # Characters per chunk
    chunk_overlap=50,                  # Overlap between chunks
    top_k=5,                           # Default number of results
    embedding_cache=True,              # Cache embeddings
    cache_ttl=3600,                    # Cache TTL in seconds
    collection_name="socratic_rag",    # Collection name
)

Exceptions

  • SocraticRAGError - Base exception
  • ConfigurationError - Configuration validation error
  • VectorStoreError - Vector store operation error
  • EmbeddingError - Embedding operation error
  • ChunkingError - Chunking operation error
  • ProcessorError - Document processing error
  • DocumentNotFoundError - Document not found
  • ProviderNotFoundError - Provider not found

Performance Tips

  1. Use appropriate chunk size: Smaller chunks (256-512) for dense retrieval, larger (1024+) for sparse
  2. Set overlap wisely: 10-15% overlap usually works well
  3. Cache embeddings: Enable embedding cache for repeated queries
  4. Batch operations: Use embed_batch() for multiple embeddings
  5. Choose the right embedder: Sentence Transformers for local deployments, OpenAI for hosted production scale
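On tip 4: embedding texts one at a time pays per-call overhead for every text, while batching amortizes it across one call per batch. A standalone sketch of the idea (the embed_batch name comes from the tip above; the embedder body here is a stand-in):

```python
def batched(items, batch_size=32):
    """Yield successive fixed-size slices of a list."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def embed_batch(texts):
    # Stand-in for a real embedder's batch call (one model invocation per batch)
    return [[float(len(t))] for t in texts]

texts = [f"doc {i}" for i in range(100)]
vectors = []
for batch in batched(texts, batch_size=32):
    vectors.extend(embed_batch(batch))

print(len(vectors))  # → 100 vectors, computed in 4 batched calls instead of 100
```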

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE file for details

Citation

If you use Socratic RAG in your research, please cite:

@software{socratic_rag_2024,
  title={Socratic RAG: Production-grade Retrieval-Augmented Generation},
  author={Your Name},
  year={2024},
  url={https://github.com/Nireus79/Socratic-rag}
}

Roadmap

v0.1.0 (Current)

  • ✅ Core RAG functionality
  • ✅ ChromaDB vector store
  • ✅ Sentence Transformers embeddings
  • ✅ Fixed-size chunking
  • ✅ Openclaw integration
  • ✅ LangChain integration

v0.2.0

  • Qdrant and FAISS vector stores
  • Semantic chunking
  • Pinecone cloud provider
  • Vision model support

v0.3.0

  • Hybrid search (vector + keyword)
  • Re-ranking with cross-encoders
  • Multi-language support


Examples

All examples are in the examples/ directory:

  1. 01_basic_rag.py - Basic RAG workflow
  2. 02_multi_vector_stores.py - Vector store comparison
  3. 03_openclaw_integration.py - Openclaw skill integration
  4. 04_langchain_integration.py - LangChain integration
  5. 05_llm_powered_rag.py - LLM-powered answers
  6. 06_rest_api.py - REST API with FastAPI
  7. 07_docker_containerization.py - Docker setup guide
  8. 08_deployment_patterns.py - Deployment strategies
  9. 09_advanced_rag_patterns.py - Multi-agent RAG, conversation context
  10. 10_streaming_rag.py - Streaming responses
  11. 11_real_time_updates.py - Real-time knowledge base updates

Support

For issues, questions, or suggestions, please open an issue on the project's GitHub repository.
