
Socratic RAG


Why Socratic RAG?

Building reliable retrieval systems is complex. Socratic RAG packages the pieces you need into one production-grade toolkit:

  • Multiple Vector Stores - ChromaDB, Qdrant, FAISS, Pinecone with extensible provider pattern
  • Flexible Embeddings - Sentence Transformers or OpenAI embeddings via Socrates Nexus
  • Smart Chunking - Fixed-size, semantic, and recursive strategies so chunks fit your retrieval task
  • Document Processing - Handle Text, PDF, Markdown, and code files out of the box
  • Async Support - Full async/await interface for non-blocking operations at scale

Production-grade Retrieval-Augmented Generation (RAG) package for Python.

Features

  • Multiple Vector Databases: ChromaDB, Qdrant, FAISS, Pinecone (with extensible provider pattern)
  • Flexible Embedding Providers: Sentence Transformers, OpenAI (via Socrates Nexus)
  • Smart Chunking: Fixed-size, semantic, and recursive chunking strategies
  • Document Processing: Text, PDF, Markdown, and code files
  • Framework Integrations: Openclaw skills and LangChain components
  • Async Support: Full async/await interface for non-blocking operations
  • Production Ready: 70%+ test coverage, type hints, comprehensive documentation
  • Part of Socrates Ecosystem: Built on Socrates Nexus for LLM integration

Part of the Socrates Ecosystem

Socratic RAG is a core component of the Socrates Ecosystem - a collection of production-grade AI packages that work together.

How It Uses Socrates Nexus

  • Document embedding uses Socrates Nexus embedding providers
  • Answer generation uses Socrates Nexus for LLM calls (multi-provider support)
  • Works with any Socrates Nexus provider (Claude, GPT-4, Gemini, Ollama)
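The retrieve-then-generate flow behind answer generation can be sketched independently of any provider. This is a minimal illustration of the prompt-assembly pattern only; `build_prompt` is a hypothetical helper, and the actual Socrates Nexus LLM call is not shown:

```python
# Sketch of the retrieve-then-generate pattern used for answer generation.
# `build_prompt` is illustrative, not part of the socratic_rag API; in the
# real flow the assembled prompt would be sent to a Socrates Nexus provider.

def build_prompt(question: str, context: str) -> str:
    """Combine retrieved context and the user question into one prompt."""
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# In practice `context` would come from client.retrieve_context(question).
context = "Python is a programming language created by Guido van Rossum."
prompt = build_prompt("Who created Python?", context)
print(prompt)
```

Because the context is inlined into the prompt, any Nexus provider (Claude, GPT-4, Gemini, Ollama) can consume it unchanged.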

Related Packages in the Ecosystem

👉 Full ecosystem guide: See Socrates Nexus ECOSYSTEM.md

📊 Track development: View the Socrates Ecosystem Roadmap to see progress across all packages

Installation

Basic Installation

pip install socratic-rag

With Optional Dependencies

# All features
pip install socratic-rag[all]

# Specific vector stores
pip install socratic-rag[chromadb,qdrant,faiss]

# Document processing
pip install socratic-rag[pdf,markdown]

# Integrations
pip install socratic-rag[langchain,openclaw,nexus]

# Development
pip install socratic-rag[dev]

Quick Start

Basic Usage

from socratic_rag import RAGClient

# Initialize client
client = RAGClient()

# Add documents
doc_id = client.add_document(
    content="Python is a programming language created by Guido van Rossum.",
    source="python_facts.txt"
)

# Search
results = client.search("What is Python?", top_k=5)
for result in results:
    print(f"Score: {result.score:.2f}")
    print(f"Text: {result.chunk.text}\n")

# Retrieve formatted context for LLM
context = client.retrieve_context("What is Python?")
print(context)

Custom Configuration

from socratic_rag import RAGClient, RAGConfig

config = RAGConfig(
    vector_store="chromadb",
    embedder="sentence-transformers",
    chunking_strategy="fixed",
    chunk_size=512,
    chunk_overlap=50,
    top_k=5,
)

client = RAGClient(config)

Async Usage

import asyncio
from socratic_rag import AsyncRAGClient

async def main():
    client = AsyncRAGClient()

    # Add documents asynchronously
    doc_id = await client.add_document(
        content="Document content",
        source="source.txt"
    )

    # Search asynchronously
    results = await client.search("query")

    # Retrieve context asynchronously
    context = await client.retrieve_context("query")

asyncio.run(main())

Architecture

Provider Pattern

Socratic RAG uses an extensible provider pattern for easy integration of new components:

RAGClient
├── Embedder Provider (sentence-transformers, OpenAI, etc.)
├── Chunker Provider (fixed, semantic, recursive)
└── Vector Store Provider (ChromaDB, Qdrant, FAISS, Pinecone)
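The registry mechanics behind such a pattern can be sketched in a few lines. The names here (`register`, `get_provider`, `FixedChunker`) are illustrative stand-ins, not the package's actual API:

```python
# Minimal sketch of a provider registry (illustrative names, not the actual
# socratic_rag API): classes register under a string key per component kind
# and are looked up by kind + name, which is how new components plug in.
_PROVIDERS = {"embedder": {}, "chunker": {}, "vector_store": {}}

def register(kind, name):
    """Class decorator that registers a provider class under a string key."""
    def wrap(cls):
        _PROVIDERS[kind][name] = cls
        return cls
    return wrap

def get_provider(kind, name):
    """Instantiate the provider registered under kind/name."""
    return _PROVIDERS[kind][name]()

@register("chunker", "fixed")
class FixedChunker:
    def chunk(self, text, size=8):
        # naive fixed-size split, no overlap
        return [text[i:i + size] for i in range(0, len(text), size)]

print(get_provider("chunker", "fixed").chunk("abcdefghij"))
```

Adding a new vector store or embedder then amounts to writing one class and registering it under a new key.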

Core Models

  • Document: Raw document with metadata
  • Chunk: Text chunk with position and metadata
  • SearchResult: Search result with relevance score
  • RAGConfig: Configuration object
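As a rough sketch of how these models relate (field names are illustrative and may differ from the package's exact schema):

```python
from dataclasses import dataclass, field

# Illustrative shapes for the core models; the real classes may differ.
@dataclass
class Document:
    content: str
    source: str
    metadata: dict = field(default_factory=dict)

@dataclass
class Chunk:
    text: str
    position: int  # index of the chunk within its source document
    metadata: dict = field(default_factory=dict)

@dataclass
class SearchResult:
    chunk: Chunk   # each result wraps the matched chunk
    score: float   # relevance score; higher is more relevant

result = SearchResult(chunk=Chunk(text="Python is...", position=0), score=0.92)
print(f"Score: {result.score:.2f}")
```

This mirrors the Quick Start loop, where each result exposes `result.score` and `result.chunk.text`.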

Support Development

If you find this package useful, consider supporting its development. Your support helps fund the entire Socrates Ecosystem.

Examples

See the examples/ directory for:

  1. 01_basic_rag.py - Basic RAG workflow
  2. 02_multi_vector_stores.py - Vector store comparison
  3. 03_openclaw_integration.py - Openclaw skill integration
  4. 04_langchain_integration.py - LangChain integration
  5. 05_llm_powered_rag.py - LLM-powered answers (using Socrates Nexus)
  6. 06_rest_api.py - REST API with FastAPI
  7. 07_docker_containerization.py - Docker setup guide
  8. 08_deployment_patterns.py - Deployment strategies
  9. 09_advanced_rag_patterns.py - Multi-agent RAG, conversation context
  10. 10_streaming_rag.py - Streaming responses
  11. 11_real_time_updates.py - Real-time knowledge base updates

Testing

Run the test suite:

pytest tests/ -v --cov=socratic_rag

Specific test categories:

# Unit tests only
pytest tests/ -m unit

# Integration tests only
pytest tests/ -m integration

# Exclude slow tests
pytest tests/ -m "not slow"

Vector Store Providers

ChromaDB (Default)

from socratic_rag import RAGClient, RAGConfig

config = RAGConfig(vector_store="chromadb")
client = RAGClient(config)

Qdrant

config = RAGConfig(vector_store="qdrant")
client = RAGClient(config)

FAISS

config = RAGConfig(vector_store="faiss")
client = RAGClient(config)

Embedding Providers

Sentence Transformers (Default)

config = RAGConfig(embedder="sentence-transformers")
client = RAGClient(config)

OpenAI (via Socrates Nexus)

config = RAGConfig(embedder="openai")
client = RAGClient(config)

Document Processing

Text Files

from socratic_rag.processors import TextProcessor

processor = TextProcessor()
documents = processor.process("path/to/file.txt")

PDF Files

from socratic_rag.processors import PDFProcessor

processor = PDFProcessor()
documents = processor.process("path/to/file.pdf")

Markdown Files

from socratic_rag.processors import MarkdownProcessor

processor = MarkdownProcessor()
documents = processor.process("path/to/file.md")

Framework Integrations

Openclaw

from socratic_rag.integrations.openclaw import SocraticRAGSkill

skill = SocraticRAGSkill(vector_store="chromadb")
skill.add_document("content", "source.txt")
results = skill.search("query")

LangChain

from socratic_rag import RAGClient
from socratic_rag.integrations.langchain import SocraticRAGRetriever
from langchain_anthropic import ChatAnthropic  # pip install langchain-anthropic
from langchain.chains import RetrievalQA

rag_client = RAGClient()
retriever = SocraticRAGRetriever(client=rag_client, top_k=5)
llm = ChatAnthropic(model="claude-3-5-sonnet-latest")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

answer = qa.run("What is Python?")

Documentation

  • Caching Strategy Guide - embedding and search-result caching for performance: configuration, memory management, and best practices

API Reference

RAGClient

Methods

  • add_document(content, source, metadata=None) -> str - Add document to knowledge base
  • search(query, top_k=None, filters=None) -> List[SearchResult] - Search for relevant documents
  • retrieve_context(query, top_k=None) -> str - Retrieve formatted context for LLM
  • clear() -> bool - Clear all documents

AsyncRAGClient

Same methods as RAGClient but with async/await.

Configuration Options

RAGConfig(
    vector_store="chromadb",           # Vector store provider
    embedder="sentence-transformers",  # Embedding provider
    chunking_strategy="fixed",         # Chunking strategy
    chunk_size=512,                    # Characters per chunk
    chunk_overlap=50,                  # Overlap between chunks
    top_k=5,                           # Default number of results
    embedding_cache=True,              # Cache embeddings
    cache_ttl=3600,                    # Cache TTL in seconds
    collection_name="socratic_rag",    # Collection name
)

Exceptions

  • SocraticRAGError - Base exception
  • ConfigurationError - Configuration validation error
  • VectorStoreError - Vector store operation error
  • EmbeddingError - Embedding operation error
  • ChunkingError - Chunking operation error
  • ProcessorError - Document processing error
  • DocumentNotFoundError - Document not found
  • ProviderNotFoundError - Provider not found

Performance Tips

  1. Use an appropriate chunk size: smaller chunks (256-512 characters) for dense retrieval, larger (1024+) for sparse
  2. Set overlap wisely: 10-15% of the chunk size usually works well
  3. Cache embeddings: enable the embedding cache for repeated queries
  4. Batch operations: use embed_batch() when embedding many texts at once
  5. Choose the right embedder: Sentence Transformers for local/offline use, OpenAI embeddings for hosted scale
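Tips 1-2 can be made concrete with a minimal fixed-size chunker (a sketch of the arithmetic, not the package's implementation):

```python
def fixed_chunks(text, size=512, overlap=50):
    """Split text into size-character chunks, where each chunk repeats the
    last `overlap` characters of its predecessor (~10% overlap here)."""
    step = size - overlap  # how far the window advances each chunk
    return [text[i:i + size] for i in range(0, len(text), step)]

chunks = fixed_chunks("x" * 1200, size=512, overlap=50)
print(len(chunks), [len(c) for c in chunks])  # 3 chunks; the last is shorter
```

With size=512 and overlap=50, the window advances 462 characters per chunk, so a fact that straddles a boundary still appears whole in one of the two neighboring chunks.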

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE file for details

Citation

If you use Socratic RAG in your research, please cite:

@software{socratic_rag_2024,
  title={Socratic RAG: Production-grade Retrieval-Augmented Generation},
  author={Your Name},
  year={2024},
  url={https://github.com/Nireus79/Socratic-rag}
}

Roadmap

v0.1.0 (Current)

  • ✅ Core RAG functionality
  • ✅ ChromaDB vector store
  • ✅ Sentence Transformers embeddings
  • ✅ Fixed-size chunking
  • ✅ Openclaw integration
  • ✅ LangChain integration

v0.2.0

  • Qdrant and FAISS vector stores
  • Semantic chunking
  • Pinecone cloud provider
  • Vision model support

v0.3.0

  • Hybrid search (vector + keyword)
  • Re-ranking with cross-encoders
  • Multi-language support


Support

For issues, questions, or suggestions, please open an issue on the GitHub repository.
