Socratic RAG

Production-grade Retrieval-Augmented Generation (RAG) package for Python.

Features

  • Multiple Vector Databases: ChromaDB, Qdrant, FAISS, Pinecone (with an extensible provider pattern)
  • Flexible Embedding Providers: Sentence Transformers, OpenAI (via Socrates Nexus)
  • Smart Chunking: Fixed-size, semantic, and recursive chunking strategies
  • Document Processing: Text, PDF, Markdown, and code files
  • Framework Integrations: Openclaw skills and LangChain components
  • Async Support: Full async/await interface for non-blocking operations
  • Production Ready: 70%+ test coverage, type hints, comprehensive documentation

Installation

Basic Installation

pip install socratic-rag

With Optional Dependencies

# All features
pip install socratic-rag[all]

# Specific vector stores
pip install socratic-rag[chromadb,qdrant,faiss]

# Document processing
pip install socratic-rag[pdf,markdown]

# Integrations
pip install socratic-rag[langchain,openclaw,nexus]

# Development
pip install socratic-rag[dev]

Quick Start

Basic Usage

from socratic_rag import RAGClient

# Initialize client
client = RAGClient()

# Add documents
doc_id = client.add_document(
    content="Python is a programming language created by Guido van Rossum.",
    source="python_facts.txt"
)

# Search
results = client.search("What is Python?", top_k=5)
for result in results:
    print(f"Score: {result.score:.2f}")
    print(f"Text: {result.chunk.text}\n")

# Retrieve formatted context for LLM
context = client.retrieve_context("What is Python?")
print(context)
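To make the scores printed above more concrete, here is a minimal, self-contained sketch of top-k ranking by cosine similarity. It does not use Socratic RAG itself: the toy vectors, `cosine_similarity`, and `top_k_search` are illustrative names, and whether a given vector store ranks by cosine similarity is an assumption that depends on the backend.

```python
import math
from typing import List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_search(
    query_vec: List[float],
    index: List[Tuple[str, List[float]]],
    top_k: int = 5,
) -> List[Tuple[float, str]]:
    """Score every (text, vector) pair against the query and keep the best top_k."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hand-made 3-dimensional "embeddings", for illustration only
toy_index = [
    ("Python is a programming language.", [1.0, 0.0, 0.0]),
    ("Bread is baked in an oven.", [0.0, 1.0, 0.0]),
    ("Guido van Rossum created Python.", [0.8, 0.0, 0.6]),
]

for score, text in top_k_search([1.0, 0.0, 0.0], toy_index, top_k=2):
    print(f"Score: {score:.2f}  Text: {text}")
```

A real embedder produces vectors with hundreds of dimensions, and a vector store typically uses an approximate index instead of scoring every chunk.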

Custom Configuration

from socratic_rag import RAGClient, RAGConfig

config = RAGConfig(
    vector_store="chromadb",
    embedder="sentence-transformers",
    chunking_strategy="fixed",
    chunk_size=512,
    chunk_overlap=50,
    top_k=5,
)

client = RAGClient(config)
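The interaction between `chunk_size` and `chunk_overlap` can be sketched as a sliding window over characters. This is a stand-alone illustration, not the package's chunker: `fixed_size_chunks` is a hypothetical helper, and the real "fixed" strategy may differ in how it handles boundaries.

```python
from typing import List


def fixed_size_chunks(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> List[str]:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghij" * 3  # 30 characters
pieces = fixed_size_chunks(sample, chunk_size=12, chunk_overlap=2)
print([len(piece) for piece in pieces])
```

Each chunk repeats the last `chunk_overlap` characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.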

Async Usage

import asyncio
from socratic_rag import AsyncRAGClient

async def main():
    client = AsyncRAGClient()

    # Add documents asynchronously
    doc_id = await client.add_document(
        content="Document content",
        source="source.txt"
    )

    # Search asynchronously
    results = await client.search("query")

    # Retrieve context asynchronously
    context = await client.retrieve_context("query")

asyncio.run(main())
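One benefit of an async interface is that independent queries can run concurrently. The sketch below shows the pattern with `asyncio.gather`; `StubAsyncClient` is a hypothetical stand-in that only simulates latency, so the example runs without the package installed. Swapping in `AsyncRAGClient` is assumed to work the same way because its `search` is awaited in the example above.

```python
import asyncio
from typing import List


class StubAsyncClient:
    """Hypothetical stand-in for AsyncRAGClient; it only simulates I/O latency."""

    async def search(self, query: str) -> List[str]:
        await asyncio.sleep(0.01)  # pretend to wait on an embedder or vector store
        return [f"result for {query!r}"]


async def search_many(client: StubAsyncClient, queries: List[str]) -> List[List[str]]:
    # gather() runs the searches concurrently and returns results in input order
    return await asyncio.gather(*(client.search(query) for query in queries))


async def main() -> List[List[str]]:
    client = StubAsyncClient()
    return await search_many(client, ["What is Python?", "Who created Python?"])


batches = asyncio.run(main())
print(batches)
```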

Architecture

Provider Pattern

Socratic RAG uses an extensible provider pattern. RAGClient delegates embedding, chunking, and vector storage to interchangeable providers:

RAGClient
├── Embedder Provider (sentence-transformers, OpenAI, etc.)
├── Chunker Provider (fixed, semantic, recursive)
└── Vector Store Provider (ChromaDB, Qdrant, FAISS, Pinecone)
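A provider pattern like the one in the tree above is often built from a small interface plus a name-to-factory registry. The following is a generic sketch of that idea, not the package's internals: `Chunker`, `register_chunker`, `create_chunker`, and `ParagraphChunker` are hypothetical names, and `ProviderNotFoundError` is redefined locally so the example is self-contained.

```python
from typing import Callable, Dict, List, Protocol


class Chunker(Protocol):
    """Interface every chunker provider is expected to satisfy (hypothetical)."""

    def chunk(self, text: str) -> List[str]: ...


class ProviderNotFoundError(Exception):
    """Local stand-in for the package exception of the same name."""


_CHUNKERS: Dict[str, Callable[[], Chunker]] = {}


def register_chunker(name: str):
    """Class decorator that records a provider factory under a string key."""
    def decorator(factory):
        _CHUNKERS[name] = factory
        return factory
    return decorator


def create_chunker(name: str) -> Chunker:
    """Look up a provider by the name used in configuration and instantiate it."""
    factory = _CHUNKERS.get(name)
    if factory is None:
        raise ProviderNotFoundError(f"unknown chunker: {name!r}")
    return factory()


@register_chunker("paragraph")
class ParagraphChunker:
    def chunk(self, text: str) -> List[str]:
        # Split on blank lines and drop empty fragments
        return [part.strip() for part in text.split("\n\n") if part.strip()]


chunker = create_chunker("paragraph")
print(chunker.chunk("First paragraph.\n\nSecond paragraph."))
```

With this shape, a configuration string such as `chunking_strategy="fixed"` only needs to match a registered key, and new providers can be added without touching the lookup code.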

Core Models

  • Document: Raw document with metadata
  • Chunk: Text chunk with position and metadata
  • SearchResult: Search result with relevance score
  • RAGConfig: Configuration object
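As a rough picture of how these models relate, here is a sketch using dataclasses. Only `result.score` and `result.chunk.text` appear in this README; every other field name below (`document_id`, `start`, `end`, `metadata`) is an assumption made for illustration, and the package's actual definitions may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Document:
    """Raw document with metadata (field names are illustrative)."""
    content: str
    source: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Chunk:
    """A slice of a document plus its position in the original text."""
    text: str
    document_id: str  # assumed link back to the parent document
    start: int        # assumed character offset where the chunk begins
    end: int          # assumed character offset where the chunk ends
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class SearchResult:
    """A chunk paired with its relevance score for one query."""
    chunk: Chunk
    score: float


doc = Document(content="Python is a programming language.", source="python_facts.txt")
chunk = Chunk(text=doc.content[:6], document_id="doc-1", start=0, end=6)
result = SearchResult(chunk=chunk, score=0.92)
print(f"Score: {result.score:.2f}  Text: {result.chunk.text}")
```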

Examples

See the examples/ directory for:

  1. 01_basic_rag.py - Basic RAG workflow
  2. 02_qdrant_rag.py - Using Qdrant vector store
  3. 03_faiss_rag.py - Using FAISS vector store
  4. 04_document_processing.py - Document processing
  5. 05_openclaw_integration.py - Openclaw skill integration
  6. 06_langchain_integration.py - LangChain integration
  7. 07_llm_powered_rag.py - RAG with LLM (using Socrates Nexus)

Testing

Run the test suite:

pytest tests/ -v --cov=socratic_rag

Specific test categories:

# Unit tests only
pytest tests/ -m unit

# Integration tests only
pytest tests/ -m integration

# Exclude slow tests
pytest tests/ -m "not slow"
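The `-m` option selects tests by marker, and pytest warns about markers that are not registered. If you add tests of your own, one way to register the three markers used above is a `pytest_configure` hook in `conftest.py`. This is a sketch of the standard pytest mechanism; how the project itself registers its markers is not shown in this README, and the descriptions are illustrative.

```python
# conftest.py (sketch)

# Marker name -> description shown by `pytest --markers`
MARKERS = {
    "unit": "fast tests with no external services",
    "integration": "tests that need a real vector store or network access",
    "slow": "long-running tests",
}


def pytest_configure(config):
    """pytest hook: register custom markers so `-m unit` etc. raise no warnings."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

A test is then tagged with a decorator such as `@pytest.mark.integration` and picked up by `pytest tests/ -m integration`.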

Vector Store Providers

ChromaDB (Default)

from socratic_rag import RAGClient, RAGConfig

config = RAGConfig(vector_store="chromadb")
client = RAGClient(config)

Qdrant

config = RAGConfig(vector_store="qdrant")
client = RAGClient(config)

FAISS

config = RAGConfig(vector_store="faiss")
client = RAGClient(config)

Embedding Providers

Sentence Transformers (Default)

config = RAGConfig(embedder="sentence-transformers")
client = RAGClient(config)

OpenAI (via Socrates Nexus)

config = RAGConfig(embedder="openai")
client = RAGClient(config)

Document Processing

Text Files

from socratic_rag.processors import TextProcessor

processor = TextProcessor()
documents = processor.process("path/to/file.txt")

PDF Files

from socratic_rag.processors import PDFProcessor

processor = PDFProcessor()
documents = processor.process("path/to/file.pdf")

Markdown Files

from socratic_rag.processors import MarkdownProcessor

processor = MarkdownProcessor()
documents = processor.process("path/to/file.md")
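When ingesting a mixed folder, the processor is usually chosen from the file extension. The sketch below shows only that dispatch step and returns the processor's class name as a string so it runs without the package installed. The mapping and `processor_name_for` are hypothetical; the README does not say which extensions each processor accepts.

```python
from pathlib import Path

# Assumed mapping from file suffix to the processor classes named above
PROCESSOR_BY_SUFFIX = {
    ".txt": "TextProcessor",
    ".pdf": "PDFProcessor",
    ".md": "MarkdownProcessor",
}


def processor_name_for(path: str) -> str:
    """Return the name of the processor that should handle this file."""
    suffix = Path(path).suffix.lower()  # ".PDF" and ".pdf" are treated alike
    if suffix not in PROCESSOR_BY_SUFFIX:
        raise ValueError(f"no processor registered for {suffix or 'files without a suffix'}")
    return PROCESSOR_BY_SUFFIX[suffix]


for name in ["notes.txt", "paper.PDF", "README.md"]:
    print(name, "->", processor_name_for(name))
```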

Framework Integrations

Openclaw

from socratic_rag.integrations.openclaw import SocraticRAGSkill

skill = SocraticRAGSkill(vector_store="chromadb")
skill.add_document("content", "source.txt")
results = skill.search("query")

LangChain

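from socratic_rag import RAGClient
from socratic_rag.integrations.langchain import SocraticRAGRetriever
# On recent LangChain releases, ChatAnthropic lives in the langchain_anthropic package
from langchain.chat_models import ChatAnthropic
from langchain.chains import RetrievalQA

# The retriever wraps an existing RAG client
rag_client = RAGClient()
retriever = SocraticRAGRetriever(client=rag_client, top_k=5)

# Replace "claude-sonnet" with a model ID available to your account
llm = ChatAnthropic(model="claude-sonnet")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)

answer = qa.run("What is Python?")
print(answer)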

API Reference

RAGClient

Methods

  • add_document(content, source, metadata=None) -> str - Add document to knowledge base
  • search(query, top_k=None, filters=None) -> List[SearchResult] - Search for relevant documents
  • retrieve_context(query, top_k=None) -> str - Retrieve formatted context for LLM
  • clear() -> bool - Clear all documents
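To show what "formatted context" can mean in practice, here is a minimal sketch that joins retrieved chunks into a numbered block under a character budget. The layout, `format_context`, and `max_chars` are assumptions for illustration; the string that `retrieve_context` actually returns is not documented in this README.

```python
from typing import List, Tuple


def format_context(hits: List[Tuple[str, str]], max_chars: int = 2000) -> str:
    """Join (source, text) pairs into numbered blocks, stopping at max_chars."""
    blocks: List[str] = []
    used = 0
    for index, (source, text) in enumerate(hits, start=1):
        block = f"[{index}] Source: {source}\n{text}"
        if used + len(block) > max_chars:
            break  # keep the highest-ranked chunks and drop the rest
        blocks.append(block)
        used += len(block)
    return "\n\n".join(blocks)


hits = [
    ("python_facts.txt", "Python is a programming language created by Guido van Rossum."),
    ("history.txt", "Python was first released in 1991."),
]
prompt = f"Answer using only this context:\n\n{format_context(hits)}\n\nQuestion: What is Python?"
print(prompt)
```

Numbering the blocks lets the LLM cite which source supported an answer, and the budget keeps the prompt within the model's context window.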

AsyncRAGClient

Provides the same methods as RAGClient as coroutines; call each one with await.

Configuration Options

RAGConfig(
    vector_store="chromadb",           # Vector store provider
    embedder="sentence-transformers",  # Embedding provider
    chunking_strategy="fixed",         # Chunking strategy
    chunk_size=512,                    # Characters per chunk
    chunk_overlap=50,                  # Overlap between chunks
    top_k=5,                           # Default number of results
    embedding_cache=True,              # Cache embeddings
    cache_ttl=3600,                    # Cache TTL in seconds
    collection_name="socratic_rag",    # Collection name
)
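The `embedding_cache` and `cache_ttl` options describe a time-limited cache in front of the embedder. The sketch below illustrates that idea in isolation: `TTLEmbeddingCache` and `fake_embed` are hypothetical, and the package's own cache may key, store, or evict entries differently.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLEmbeddingCache:
    """Memoize embed_fn(text) and recompute entries older than ttl_seconds."""

    def __init__(
        self,
        embed_fn: Callable[[str], List[float]],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._embed_fn = embed_fn
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[float]]] = {}

    def get(self, text: str) -> List[float]:
        now = self._clock()
        entry = self._entries.get(text)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh hit: skip the embedder entirely
        vector = self._embed_fn(text)  # miss or expired: recompute and store
        self._entries[text] = (now, vector)
        return vector


calls: List[str] = []


def fake_embed(text: str) -> List[float]:
    """Stand-in embedder that records how often it is invoked."""
    calls.append(text)
    return [float(len(text))]


cache = TTLEmbeddingCache(fake_embed, ttl_seconds=3600)
cache.get("What is Python?")
cache.get("What is Python?")  # served from the cache
print(f"embedder calls: {len(calls)}")
```

Caching pays off most for repeated queries against a remote embedder, where every miss costs a network round trip.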

Exceptions

  • SocraticRAGError - Base exception
  • ConfigurationError - Configuration validation error
  • VectorStoreError - Vector store operation error
  • EmbeddingError - Embedding operation error
  • ChunkingError - Chunking operation error
  • ProcessorError - Document processing error
  • DocumentNotFoundError - Document not found
  • ProviderNotFoundError - Provider not found
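Because `SocraticRAGError` is listed as the base exception, a common pattern is to handle specific errors first and fall back to the base class. The classes below are local stand-ins so the sketch runs on its own; that every listed exception subclasses `SocraticRAGError`, and the import path for the real classes, are assumptions not confirmed by this README.

```python
from typing import Callable


class SocraticRAGError(Exception):
    """Local stand-in for the package's base exception."""


class VectorStoreError(SocraticRAGError):
    """Local stand-in: a vector store operation failed."""


class DocumentNotFoundError(SocraticRAGError):
    """Local stand-in: the requested document does not exist."""


def describe_failure(operation: Callable[[], None]) -> str:
    """Run operation() and map any RAG failure to a short status string."""
    try:
        operation()
    except DocumentNotFoundError:
        return "missing document"  # specific case handled first
    except SocraticRAGError as exc:
        return f"rag failure: {type(exc).__name__}"  # catch-all for the package
    return "ok"


def failing_store() -> None:
    raise VectorStoreError("connection refused")


print(describe_failure(failing_store))
```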

Performance Tips

  1. Use an appropriate chunk size: smaller chunks (256-512) for dense retrieval, larger chunks (1024+) for sparse retrieval
  2. Set the overlap deliberately: an overlap of 10-15% of the chunk size usually works well
  3. Cache embeddings: enable the embedding cache for repeated queries
  4. Batch operations: use embed_batch() when embedding multiple texts
  5. Choose the right embedder: Sentence Transformers for local use, OpenAI for production scale
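Two of the tips above reduce to simple arithmetic, sketched here with hypothetical helpers: `overlap_for` turns a percentage into a `chunk_overlap` value, and `batched` groups texts so each group can be passed to a batch embedding call. The signature of `embed_batch()` is not documented in this README, so the sketch stops at producing the batches.

```python
from typing import Iterator, List, Sequence


def overlap_for(chunk_size: int, fraction: float = 0.10) -> int:
    """Overlap as a fraction of the chunk size, rounded down to whole units."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return int(chunk_size * fraction)


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


print(overlap_for(512, 0.10))  # overlap for the default chunk size at 10%
for batch in batched(["a", "b", "c", "d", "e"], batch_size=2):
    print(batch)  # each batch would go to one batch embedding call
```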

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE file for details

Citation

If you use Socratic RAG in your research, please cite:

@software{socratic_rag_2024,
  title={Socratic RAG: Production-grade Retrieval-Augmented Generation},
  author={Your Name},
  year={2024},
  url={https://github.com/Nireus79/Socratic-rag}
}

Roadmap

v0.1.0 (Current)

  • ✅ Core RAG functionality
  • ✅ ChromaDB vector store
  • ✅ Sentence Transformers embeddings
  • ✅ Fixed-size chunking
  • ✅ Openclaw integration
  • ✅ LangChain integration

v0.2.0

  • Qdrant and FAISS vector stores
  • Semantic chunking
  • Pinecone cloud provider
  • Vision model support

v0.3.0

  • Hybrid search (vector + keyword)
  • Re-ranking with cross-encoders
  • Multi-language support

Support

For issues, questions, or suggestions:
