A dead simple RAG pipeline with persistent memory and conversation namespaces.

Project description

NoBrainer RAG

A dead simple RAG (Retrieval-Augmented Generation) system that just works. Built for developers who want to add memory to their chatbots without overthinking it.

Why NoBrainer?

  • 🚀 Simple API - Just 3 methods: insert, retrieve, delete
  • 🔒 Conversation Isolation - Each conversation gets its own namespace
  • 💾 Persistent Memory - Data survives even if your object doesn't
  • 🎯 Smart Retrieval - Recursive text splitting + built-in reranking for accurate results
  • ⚡ Fast Setup - Get running in under 5 minutes

Prerequisites

1. Pinecone API Key (Required)

Get your free API key from Pinecone. Create a .env file:

PINECONE_API_KEY=your_key_here
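
If your app loads environment variables itself, here's a minimal sketch using python-dotenv (an assumption; any way of exporting PINECONE_API_KEY into the environment works):

# Load the Pinecone key from .env before creating a NoBrainerRag instance.
# Assumes python-dotenv is installed: pip install python-dotenv
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
assert os.getenv("PINECONE_API_KEY"), "PINECONE_API_KEY is not set"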

2. Ollama with Embedding Model (Required)

Install Ollama and pull the embedding model:

ollama pull nomic-embed-text:v1.5
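
To sanity-check that the model is actually available before you start inserting data, here's a small sketch using the ollama Python client (an assumption; the ollama CLI works just as well):

# Quick check that Ollama can serve the embedding model.
# Assumes the ollama Python package is installed and the Ollama server is running.
import ollama

resp = ollama.embeddings(model="nomic-embed-text:v1.5", prompt="hello world")
print(len(resp["embedding"]))  # nomic-embed-text produces 768-dimensional vectors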

Quick Start

from NoBrainerRag import NoBrainerRag

# Create a RAG instance for a conversation
rag = NoBrainerRag(convo_id="user_123")

# Insert some knowledge
rag.insertIntoVectorDB("Paris is the capital of France.")
rag.insertIntoVectorDB("Python is a programming language created by Guido van Rossum.")

# Retrieve relevant info
result = rag.retrieveFromVectorDB("What is the capital of France?")
print(result)

# Delete when done
rag.deleteConvoDB()

The Magic: Persistent Memory

Here's the cool part - your data persists even if the object is gone:

# Session 1: Insert data
rag = NoBrainerRag(convo_id="user_123")
rag.insertIntoVectorDB("Important information here")
del rag  # Object is destroyed

# Session 2: Access the same data later
rag = NoBrainerRag(convo_id="user_123")  # Same ID!
result = rag.retrieveFromVectorDB("tell me about important information")
# Your data is still there! 🎉

This means you can:

  • Restart your app without losing conversation history
  • Share conversation data across different parts of your application
  • Implement true long-term memory for your chatbots
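
For example, any part of your application can rebuild the same memory on demand. Here's a minimal sketch built only on the documented API (the remember/recall helpers are hypothetical, not part of the package):

# Hypothetical helpers that reconstruct conversation memory on demand.
from NoBrainerRag import NoBrainerRag

def remember(convo_id, text):
    NoBrainerRag(convo_id=convo_id).insertIntoVectorDB(text)

def recall(convo_id, query):
    return NoBrainerRag(convo_id=convo_id).retrieveFromVectorDB(query)

# Any process, any time: same convo_id, same memory.
remember("user_123", "The user prefers metric units.")
print(recall("user_123", "Which units does the user prefer?"))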

Usage

Insert Text

rag.insertIntoVectorDB("Your text here")
# Returns: "Insertion Successful 3 chunks created"

Retrieve Relevant Content

results = rag.retrieveFromVectorDB("Your query")
# Returns a formatted string with the top relevant chunks

Delete Conversation

rag.deleteConvoDB()
# Returns: "Rag Memory of convo with user_123 id was successfully wiped out"

Configuration

Customize the behavior when initializing:

rag = NoBrainerRag(
    convo_id="user_123",
    chunk_size=400,           # Size of each text chunk
    chunk_overlap=75,         # Overlap between chunks
    separators=["\n\n", "\n", ".", ",", " ", ""],  # How to split text
    base_k=10,                # Initial retrieval count
    top_n=4                   # Final results after reranking
)

Parameters Explained

  • convo_id: Unique identifier for the conversation (string or int)
  • chunk_size: How many characters per chunk (default: 400)
  • chunk_overlap: Character overlap between chunks for context (default: 75)
  • separators: Preferred split points, in order of priority (default: paragraphs > lines > sentences > words); see the splitting sketch after this list
  • base_k: How many chunks to retrieve initially (default: 10)
  • top_n: How many chunks to return after reranking (default: 4)
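
To get a feel for how chunk_size, chunk_overlap, and separators interact, here's a standalone sketch using LangChain's RecursiveCharacterTextSplitter (the splitter listed under the hood below). It only illustrates the splitting behaviour and isn't part of the NoBrainerRag API:

# Illustrative only: how the default chunking parameters split a long text.
# Assumes langchain-text-splitters is installed.
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=400,
    chunk_overlap=75,
    separators=["\n\n", "\n", ".", ",", " ", ""],
)

with open("document.txt") as f:
    chunks = splitter.split_text(f.read())

print(f"{len(chunks)} chunks; first chunk:\n{chunks[0]}")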

Under the Hood

NoBrainer RAG uses battle-tested tools so you don't have to (a rough wiring sketch follows this list):

  • Embeddings: Ollama with nomic-embed-text:v1.5 (768 dimensions)
  • Vector Database: Pinecone (serverless, AWS us-east-1)
  • Chunking: LangChain's RecursiveCharacterTextSplitter
  • Reranking: Flashrank with ms-marco-MiniLM-L-12-v2
  • Retrieval: LangChain's compression retriever with contextual reranking
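
Here's roughly how such a stack is wired together in LangChain. This is an illustrative sketch, not the package's actual source; the index name is a placeholder, PINECONE_API_KEY is expected in the environment, and exact import paths vary by LangChain version:

# Rough sketch of a Pinecone + Ollama + Flashrank retrieval stack in LangChain.
from langchain_community.embeddings import OllamaEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain.retrievers import ContextualCompressionRetriever
from langchain_community.document_compressors import FlashrankRerank

embeddings = OllamaEmbeddings(model="nomic-embed-text:v1.5")  # 768-dimensional vectors

vectorstore = PineconeVectorStore.from_existing_index(
    index_name="my-index",   # placeholder index name
    embedding=embeddings,
    namespace="user_123",    # one namespace per conversation
)

reranker = FlashrankRerank(model="ms-marco-MiniLM-L-12-v2", top_n=4)  # keep the best 4 chunks
retriever = ContextualCompressionRetriever(
    base_compressor=reranker,
    base_retriever=vectorstore.as_retriever(search_kwargs={"k": 10}),  # fetch 10, rerank down to 4
)

docs = retriever.invoke("What is the capital of France?")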

Common Use Cases

Chatbot with Memory

# When user starts chatting
rag = NoBrainerRag(convo_id=user.id)

# As conversation progresses
rag.insertIntoVectorDB(f"User said: {user_message}")
rag.insertIntoVectorDB(f"Assistant replied: {bot_response}")

# When generating responses
context = rag.retrieveFromVectorDB(user_message)
# Feed context to your LLM
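
How you pass that context to the model is up to you. Here's a minimal sketch assuming Ollama is also used for generation (the llama3 model name is just an example):

# Hypothetical prompt assembly: put the retrieved memory into the system prompt.
# `context` and `user_message` come from the snippet above.
import ollama

reply = ollama.chat(
    model="llama3",  # any chat model you have pulled locally
    messages=[
        {"role": "system", "content": f"Relevant conversation memory:\n{context}"},
        {"role": "user", "content": user_message},
    ],
)
print(reply["message"]["content"])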

Document Q&A

rag = NoBrainerRag(convo_id="doc_session_456")

# Load your document
with open("document.txt") as f:
    content = f.read()
    rag.insertIntoVectorDB(content)

# Ask questions
answer = rag.retrieveFromVectorDB("What is the main topic?")

Multi-User Application

# Each user gets isolated memory
user1_rag = NoBrainerRag(convo_id=f"user_{user1.id}")
user2_rag = NoBrainerRag(convo_id=f"user_{user2.id}")

# Their data never mixes

FAQ

Q: Do I need to keep the same NoBrainerRag object alive?
A: Nope! As long as you use the same convo_id, you can create new objects anytime and access the same data.

Q: What happens if I use the same convo_id twice?
A: That's the point! Same ID = same memory. It's a feature, not a bug.

Q: Can I use this in production?
A: Yeah, it's built on production-grade tools (Pinecone, LangChain, Ollama). Just make sure your Pinecone plan can handle your scale.

Q: How much does Pinecone cost?
A: They have a generous free tier. Check Pinecone pricing.

Q: Can I change the embedding model?
A: Currently it uses nomic-embed-text:v1.5. Fork it if you need something different.

Q: Is my data secure?
A: Data is stored in your Pinecone account. Use their security features + keep your API keys safe.

Requirements

  • Python 3.8+
  • Pinecone API key
  • Ollama installed locally
  • nomic-embed-text:v1.5 model pulled in Ollama

Contributing

Found a bug? Have an idea? PRs welcome! Keep it simple though - the goal is "no brainer", not "all the features".

License

MIT - do whatever you want with it.

Support

If this saved you hours of work, star the repo ⭐ and help other devs find it!


Built with ❤️ for developers who just want things to work.

Download files

Download the file for your platform.

Source Distribution

nobrainerrag-0.1.0.tar.gz (5.2 kB)


Built Distribution


nobrainerrag-0.1.0-py3-none-any.whl (5.5 kB)


File details

Details for the file nobrainerrag-0.1.0.tar.gz.

File metadata

  • Download URL: nobrainerrag-0.1.0.tar.gz
  • Upload date:
  • Size: 5.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.8

File hashes

Hashes for nobrainerrag-0.1.0.tar.gz

  • SHA256: 8cb047b7a593504383607979b3ff890380e8d1979c9bc69c86a1656ffaebfb00
  • MD5: 536807db7a08ad87f36e9b9d4701cd6e
  • BLAKE2b-256: b364d0a4ffe3005d2e531acc363e15d60352ff1b0fa28f6d9eabf81bc3921b6f


File details

Details for the file nobrainerrag-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: nobrainerrag-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 5.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.8

File hashes

Hashes for nobrainerrag-0.1.0-py3-none-any.whl

  • SHA256: e15ef220e49ac368538714e26bacffb46b935f6ec0a5b7735845e5bb1e7669cc
  • MD5: e53f8a8783102b7b5dd35d8cda2f3ac2
  • BLAKE2b-256: 416aee3ec6c605ea66a68414da2ff8175d58354e3ba155a8992bad9d3ed1145c

