ChatBot with Retrieval Augmented Generation

ChatBot-RAG

A powerful chatbot implementation using Retrieval Augmented Generation (RAG) to provide context-aware responses based on your data.

Features

  • 🔍 Retrieval Augmented Generation: Enhances LLM responses with relevant context from your data
  • 🧠 Ollama Support: Run models locally with Ollama for privacy and customization
  • 🔗 LangChain Integration: Built on the powerful LangChain framework for advanced chains and pipelines
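To make the retrieve-then-generate idea behind these features concrete, here is a minimal, dependency-free sketch. It is not how chatbot-rag is implemented: the bag-of-words cosine similarity, the prompt wording, and the names `tokenize`, `cosine`, `retrieve`, and `build_prompt` are assumptions chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved context in front of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Use the context to answer.\n\nContext:\n{joined}\n\nQuestion: {question}"


docs = [
    "Ollama runs large language models on your own machine.",
    "Tesseract is an OCR engine for extracting text from images.",
    "LangChain helps compose chains and pipelines around language models.",
]
question = "Which tool extracts text from images?"
top = retrieve(question, docs, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real pipeline the bag-of-words step would typically be replaced by embeddings and a vector store, and the prompt would be sent to the LLM rather than printed.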

Installation

pip install chatbot-rag

Requirements

  • Python 3.12
  • Ollama (for local model hosting)
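Before installing, it can help to confirm that both requirements are met. The following is a small sketch using only the standard library. The helper name `check_requirements` is hypothetical, treating 3.12 as a minimum version is an assumption, and the check only looks for an `ollama` executable on `PATH` (it does not verify that the Ollama server is running).

```python
import shutil
import sys


def check_requirements(min_python=(3, 12), executable="ollama"):
    """Return a list of human-readable problems (empty if all looks fine)."""
    problems = []
    # Assumption: 3.12 is treated as a minimum, not an exact pin.
    if sys.version_info[:2] < tuple(min_python):
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted} or newer is required")
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(executable) is None:
        problems.append(f"'{executable}' was not found on PATH")
    return problems


for problem in check_requirements():
    print(problem)
```

An empty result means both checks passed.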

Usage

Quick Start

from chatbot_rag.chat import Chatbot
from chatbot_rag.RAG import RAG

# Point the RAG component at your documents and run it
rag = RAG(path="./data/")
rag()

# Use a specific Ollama model
bot = Chatbot(name="deepseek-r1:8b")

# Retrieve context for the question (k=5), then query the model
question = "Summarize my recent research on climate change"
context = rag._search_context(question, k=5)
response = bot(context, question)
print(response)
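The two-step pattern above (retrieve context, then call the bot) can be wrapped in a single function. This is only a sketch: `ask` is a hypothetical helper, and `FakeRAG` and `FakeBot` are stand-ins that mimic the call shapes shown in the Quick Start so the example runs without Ollama or chatbot-rag installed. The stand-in returning a list of strings is an assumption; the real return type of `_search_context` is not documented here.

```python
def ask(rag, bot, question, k=5):
    """Retrieve context for the question and pass both to the bot.

    `rag` needs a `_search_context(question, k=...)` method and `bot` must be
    callable as `bot(context, question)`, as in the Quick Start.
    """
    context = rag._search_context(question, k=k)
    return bot(context, question)


class FakeRAG:
    """Stand-in retriever (assumption: context is a list of text chunks)."""

    def _search_context(self, question, k=5):
        return [f"chunk {i}" for i in range(k)]


class FakeBot:
    """Stand-in chatbot that reports how much context it received."""

    def __call__(self, context, question):
        return f"{len(context)} chunks used for: {question}"


print(ask(FakeRAG(), FakeBot(), "What is RAG?", k=3))
```

With the real classes, the call would be `ask(rag, bot, question)` using the `rag` and `bot` objects created above.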

Using temporary paths

import os
import tempfile

from chatbot_rag.chat import Chatbot
from chatbot_rag.RAG import RAG

with tempfile.TemporaryDirectory() as tmpdirname:
    persistent_dir = os.path.join(tmpdirname, "all_info/")
    os.makedirs(persistent_dir, exist_ok=True)
    rag = RAG(path="./data/",base_persist_path=persistent_dir)
    rag()
    chatbot = Chatbot(name="llama3.1:8b")

    question = "What is the main topic of the document?"
    context = rag._search_context(question)
    answer = chatbot(context=context, question=question)
    print(f"Answer: {answer}")

Using other preprocessing (PyMuPDFPreprocessing)

from chatbot_rag.chat import Chatbot
from chatbot_rag.RAG import RAG
from chatbot_rag.preprocessing import PyMuPDFPreprocessing


kwargs = {"tesseract_path": "C:/Program Files/Tesseract-OCR/tesseract"}
rag = RAG(path="./data/",preprocessing=PyMuPDFPreprocessing,**kwargs)
rag()
chatbot = Chatbot(name="llama3.1:8b")

question = "What is the main topic of the document?"
context = rag._search_context(question)
answer = chatbot(context=context, question=question)
print(f"Answer: {answer}")

By default, the system attempts to extract text from images using Tesseract-OCR, so Tesseract must be installed beforehand.
See the Tesseract-OCR installation instructions for your platform.

You can disable image extraction by adding the following to the kwargs:

kwargs = {"extract_images": False}

and passing it directly to the RAG component.
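The two options above can be assembled in one place. This is a sketch, not part of the package: `build_preprocessing_kwargs` is a hypothetical helper, and looking up `tesseract` on `PATH` with `shutil.which` is an assumption about your setup. Only the `extract_images` and `tesseract_path` keys come from the examples above; whether the package accepts both together has not been verified.

```python
import shutil


def build_preprocessing_kwargs(extract_images=True, tesseract_path=None):
    """Build the kwargs dict that is passed on to RAG(...)."""
    kwargs = {"extract_images": extract_images}
    if extract_images:
        # Fall back to whatever `tesseract` is on PATH (assumption).
        path = tesseract_path or shutil.which("tesseract")
        if path is None:
            raise FileNotFoundError(
                "Tesseract-OCR not found; install it or set extract_images=False"
            )
        kwargs["tesseract_path"] = path
    return kwargs


kwargs = build_preprocessing_kwargs(extract_images=False)
print(kwargs)
# Then, as in the example above:
# rag = RAG(path="./data/", preprocessing=PyMuPDFPreprocessing, **kwargs)
```

Failing early with a clear error is a design choice here, so a missing Tesseract install surfaces before any documents are processed.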

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
