ChatBot with Retrieval Augmented Generation

ChatBot-RAG

A chatbot implementation that uses Retrieval Augmented Generation (RAG) to answer questions with context drawn from your own data.

Features

  • 🔍 Retrieval Augmented Generation: Enhances LLM responses with relevant context from your data
  • 🧠 Ollama Support: Run models locally with Ollama for privacy and customization
  • 🔗 LangChain Integration: Built on the powerful LangChain framework for advanced chains and pipelines
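
To make the retrieval idea in the feature list concrete, here is a toy sketch that uses only the standard library. It is not how ChatBot-RAG is implemented (the package builds on LangChain and Ollama); the word-overlap scoring, the `retrieve` and `build_prompt` names, and the sample documents are all assumptions made for illustration.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list, k: int = 2) -> list:
    """Return the k documents sharing the most words with the question.

    Hypothetical stand-in for a vector-store similarity search.
    """
    q_tokens = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(q_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context: list) -> str:
    """Place the retrieved passages ahead of the question."""
    context_block = "\n".join(f"- {passage}" for passage in context)
    return f"Answer using only this context:\n{context_block}\n\nQuestion: {question}"


docs = [
    "Ollama runs large language models on the local machine.",
    "LangChain provides building blocks for chains and pipelines.",
    "Tesseract is an OCR engine for extracting text from images.",
]
question = "How do I run models on my local machine?"
top = retrieve(question, docs, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

The prompt that reaches the model contains both the question and the passages judged most relevant, which is the part that plain prompting lacks.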

Installation

pip install chatbot-rag

Requirements

  • Python 3.12
  • Ollama (for local model hosting)
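
A quick way to check both requirements before installing is sketched below. It assumes that "Python 3.12" is a minimum version and that Ollama is installed as an `ollama` executable on your PATH; `check_requirements` is a hypothetical helper, not part of the package.

```python
import shutil
import sys


def check_requirements(min_python: tuple = (3, 12)) -> dict:
    """Report whether the interpreter and the Ollama CLI look usable."""
    return {
        "python_ok": sys.version_info[:2] >= min_python,
        # shutil.which returns None when no `ollama` executable is on PATH.
        "ollama_on_path": shutil.which("ollama") is not None,
    }


status = check_requirements()
for name, ok in status.items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```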

Usage

Quick Start

from chatbot_rag.chat import Chatbot
from chatbot_rag.RAG import RAG

# Point the RAG component at the documents in ./data/ and run it
rag = RAG(path="./data/")
rag()

# Use a specific Ollama model
bot = Chatbot(name="llama3")

# Retrieve context for the question (k=5), then ask the chatbot
question = "Summarize my recent research on climate change"
context = rag._search_context(question, k=5)
response = bot(context, question)
print(response)
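
The retrieve-then-answer pattern in the Quick Start can be wrapped in a small function when you ask several questions. This is a sketch: `ask` is a hypothetical helper, and `FakeRAG` and `FakeBot` are stand-ins that only mimic the two calls shown above so the example runs without Ollama. The return type of `_search_context` is assumed to be something the chatbot accepts as-is.

```python
from typing import Any


def ask(rag: Any, bot: Any, question: str, k: int = 5) -> str:
    """Retrieve context for a question, then pass both to the chatbot."""
    context = rag._search_context(question, k=k)
    return bot(context, question)


# Stand-ins for illustration only; replace with real RAG and Chatbot objects.
class FakeRAG:
    def _search_context(self, question: str, k: int = 5) -> str:
        return f"(top {k} chunks for: {question})"


class FakeBot:
    def __call__(self, context: str, question: str) -> str:
        return f"context={context} | question={question}"


print(ask(FakeRAG(), FakeBot(), "What is RAG?", k=3))
```

With the real objects, the call would be `ask(rag, bot, question)`.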

Using temporary paths

import os
import tempfile

from chatbot_rag.chat import Chatbot
from chatbot_rag.RAG import RAG

# Keep persisted data in a temporary directory that is removed on exit
with tempfile.TemporaryDirectory() as tmpdirname:
    persistent_dir = os.path.join(tmpdirname, "all_info/")
    os.makedirs(persistent_dir, exist_ok=True)
    rag = RAG(path="./data/", base_persist_path=persistent_dir)
    rag()

    # Use a specific Ollama model
    chatbot = Chatbot(name="llama3.1:8b")

    question = "What is the main topic of the document?"
    context = rag._search_context(question)
    answer = chatbot(context=context, question=question)
    print(f"Answer: {answer}")
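
One consequence of this pattern is that anything written under the temporary directory disappears when the `with` block ends, so queries have to happen inside the block. The sketch below shows that behaviour with the standard library alone; `index.bin` is a hypothetical placeholder for whatever the RAG component persists.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdirname:
    persistent_dir = os.path.join(tmpdirname, "all_info")
    os.makedirs(persistent_dir, exist_ok=True)

    # Hypothetical stand-in for files persisted under base_persist_path.
    marker = os.path.join(persistent_dir, "index.bin")
    with open(marker, "wb") as fh:
        fh.write(b"placeholder")
    existed_inside = os.path.exists(marker)

# The whole directory tree is deleted when the `with` block exits.
exists_after = os.path.exists(tmpdirname)
print(existed_inside, exists_after)
```

If the persisted data should survive between runs, pass a regular directory as `base_persist_path` instead.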

Using a different preprocessor (PyMuPDFPreprocessing)

from chatbot_rag.chat import Chatbot
from chatbot_rag.RAG import RAG
from chatbot_rag.preprocessing import PyMuPDFPreprocessing

# Path to the Tesseract executable (Windows example)
kwargs = {"tesseract_path": "C:/Program Files/Tesseract-OCR/tesseract"}
rag = RAG(path="./data/", preprocessing=PyMuPDFPreprocessing, **kwargs)
rag()
chatbot = Chatbot(name="llama3.1:8b")

question = "What is the main topic of the document?"
context = rag._search_context(question)
answer = chatbot(context=context, question=question)
print(f"Answer: {answer}")

By default, the system attempts to extract information from images using Tesseract-OCR, so Tesseract must be installed beforehand.
See the Tesseract-OCR installation instructions for your platform.

To disable image extraction, add the following to kwargs and pass it directly to the RAG component:

kwargs = {"extract_images": False}
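
The two options can be combined in one place, as in the sketch below. `preprocessing_kwargs` is a hypothetical helper that uses only the `extract_images` and `tesseract_path` keys shown above; it assumes the Tesseract path is irrelevant once image extraction is disabled.

```python
from typing import Optional


def preprocessing_kwargs(
    extract_images: bool = True,
    tesseract_path: Optional[str] = None,
) -> dict:
    """Build the keyword arguments to forward to RAG(...)."""
    kwargs: dict = {}
    if not extract_images:
        # Skip OCR entirely; no Tesseract path is needed (assumption).
        kwargs["extract_images"] = False
    elif tesseract_path is not None:
        kwargs["tesseract_path"] = tesseract_path
    return kwargs


kwargs = preprocessing_kwargs(extract_images=False)
print(kwargs)
# Then, with the package installed:
# rag = RAG(path="./data/", preprocessing=PyMuPDFPreprocessing, **kwargs)
```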

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
