RAG-powered document Q&A — Streamlit + OpenAI Agents SDK

RAQA

Retrieval-Augmented Question-Answering

Retrieval-augmented, pip-installable, CLI-based question answering over arbitrary document collections.

Usage

Installation

pip install raqa

Locally

pip install -e .

Run

BASH via Python interpreter

  1. Build DB

    python cli.py build DATABASE_NAME PATH/TO/FOLDER/WITH/MARKDOWNS

  2. Chat

    python cli.py chat DATABASE_NAME

  3. One-shot retrieval

    python cli.py search DATABASE_NAME "what is retrieval augmented generation?"

  4. Rebuild and chat

    python cli.py rebuild-and-chat DATABASE_NAME PATH/TO/FOLDER/WITH/MARKDOWNS

  5. Get stats

    python cli.py stats

  6. List databases

    python cli.py list

BASH natively

raqa build DATABASE_NAME PATH/TO/FOLDER/WITH/MARKDOWNS
raqa chat DATABASE_NAME
raqa search DATABASE_NAME "what is RAG?"
raqa list [DATABASE_NAME]
raqa stats [DATABASE_NAME]
raqa rebuild-and-chat DATABASE_NAME PATH/TO/FOLDER/WITH/MARKDOWNS

Python

Build database

from db import VectorDB
from config import MARKDOWN_ROOT

db = VectorDB()
db.build(MARKDOWN_ROOT)

Run

from agent import RAGAgent

agent = RAGAgent()
agent.chat()

Build instructions

  1. After making any changes, update pyproject.toml accordingly.
  2. Build the package before uploading: cd raqa; python -m build
  3. Upload the package to PyPI: python -m twine upload --repository {pypi|testpypi} dist/*

Steps 2 and 3 can be done automatically by running make publish.

Related or comparable projects

  1. PyRAG. Difference: it focuses on SingleStore.
  2. ragger-simple. Difference: it uses Qdrant and requires a Qdrant API key in addition to an LLM API key. RAQA requires only the LLM API key and uses an open-source tech stack for everything else.

Next steps

Real tool-calling (instead of implicit RAG)

Define OpenAI tool:

{
  "name": "search_docs",
  "description": "...",
  "parameters": {
    "type": "object",
    "properties": { "query": { "type": "string" } },
    "required": ["query"]
  }
}
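As a rough illustration of what explicit tool-calling could look like, the sketch below wraps a tool definition like the one above in the nested form accepted by the `tools` parameter of the OpenAI Chat Completions API, and routes a tool call to a Python function. It makes no network calls. The `search_docs` body, the `CORPUS` list, and the `dispatch_tool_call` helper are hypothetical stand-ins, not part of RAQA; a real version would query the vector database instead.

```python
import json

# Tool specification; "parameters" must be a JSON Schema object.
SEARCH_DOCS_TOOL = {
    "type": "function",
    "function": {
        "name": "search_docs",
        "description": "...",  # left elided, as in the original
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

# Hypothetical toy corpus standing in for the real document database.
CORPUS = [
    "Retrieval-augmented generation grounds answers in documents.",
    "BM25 is a lexical ranking function.",
]


def search_docs(query: str) -> list[str]:
    """Hypothetical retriever: return passages containing any query word."""
    words = query.lower().split()
    return [text for text in CORPUS if any(w in text.lower() for w in words)]


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the tool the model asked for and return a JSON string result."""
    handlers = {"search_docs": search_docs}
    if name not in handlers:
        return json.dumps({"error": f"unknown tool: {name}"})
    # The model supplies its arguments as a JSON-encoded string.
    args = json.loads(arguments_json)
    return json.dumps({"results": handlers[name](**args)})


if __name__ == "__main__":
    print(dispatch_tool_call("search_docs", '{"query": "retrieval"}'))
```

In a full loop, the string returned by `dispatch_tool_call` would be sent back to the model as the tool result, so the model decides when to search rather than retrieval happening on every turn.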

Hybrid search

Combine BM25 (rank-bm25) with embedding-based retrieval.
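One common way to merge a lexical ranking with an embedding ranking is reciprocal rank fusion (RRF), sketched below. This is only one possible fusion method, chosen here for illustration; the project does not state which one it intends to use. The two input rankings are hard-coded, hypothetical document IDs; in practice the first would come from rank-bm25 and the second from the vector database. The constant `k=60` is a conventional default, not a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in;
    documents ranked highly by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings, best match first.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
embedding_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
print(fused)
```

Because RRF works on ranks rather than raw scores, it avoids having to normalize BM25 scores and cosine similarities onto a common scale.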
