RAG-powered document Q&A — Streamlit + OpenAI Agents SDK

RAQA

Retrieval-Augmented Question-Answering

Retrieval-augmented, pip-installable, CLI-based question answering over arbitrary document collections.

Usage

Installation

pip install raqa

Locally (editable install from a source checkout)

pip install -e .

Run

Bash, via the Python interpreter

  1. Build DB

    python cli.py build DATABASE_NAME PATH/TO/FOLDER/WITH/MARKDOWNS

  2. Chat

    python cli.py chat DATABASE_NAME

  3. One-shot retrieval

    python cli.py search DATABASE_NAME "what is retrieval augmented generation?"

  4. Rebuild and chat

    python cli.py rebuild-and-chat DATABASE_NAME PATH/TO/FOLDER/WITH/MARKDOWNS

  5. Get stats

    python cli.py stats

  6. List databases

    python cli.py list

Bash, via the installed raqa command

raqa build DATABASE_NAME PATH/TO/FOLDER/WITH/MARKDOWNS
raqa chat DATABASE_NAME
raqa search DATABASE_NAME "what is RAG?"
raqa list [DATABASE_NAME]
raqa stats [DATABASE_NAME]
raqa rebuild-and-chat DATABASE_NAME PATH/TO/FOLDER/WITH/MARKDOWNS
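
The non-interactive subcommands above can also be scripted from Python. The following is a minimal sketch, not part of RAQA itself: the helper names raqa_command and run_raqa are hypothetical, and it assumes the raqa executable is on PATH and prints its results to standard output.

```python
import subprocess


def raqa_command(subcommand, *args):
    """Build the argv list for a raqa subcommand (hypothetical helper)."""
    return ["raqa", subcommand, *(str(a) for a in args)]


def run_raqa(subcommand, *args):
    """Run raqa and return its stdout as text.

    Assumes raqa is installed and on PATH. Raises
    subprocess.CalledProcessError if the command exits with a non-zero status.
    """
    result = subprocess.run(
        raqa_command(subcommand, *args),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example usage (requires an installed raqa and an existing folder of Markdown files):
#   run_raqa("build", "docs", "PATH/TO/FOLDER/WITH/MARKDOWNS")
#   print(run_raqa("search", "docs", "what is RAG?"))
```

Passing the arguments as a list avoids shell quoting problems with multi-word queries. The chat subcommands are interactive, so they are better run directly in a terminal.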

Python

Build database

from db import VectorDB
from config import MARKDOWN_ROOT

db = VectorDB()
db.build(MARKDOWN_ROOT)

Run

from agent import RAGAgent

agent = RAGAgent()
agent.chat()

Build instructions

  1. If any changes are made, update pyproject.toml. At a minimum, bump the version, since PyPI rejects re-uploads of an existing version.
  2. Build the package before uploading: cd raqa; python -m build
  3. Upload the package to PyPI: python -m twine upload --repository {pypi|testpypi} dist/*

Steps 2 and 3 can be done automatically by running make publish.

Related or comparable projects

  1. PyRAG: focuses on SingleStore.
  2. ragger-simple: uses Qdrant and requires a Qdrant API key in addition to an LLM API key. RAQA requires only the LLM API key and uses an open-source stack for everything else.

Next steps

Real tool-calling (instead of implicit RAG)

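Define an OpenAI tool whose parameters are described as a JSON Schema object:

{
  "name": "search_docs",
  "description": "...",
  "parameters": {
    "type": "object",
    "properties": { "query": { "type": "string" } },
    "required": ["query"]
  }
}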
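
On the application side, explicit tool calling means the model returns a tool name plus JSON-encoded arguments, and the program runs the matching function and sends the result back. Below is a minimal sketch of that dispatch step only. The search_docs body is a toy keyword matcher standing in for RAQA's real retrieval, and TOOL_REGISTRY and dispatch_tool_call are hypothetical names. The exact envelope around the tool definition differs between OpenAI API surfaces, so check the one in use.

```python
import json

# Toy corpus standing in for a real vector database (assumption for illustration).
DOCS = [
    "Retrieval-augmented generation grounds answers in retrieved text.",
    "BM25 is a keyword ranking function.",
]


def search_docs(query: str) -> list[str]:
    """Hypothetical retrieval function: return docs containing any query word."""
    words = query.lower().split()
    return [doc for doc in DOCS if any(w in doc.lower() for w in words)]


# Maps tool names the model may request to local Python callables.
TOOL_REGISTRY = {"search_docs": search_docs}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the named tool with JSON-encoded arguments and return a JSON string."""
    if name not in TOOL_REGISTRY:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return json.dumps({"error": f"invalid arguments: {exc}"})
    return json.dumps({"results": TOOL_REGISTRY[name](**kwargs)})


print(dispatch_tool_call("search_docs", '{"query": "retrieval"}'))
```

Returning errors as JSON rather than raising lets the model see what went wrong and retry with corrected arguments.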

Hybrid search

Combine BM25 keyword scores (rank-bm25) with embedding similarity.
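
One common way to merge a keyword ranking with an embedding ranking is reciprocal rank fusion (RRF), which needs only the rank positions and not comparable scores. The sketch below is an illustration of that technique and not RAQA's planned implementation. The two input lists are assumed to come from a BM25 ranker and an embedding search respectively, and the document IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_ranking = ["a", "b", "c"]       # assumed output of a BM25 ranker
embedding_ranking = ["b", "c", "a"]  # assumed output of an embedding search
print(reciprocal_rank_fusion([bm25_ranking, embedding_ranking]))
# → ['b', 'a', 'c']
```

Because RRF ignores raw scores, BM25 scores and cosine similarities do not have to be normalized to a common scale before fusion.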
