An open-source tool to visualise your RAG documents 🔮.

Project description

RAGxplorer 🦙🦺

RAGxplorer is an interactive Streamlit tool that supports building Retrieval Augmented Generation (RAG) applications by visualizing document chunks and queries in the embedding space.
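To make "in the embedding space" concrete, here is a minimal sketch of ranking chunks by their similarity to a query. It assumes cosine similarity, a common choice that this README does not confirm, and it uses made-up 3-dimensional vectors in place of real embeddings. The function names are hypothetical and not part of RAGxplorer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_chunks(query_vec, chunk_vecs):
    """Return chunk indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, v) for v in chunk_vecs]
    return sorted(range(len(chunk_vecs)), key=lambda i: scores[i], reverse=True)


# Made-up vectors purely for illustration (real embeddings have hundreds of dimensions).
chunks = [[1.0, 0.0, 0.0], [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]]
query = [0.9, 0.1, 0.0]
ranking = rank_chunks(query, chunks)
```

Chunks that rank near the top are the ones that would appear closest to the query in a plot of the embedding space.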

[!NOTE] This is an experimental re-factored version.

Demo 🔎

Streamlit App

⚠️ Due to infrastructure limitations, this freely hosted demo may occasionally go down. For the best experience, clone this repo and run it locally.

Features ✨

  • Document Upload: Users can upload PDF documents.
  • Chunk Configuration: Options to configure the chunk size and overlap.
  • Choice of embedding model: all-MiniLM-L6-v2 or text-embedding-ada-002.
  • Vector Database Creation: Builds a vector database using Chroma.
  • Query Expansion: Generates sub-questions and hypothetical answers to enhance the retrieval process.
  • Interactive Visualization: Utilizes Plotly to visualise the chunks.
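As an illustration of what the chunk size and overlap settings control, the sketch below splits text into fixed-size character windows in which consecutive chunks share `chunk_overlap` characters. This is an assumption made for clarity: the README does not say how RAGxplorer splits text, and it may split on tokens or sentences instead. `chunk_text` is a hypothetical helper.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1)
# chunks == ["abcd", "defg", "ghij"]
```

A larger overlap repeats more context between neighbouring chunks at the cost of producing more chunks to embed and plot.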

Local Installation ⚙️

To run RAGxplorer, ensure you have Python installed, and then install the necessary dependencies:

pip install -r requirements-local-deployment.txt

[!TIP] ⚠️ Do not use requirements.txt for a local install. That file exists so the free Streamlit deployment can run, and it includes an additional pysqlite3-binary dependency.

⚠️ If it helps with troubleshooting, this application was built using Python 3.11.

Usage 🏎️

  1. Set up OPENAI_API_KEY (required) and ANYSCALE_API_KEY (only if you need Anyscale). Copy the .streamlit/secrets.example.toml file to .streamlit/secrets.toml and fill in the values.
  2. To start the application, run:
    streamlit run app.py
    
  3. If you run locally, you may need to comment out or remove lines 5-7 in app.py:
    __import__('pysqlite3')
    import sys
    sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')
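As an alternative to editing app.py by hand, the swap could be wrapped in a try/except so that it only applies when pysqlite3 is installed. This is a sketch of one possible approach, not what app.py currently does.

```python
import sys

try:
    # Only swap in pysqlite3 when it is installed (e.g. on the hosted demo).
    __import__("pysqlite3")
    sys.modules["sqlite3"] = sys.modules.pop("pysqlite3")
except ImportError:
    # Local installs without pysqlite3-binary keep the standard library sqlite3.
    pass

import sqlite3  # resolves to pysqlite3 if the swap happened, else the stdlib module
```

With this guard, the same file runs in both environments, and local users do not need the pysqlite3-binary dependency.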
    

[!NOTE] This repo is currently linked to the streamlit demo, and these lines were added due to the runtime in the free streamlit deployment env. See here.
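For step 1 of the usage instructions, a small check like the one below can catch a missing key before the app starts making API calls. It is a hypothetical helper, not part of RAGxplorer, and it assumes the keys sit at the top level of secrets.toml. It accepts any mapping, so a plain dict works for testing, and `st.secrets` could be passed inside the Streamlit app.

```python
from typing import List, Mapping

# Key names taken from the usage instructions; only the first is required.
REQUIRED_KEYS = ("OPENAI_API_KEY",)
OPTIONAL_KEYS = ("ANYSCALE_API_KEY",)


def missing_required_keys(secrets: Mapping[str, str]) -> List[str]:
    """Return the required keys that are absent or blank in the given mapping."""
    return [key for key in REQUIRED_KEYS if not str(secrets.get(key, "")).strip()]


# Placeholder values for illustration only; never commit real keys.
example = {"OPENAI_API_KEY": "sk-placeholder", "ANYSCALE_API_KEY": ""}
missing = missing_required_keys(example)
```

An empty result means every required key has a value; the optional Anyscale key is allowed to stay blank.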

Contributing 👋

Contributions to RAGxplorer are welcome. Please read our contributing guidelines (WIP) for details.

License 👀

This project is licensed under the MIT license - see the LICENSE file for details.

Acknowledgments 💙

  • DeepLearning.AI and Chroma for the inspiration and code labs in their Advanced Retrieval course.
  • The Streamlit community for the support and resources.

