An open-source tool to visualise your RAG documents 🔮.
Project description
RAGxplorer 🦙🦺
RAGxplorer is an interactive Streamlit tool that supports building Retrieval Augmented Generation (RAG) applications by visualizing document chunks and queries in the embedding space.
[!NOTE] This is an experimental refactored version.
Demo 🔎
⚠️ Due to infrastructure limitations, this freely hosted demo may occasionally go down. For the best experience, clone this repo and run it locally.
Features ✨
- Document Upload: Users can upload PDF documents.
- Chunk Configuration: Options to configure the chunk size and overlap.
- Choice of embedding model: all-MiniLM-L6-v2 or text-embedding-ada-002.
- Vector Database Creation: Builds a vector database using Chroma.
- Query Expansion: Generates sub-questions and hypothetical answers to enhance the retrieval process.
- Interactive Visualization: Utilizes Plotly to visualise the chunks.
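To make the chunk size and overlap options concrete, here is a minimal sketch of fixed-size character chunking with overlap. It is an illustration only, not RAGxplorer's actual splitter: the function name chunk_text and its parameters are hypothetical, and the app may split on tokens or sentences rather than characters.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so content cut at a
    boundary still appears whole in at least one neighbouring chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

A larger overlap produces more chunks (and more points in the visualisation) for the same document, because each window advances by only chunk_size minus chunk_overlap characters.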
Local Installation ⚙️
To run RAGxplorer, ensure you have Python installed, and then install the necessary dependencies:
pip install -r requirements-local-deployment.txt
[!TIP] ⚠️ Do not use requirements.txt. That file exists so that the free Streamlit deployment can run, and it includes an additional pysqlite3-binary dependency.
⚠️ If it helps with troubleshooting, this application was built using Python 3.11.
Usage 🏎️
- Set up OPENAI_API_KEY (required) and ANYSCALE_API_KEY (if you need Anyscale). Copy the .streamlit/secrets.example.toml file to .streamlit/secrets.toml and fill in the values.
- To start the application, run:
streamlit run app.py
- You may need to comment out or remove lines 5-7 in app.py:
__import__('pysqlite3')
import sys
sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')
[!NOTE] This repo is currently linked to the Streamlit demo, and these lines were added because of the runtime in the free Streamlit deployment environment. See here.
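As an alternative to commenting the lines out by hand, the swap can be guarded so that it only happens when pysqlite3 is installed. This is a sketch of one possible approach, not what app.py currently does; it assumes that a local install without pysqlite3-binary should simply keep the standard-library sqlite3.

```python
import sys

try:
    # pysqlite3 is only present when pysqlite3-binary was installed,
    # for example through requirements.txt on the hosted deployment.
    __import__("pysqlite3")
except ImportError:
    # Local installs without pysqlite3-binary keep the stdlib sqlite3.
    pass
else:
    # Make later "import sqlite3" statements resolve to pysqlite3.
    sys.modules["sqlite3"] = sys.modules.pop("pysqlite3")

import sqlite3  # pysqlite3 if the swap happened, otherwise the stdlib module
```

The guard must run before any library imports sqlite3, which is presumably why the original lines sit near the top of app.py.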
Contributing 👋
Contributions to RAGxplorer are welcome. Please read our contributing guidelines (WIP) for details.
License 👀
This project is licensed under the MIT license - see the LICENSE file for details.
Acknowledgments 💙
- DeepLearning.AI and Chroma for the inspiration and code labs in their Advanced Retrieval course.
- The Streamlit community for the support and resources.
Built Distribution
Hashes for ragxplorer-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 5b0cbc4aa144994854bc06d398519edbeb03f9ce5946860ee37330dbd78776db
MD5 | 2583971fa02bacc6e0e136c71ec99002
BLAKE2b-256 | d47fe4f66803518cfb3d1abf9a7181f574923e279da348d718ef626fe105d43a
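The SHA256 digest listed above can be checked against a downloaded wheel with the standard library. The sketch below is illustrative: the helper names sha256_of and matches_published_hash are hypothetical, and the local file path is an assumption about where the wheel was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for ragxplorer-0.1.0-py3-none-any.whl.
EXPECTED_SHA256 = "5b0cbc4aa144994854bc06d398519edbeb03f9ce5946860ee37330dbd78776db"


def sha256_of(path: Path, block_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in blocks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a local file's SHA256 digest with the published one."""
    return sha256_of(path) == expected.lower()
```

For example, matches_published_hash(Path("ragxplorer-0.1.0-py3-none-any.whl")) returns True only if the file in the current directory has the published digest.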