Boosting Retrieval-Augmented Generation with Context Reordering
RAGBoost
Getting Started
Install from Source
Requires Python >= 3.10.
git clone https://github.com/SecretSettler/RAGBoost.git
cd RAGBoost
pip install -e .
Using Docker
Docker Image
docker pull seanjiang01/prompt-planner:v0.0.1
docker run -d --gpus all --name prompt-planner-container seanjiang01/prompt-planner:v0.0.1
docker exec -it prompt-planner-container bash
Build from scratch
git clone https://github.com/SecretSettler/RAGBoost.git
cd RAGBoost
docker build -t ragboost .
docker run -d --gpus all --name ragboost-container ragboost
docker exec -it ragboost-container bash
Note: This build is slow because FlashAttention is built from source.
Quick usage
Offline
Build dataset and index from scratch
1. BM25, MultihopRAG
RECOMMENDED: Please refer to examples/construct_rag_data/multihopRAG_bm25.py
Quick running command:
docker run -d \
--name elasticsearch \
-p 9200:9200 \
-p 9300:9300 \
-e "discovery.type=single-node" \
-e "xpack.security.enabled=false" \
-e "xpack.security.http.ssl.enabled=false" \
-e "xpack.security.transport.ssl.enabled=false" \
-e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
docker.elastic.co/elasticsearch/elasticsearch:8.18.2
python examples/construct_rag_data/multihopRAG_bm25.py
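The Elasticsearch container can take a little while to accept connections after `docker run` returns, so the indexing script may fail if it is started immediately. The following is a minimal sketch, not part of RAGBoost, that polls Elasticsearch's standard `/_cluster/health` endpoint until the node reports `yellow` or `green`. The function names, the retry count, and the delay are illustrative assumptions; the URL matches the port mapping in the command above.

```python
import json
import time
import urllib.request


def fetch_health(base_url):
    """GET /_cluster/health and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=5) as resp:
        return json.load(resp)


def wait_for_elasticsearch(base_url="http://localhost:9200", attempts=30,
                           delay=2.0, fetch=fetch_health, sleep=time.sleep):
    """Poll until the cluster reports yellow or green; return True on success.

    `fetch` and `sleep` are injectable only so the loop can be exercised
    without a running server (an assumption of this sketch).
    """
    for _ in range(attempts):
        try:
            status = fetch(base_url).get("status")
        except (OSError, ValueError):
            status = None  # not accepting connections yet, or body not JSON
        if status in ("yellow", "green"):
            return True
        sleep(delay)
    return False
```

Calling `wait_for_elasticsearch()` before `python examples/construct_rag_data/multihopRAG_bm25.py` would block until the node is ready, or return `False` after roughly a minute with the defaults shown.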
2. Faiss, MultihopRAG
RECOMMENDED: Please refer to examples/construct_rag_data/multihopRAG_faiss.py
Quick running command:
python -m sglang.launch_server \
--model-path Alibaba-NLP/gte-Qwen2-7B-instruct \
--is-embedding \
--host 0.0.0.0 \
--port 30000
python examples/construct_rag_data/multihopRAG_faiss.py
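Before building the Faiss index, it can help to confirm that the embedding server answers requests. The sketch below is an assumption-laden smoke test, not code from the RAGBoost repository: it relies on SGLang's OpenAI-compatible `/v1/embeddings` endpoint and the usual response shape (`data[i].embedding`). The helper names `build_payload`, `extract_embeddings`, and `embed` are hypothetical.

```python
import json
import urllib.request

MODEL = "Alibaba-NLP/gte-Qwen2-7B-instruct"  # same model as the launch command


def build_payload(texts, model=MODEL):
    """Build the JSON body for an OpenAI-style embeddings request."""
    return {"model": model, "input": list(texts)}


def extract_embeddings(response):
    """Pull the embedding vectors out of a parsed response, in input order."""
    items = sorted(response["data"], key=lambda item: item.get("index", 0))
    return [item["embedding"] for item in items]


def embed(texts, base_url="http://localhost:30000"):
    """POST texts to /v1/embeddings and return one vector per text."""
    request = urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=json.dumps(build_payload(texts)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as resp:
        return extract_embeddings(json.load(resp))
```

With the server from the command above running, `embed(["Is the sky blue?"])` should return a list containing a single vector; if it raises a connection error, the server is likely still loading the model.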
Generating and selecting plan
RECOMMENDED: Please refer to examples/planner/generate_plan.py
Quick running command:
python examples/planner/generate_plan.py --prompts_path <PATH-TO-YOUR-RETRIEVAL-OUTPUT> --output_path <YOUR-PLAN-SAVE-PATH>
Launch inference
RECOMMENDED: Please refer to examples/planner/sglang_inference.py
Quick running command:
python -m sglang.launch_server --model-path Qwen/Qwen3-32B --port 30000 --tp-size 4 --reasoning-parser qwen3 --enable-metrics --schedule-policy lpm
python examples/planner/sglang_inference.py --model Qwen/Qwen3-32B --plan_path <YOUR-PLAN-SAVE-PATH> --corpus_path <PATH-TO-YOUR-CORPUS-WITH-CTX-LENGTH>
Data Format:
To use your own data, format it like the example below. Only JSONL is currently supported, and each JSON object must contain at least these fields:
{
"qid": 0,
"text": "Is the sky blue?",
"answer": ["Yes", "Yes the sky is blue"],
"top_k_doc_id": [2, 8, 1, 10]
}
Pass this file as --prompts_path when generating and selecting plans.
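A malformed line in the prompts file is easier to catch before plan generation than during it. As a rough sketch, assuming only the four fields shown in the example are required, the checker below reports lines that are not valid JSON objects or that lack a field. The names `REQUIRED_FIELDS`, `missing_fields`, and `check_jsonl` are illustrative and do not come from RAGBoost.

```python
import json

# Fields listed in the data-format example above.
REQUIRED_FIELDS = ("qid", "text", "answer", "top_k_doc_id")


def missing_fields(record):
    """Return the required fields that are absent from one parsed record."""
    return [name for name in REQUIRED_FIELDS if name not in record]


def check_jsonl(path):
    """Return (line_number, problem) pairs for a JSONL file; empty if clean."""
    problems = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as error:
                problems.append((line_number, f"invalid JSON: {error.msg}"))
                continue
            if not isinstance(record, dict):
                problems.append((line_number, "not a JSON object"))
                continue
            missing = missing_fields(record)
            if missing:
                problems.append((line_number, "missing " + ", ".join(missing)))
    return problems
```

Running `check_jsonl` on the file you intend to pass as `--prompts_path` returns an empty list when every line has the expected fields. Note that JSONL requires each object on a single line, so the pretty-printed example above would need to be collapsed to one line in the actual file.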
Roadmap
- Implement the Group Aware RR scheduler
- Support online inference
- Implement a faster prefix cache
- Support multimodal models