Boosting Retrieval-Augmented Generation with Context Reordering

RAGBoost

Getting Started

Install from Source

Python >=3.10

git clone https://github.com/SecretSettler/RAGBoost.git
cd RAGBoost
pip install -e .

Using Docker

Docker Image

docker pull seanjiang01/prompt-planner:v0.0.1
docker run -d --gpus all --name prompt-planner-container seanjiang01/prompt-planner:v0.0.1
docker exec -it prompt-planner-container bash

Build from scratch

git clone https://github.com/SecretSettler/RAGBoost.git
cd RAGBoost
docker build -t ragboost .
docker run -d --gpus all --name ragboost-container ragboost
docker exec -it ragboost-container bash

Note: This build is slow because FlashAttention is compiled from scratch.

Quick usage

Offline

Build dataset and index from scratch


1. BM25, MultihopRAG

RECOMMENDED: Please refer to examples/construct_rag_data/multihopRAG_bm25.py

Quick running command:

docker run -d \
  --name elasticsearch \
  -p 9200:9200 \
  -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  -e "xpack.security.http.ssl.enabled=false" \
  -e "xpack.security.transport.ssl.enabled=false" \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  docker.elastic.co/elasticsearch/elasticsearch:8.18.2

python examples/construct_rag_data/multihopRAG_bm25.py
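
Because the container is started with -d, Elasticsearch may still be booting when the indexing script runs. The following is a minimal sketch, not part of RAGBoost, that polls the standard Elasticsearch `_cluster/health` endpoint until the node reports a usable status. The function name `wait_for_elasticsearch`, its parameters, and the injectable `fetch` hook are assumptions introduced for illustration; the URL matches the port mapping in the command above.

```python
import json
import time
import urllib.request


def wait_for_elasticsearch(url="http://localhost:9200", timeout_s=120.0,
                           interval_s=2.0, fetch=None):
    """Poll <url>/_cluster/health until the status is yellow or green.

    `fetch` is a hypothetical hook (URL -> response text) so the polling
    logic can be exercised without a running server.
    """
    if fetch is None:
        def fetch(target):
            with urllib.request.urlopen(target, timeout=5) as resp:
                return resp.read().decode("utf-8")

    deadline = time.monotonic() + timeout_s
    while True:
        try:
            status = json.loads(fetch(f"{url}/_cluster/health")).get("status")
            if status in ("yellow", "green"):
                return status
        except (OSError, ValueError):
            # OSError: connection refused or reset while the node boots.
            # ValueError: the response was not valid JSON yet.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Elasticsearch at {url} not ready after {timeout_s}s")
        time.sleep(interval_s)
```

Calling wait_for_elasticsearch() between the docker run command and the indexing script is one way to avoid connection errors on a cold start.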

2. Faiss, MultihopRAG

RECOMMENDED: Please refer to examples/construct_rag_data/multihopRAG_faiss.py

Quick running command:

python -m sglang.launch_server \
  --model-path Alibaba-NLP/gte-Qwen2-7B-instruct \
  --is-embedding \
  --host 0.0.0.0 \
  --port 30000

python examples/construct_rag_data/multihopRAG_faiss.py

Generating and selecting plan

RECOMMENDED: Please refer to examples/planner/generate_plan.py

Quick running command:

python examples/planner/generate_plan.py --prompts_path <PATH-TO-YOUR-RETRIEVAL-OUTPUT> --output_path <YOUR-PLAN-SAVE-PATH>

Launch inference

RECOMMENDED: Please refer to examples/planner/sglang_inference.py

Quick running command:

python -m sglang.launch_server --model-path Qwen/Qwen3-32B --port 30000 --tp-size 4 --reasoning-parser qwen3 --enable-metrics --schedule-policy lpm

python examples/planner/sglang_inference.py --model Qwen/Qwen3-32B --plan_path <YOUR-PLAN-SAVE-PATH> --corpus_path <PATH-TO-YOUR-CORPUS-WITH-CTX-LENGTH>

Data Format:

If you have your own data, format it like the example below. Only the JSONL format is currently supported (one JSON object per line; the example is pretty-printed for readability). Each JSON object should contain at least these attributes:

{
    "qid": 0,
    "text": "Is the sky blue?",
    "answer": ["Yes", "Yes the sky is blue"],
    "top_k_doc_id": [2, 8, 1, 10]
}

Pass the path to this file as --prompts_path for plan generation and selection.
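
A quick pre-flight check can catch malformed records before plan generation. The sketch below is not part of RAGBoost; it only checks for the four attributes shown in the example above. The helper names `validate_record` and `validate_jsonl_lines` are hypothetical, and the assumption that `answer` and `top_k_doc_id` are lists is inferred from the example rather than from a documented schema.

```python
import json

# Attribute names taken from the example record above.
REQUIRED_FIELDS = ("qid", "text", "answer", "top_k_doc_id")
# Assumption inferred from the example: these two attributes are lists.
LIST_FIELDS = ("answer", "top_k_doc_id")


def validate_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in record]
    for name in LIST_FIELDS:
        if name in record and not isinstance(record[name], list):
            problems.append(f"{name} should be a list")
    return problems


def validate_jsonl_lines(lines):
    """Check an iterable of JSONL lines; return (line_number, problem) pairs."""
    errors = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            errors.append((lineno, "not a JSON object"))
            continue
        errors.extend((lineno, problem) for problem in validate_record(record))
    return errors
```

An open file handle can be passed directly, for example `with open(path, encoding="utf-8") as f: errors = validate_jsonl_lines(f)`, where `path` is the file you intend to pass as --prompts_path. An empty result means every line has the expected attributes.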

Roadmap

  • Implement the Group Aware RR scheduler
  • Support online inference
  • Implement a faster prefix cache
  • Support multi-modality models
