SciPhi R2R

R2R

R2R (RAG to Riches) is a Python framework designed for the rapid construction and deployment of production-ready Retrieval-Augmented Generation (RAG) systems. This semi-opinionated framework accelerates the transition from experimental stages to production-grade RAG systems.

Quick Install:

Install R2R directly using pip:

pip install r2r

Full Install:

Follow these steps to ensure a smooth setup:

  1. Install Poetry:

    • Before installing the project, make sure you have Poetry on your system. If not, visit the official Poetry website for installation instructions.
  2. Clone and Install Dependencies:

    • Clone the project repository and navigate to the project directory:
      git clone git@github.com:SciPhi-AI/r2r.git
      cd r2r
      
    • Install the project dependencies with Poetry:
      # See pyproject.toml for available extras
      # use "all" to include every optional dependency
      poetry install --extras "parsing"
      
  3. Configure Environment Variables:

    • Set your cloud provider secrets in a .env file. At a minimum, you will need an OpenAI API key.
    • For vector storage, the framework currently supports pgvector and Qdrant, with plans to extend coverage.
    • If starting from the example, copy .env.example to .env and fill in your own values:
      cp .env.example .env
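
Before starting the application, it can help to confirm that the required settings are actually present in the environment. The following is a minimal sketch, not part of R2R: the helper `missing_settings` is hypothetical, and the variable name `OPENAI_API_KEY` is an assumption, so check .env.example for the names the project really reads.

```python
import os


def missing_settings(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Assumed variable name; .env.example lists the names the project expects.
REQUIRED = ["OPENAI_API_KEY"]

if __name__ == "__main__":
    missing = missing_settings(REQUIRED)
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

Note that this checks the process environment only; values stored in .env must be loaded into the environment first for the check to see them.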
      

Basic Examples

The project includes several basic examples that demonstrate application deployment and standalone usage of the embedding and RAG pipelines:

  1. app.py: This example runs the main application, which includes the ingestion, embedding, and RAG pipelines served via FastAPI.

    poetry run uvicorn examples.basic.app:app
    
  2. run_client.py: Run this example after starting the main application. It demonstrates uploading text entries and a PDF with the Python client, and it shows the built-in document-level and user-level management features.

    poetry run python -m examples.client.test_client
    
  3. run_pdf_chat.py: A more comprehensive example that demonstrates uploading and chatting with a more realistic PDF.

    # Ingest pdf
    poetry run python -m examples.pdf_chat.run_demo ingest
    
    # Ask a question
    poetry run python -m examples.pdf_chat.run_demo search "What are the key themes of Meditations?"
    
  4. web: A companion web application that provides visual intelligence for the framework.

    cd web && pnpm install
    # Serve the web app
    pnpm dev
    

Demonstration

https://github.com/SciPhi-AI/r2r/assets/68796651/c648ab67-973a-416a-985e-2eafb0a41ef0

Community

Join our Discord server!

Core Abstractions

The framework primarily revolves around three core abstractions:

  • The Ingestion Pipeline: Facilitates the preparation of embeddable 'Documents' from various data formats (json, txt, pdf, html, etc.). The abstraction can be found in ingestion.py.

  • The Embedding Pipeline: Manages the transformation of text into stored vector embeddings, interacting with embedding and vector database providers through a series of steps (e.g., extract_text, transform_text, chunk_text, embed_chunks). The abstraction can be found in embedding.py.

  • The RAG Pipeline: Works similarly to the embedding pipeline but incorporates an LLM provider to produce text completions. The abstraction can be found in rag.py.
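
To make the step sequence concrete, here is a toy sketch of an embedding pipeline built around the step names mentioned above. It is illustrative only and does not reproduce the classes in embedding.py: `Document`, `ToyEmbeddingPipeline`, `toy_embed`, the `chunk_size` parameter, and the in-memory `store` dictionary are all hypothetical stand-ins for a real embedding provider and vector database.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Document:
    doc_id: str
    text: str


class ToyEmbeddingPipeline:
    """Illustrative only: mirrors the step names above, not R2R's classes."""

    def __init__(self, embed_fn: Callable[[str], List[float]], chunk_size: int = 40):
        self.embed_fn = embed_fn
        self.chunk_size = chunk_size
        # Stand-in for a vector database: maps "doc_id:chunk_index" to a vector.
        self.store: Dict[str, List[float]] = {}

    def extract_text(self, document: Document) -> str:
        return document.text

    def transform_text(self, text: str) -> str:
        # Collapse runs of whitespace into single spaces.
        return " ".join(text.split())

    def chunk_text(self, text: str) -> List[str]:
        # Fixed-size character chunks; real pipelines often split on tokens.
        return [
            text[i : i + self.chunk_size]
            for i in range(0, len(text), self.chunk_size)
        ]

    def embed_chunks(self, chunks: List[str]) -> List[List[float]]:
        return [self.embed_fn(chunk) for chunk in chunks]

    def run(self, document: Document) -> int:
        """Run every step in order and return the number of stored chunks."""
        text = self.transform_text(self.extract_text(document))
        chunks = self.chunk_text(text)
        vectors = self.embed_chunks(chunks)
        for index, vector in enumerate(vectors):
            self.store[f"{document.doc_id}:{index}"] = vector
        return len(chunks)


def toy_embed(chunk: str) -> List[float]:
    # Placeholder "embedding": chunk length and vowel count.
    vowels = sum(ch in "aeiou" for ch in chunk.lower())
    return [float(len(chunk)), float(vowels)]
```

A RAG pipeline in this style would add a retrieval step over `store` and pass the retrieved chunks to an LLM provider; that part is omitted here because it depends on provider details the text does not specify.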

Each pipeline incorporates a logging database for operation tracking and observability.
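
As a rough illustration of what step-level logging to a database can look like, the sketch below records each pipeline step in SQLite using the standard library. It is an assumption-laden example, not R2R's logging implementation: the `PipelineLog` class, the `logs` table, and its columns are hypothetical.

```python
import sqlite3
import time


class PipelineLog:
    """Hypothetical step logger backed by SQLite (in-memory by default)."""

    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS logs ("
            "run_id TEXT, step TEXT, detail TEXT, created_at REAL)"
        )

    def record(self, run_id: str, step: str, detail: str = "") -> None:
        # One row per executed step, timestamped for later inspection.
        self.conn.execute(
            "INSERT INTO logs VALUES (?, ?, ?, ?)",
            (run_id, step, detail, time.time()),
        )
        self.conn.commit()

    def steps(self, run_id: str) -> list:
        # Return step names for one run in insertion order.
        rows = self.conn.execute(
            "SELECT step FROM logs WHERE run_id = ? ORDER BY rowid",
            (run_id,),
        )
        return [row[0] for row in rows]
```

A pipeline could call `record` at the start of each step with a shared `run_id`, so that a single run can be reconstructed afterwards with `steps`.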
