Just RAG
Just do RAG without pain.
This library simplifies the process of using Retrieval-Augmented Generation (RAG). Focus on the result you want to achieve and let the library handle the rest.
- Based on LangChain / LangGraph
- Has a unified input/output signature across different RAG strategies (see the sketch below)
- Supports offline / local inference through LlamaCPP & langchain_llamacpp_chat_model
If you find this project useful, please give it a star ⭐!
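The unified signature means strategies are interchangeable at the call site: every strategy is constructed with an llm and a retriever, compiled with .build(), invoked with a dict containing "input", and returns a dict whose "result" key holds the answer. A minimal sketch of that contract, assembled from the examples below:

from just_rag import ClassicRag, SelfRagGraphBuilder
from langchain_openai import ChatOpenAI
from langchain_community.retrievers import WikipediaRetriever

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
retriever = WikipediaRetriever(top_k_results=6, doc_content_chars_max=1000)

# Swapping strategies leaves the call sites untouched: both classes accept
# the same constructor arguments and produce chains with the same signature.
for strategy in (ClassicRag, SelfRagGraphBuilder):
    chain = strategy(llm=llm, retriever=retriever).build()
    result = chain.invoke({"input": "How fast are cheetahs?"})
    print(result["result"])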
Full Stack Rag
Persist & sync documents when files change.
# Imports assumed for this snippet; see the full example linked below for the exact paths.
from just_rag import CitedClassicRag, JustChromaVectorStoreBuilder
from langchain_openai import ChatOpenAI

openai_llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

builder = JustChromaVectorStoreBuilder(
    collection_name="droits_canadiens",
    file_or_urls=["./tests/assets/Charte canadienne des droits et libertés.html"],  # Any file or URL
    record_manager_db_url="sqlite:///_record_manager_cache.sql",  # Any SQLAlchemy compatible URL
    chroma_persist_directory="./tests_chroma_db",  # Any local ChromaDB persist directory
)
retriever = builder.get_retriever()
chain = CitedClassicRag(llm=openai_llm, retriever=retriever).build()
result = chain.invoke(
    {
        # "As a citizen, do I have the right to enter and leave Canada
        # whenever I want? Answer yes or no."
        "input": "En tant que citoyen, est-ce que j'ai le droit d'entrer et de sortir du Canada quand je veux? Répondre oui ou non.",
    }
)
assert "non" in result["result"].result.lower()
Full example: tests/test_functional/test_functional_full_stack_rag.py
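Because the builder is keyed to a record manager database, re-running it is meant to be incremental. Assuming it delegates to LangChain's indexing API (suggested by the record_manager_db_url parameter, not confirmed here), unchanged documents are skipped and only new or modified content is re-embedded. A sketch of the expected workflow:

# First run: indexes the file and persists embeddings under ./tests_chroma_db.
retriever = builder.get_retriever()

# ... the source HTML file changes on disk ...

# Subsequent runs reuse the record manager DB (_record_manager_cache.sql),
# so only the changed content should be re-embedded; everything else is
# served from the persisted Chroma collection.
retriever = builder.get_retriever()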
Remote Inference
Classic Rag
from just_rag import ClassicRag
from langchain_openai import ChatOpenAI
from langchain_community.retrievers import WikipediaRetriever
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
retriever = WikipediaRetriever(top_k_results=6, doc_content_chars_max=1000)
chain = ClassicRag(llm=llm, retriever=retriever).build()
result = chain.invoke({"input": "How fast are cheetahs?"})
print(result["result"])
Full example: tests/test_functional/test_functional_classic_rag.py
Classic Rag with Citation
from typing import List

from pydantic import Field  # or langchain_core.pydantic_v1, depending on your LangChain version
from just_rag import CitedClassicRag
from just_rag.citation import BaseCitation, BaseCitedAnswer
from langchain_openai import ChatOpenAI
from langchain_community.retrievers import WikipediaRetriever

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
retriever = WikipediaRetriever(top_k_results=6, doc_content_chars_max=1000)
class Citation(BaseCitation):
    page_content: str = Field(
        ...,
        description="Page content from the specified source that justifies the answer.",
    )
    title: str = Field(
        ...,
        description="The TITLE quote from the specified source that justifies the answer.",
    )

class CitedAnswer(BaseCitedAnswer[Citation]):
    citations: List[Citation] = Field(
        ..., description="Citations from the given sources that justify the answer."
    )
chain = CitedClassicRag(llm=llm, retriever=retriever, schema=CitedAnswer).build()
result = chain.invoke({"input": "How fast are cheetahs?"})
print(result["result"].result)
print(result["result"].citations)
Full example: tests/test_functional/test_functional_cited_classic_rag.py
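The structured answer is an instance of your CitedAnswer schema, so citations are plain typed objects. A small usage sketch (source_id is inherited from BaseCitation, as the Self RAG example below shows):

answer = result["result"]
for citation in answer.citations:
    # Each citation exposes the fields declared on the schema, plus the
    # source_id inherited from BaseCitation.
    print(citation.source_id, citation.title)
    print(citation.page_content)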
Agentic RAG - Self Rag (with Citation)
from typing import List

from pydantic import Field  # or langchain_core.pydantic_v1, depending on your LangChain version
from just_rag import SelfRagGraphBuilder
from just_rag.citation import BaseCitation, BaseCitedAnswer
from langchain_openai import ChatOpenAI
from langchain_community.retrievers import WikipediaRetriever

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
retriever = WikipediaRetriever(top_k_results=6, doc_content_chars_max=1000)
class Citation(BaseCitation):
    page_content: str = Field(
        ...,
        description="Page content from the specified source that justifies the answer.",
    )
    title: str = Field(
        ...,
        description="The TITLE quote from the specified source that justifies the answer.",
    )

class CitedAnswer(BaseCitedAnswer[Citation]):
    citations: List[Citation] = Field(
        ..., description="Citations from the given sources that justify the answer."
    )
chain = SelfRagGraphBuilder(llm=llm, retriever=retriever, schema=CitedAnswer).build()
result = chain.invoke({"input": "How fast are cheetahs?"})
print(result["result"])
print(result["documents"][0].metadata['title'])
print(result["documents"][0].metadata['source'])
print(result["documents"][0].metadata['summary'])
print(result["result"].citations[0].source_id)
print(result["result"].citations[0].title)
print(result["result"].citations[0].page_content)
Full example: tests/test_functional/test_functional_self_rag.py
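Since SelfRagGraphBuilder is built on LangGraph (per the feature list above), the chain returned by .build() should also expose LangGraph's streaming interface, which is handy for watching the self-reflective loop (retrieval grading, query rewriting) run node by node. A hedged sketch; the node names and intermediate state keys are not documented here:

# Stream state updates from the graph instead of waiting for the final answer.
for step in chain.stream({"input": "How fast are cheetahs?"}):
    # In LangGraph's default "updates" mode, each step maps a node name to
    # the piece of state that node just produced.
    for node_name, state_delta in step.items():
        print(node_name, list(state_delta.keys()))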
Local Inference
Using LlamaCPP & langchain_llamacpp_chat_model
import os

from just_rag import SelfRagGraphBuilder
from langchain_llamacpp_chat_model import LlamaChatModel
from llama_cpp import Llama
from langchain_community.retrievers import WikipediaRetriever
model_path = os.path.join(
os.path.expanduser("~/.cache/lm-studio/models"),
"lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF/Meta-Llama-3-8B-Instruct-Q4_K_M.gguf",
)
llama = Llama(
verbose=True,
model_path=model_path,
n_ctx=8192, # Meta-Llama-3-8B has a maximum context size of 8192
n_batch=512,
n_gpu_layers=-1, # -1 is all on GPU
n_threads=4,
use_mlock=True,
chat_format="chatml-function-calling",
)
llm = LlamaChatModel(llama=llama, temperature=0.0)
# The retrieved documents must fit within the local LLM's context window.
# Rough character-based heuristic: top_k_results * doc_content_chars_max < n_ctx
# (here 6 * 1000 = 6000 characters, i.e. roughly 1500-2000 tokens, well under 8192).
retriever = WikipediaRetriever(top_k_results=6, doc_content_chars_max=1000)
chain = SelfRagGraphBuilder(llm=llm, retriever=retriever).build()
result = chain.invoke({"input": "How fast are cheetahs?"})
print(result["result"])
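To turn the character-based heuristic above into an exact check, you can count real tokens with the loaded model's own tokenizer before invoking the chain. A small sketch; fits_in_context is a hypothetical helper, not part of just_rag:

def fits_in_context(docs, llama, n_ctx, reserve_for_answer=1024):
    """Check that the retrieved documents leave room for the prompt and answer."""
    text = "\n\n".join(doc.page_content for doc in docs)
    n_tokens = len(llama.tokenize(text.encode("utf-8")))
    return n_tokens + reserve_for_answer <= n_ctx

docs = retriever.invoke("How fast are cheetahs?")
assert fits_in_context(docs, llama, n_ctx=8192)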