
A modular Python package for implementing Retrieval Augmented Generation chains for the chATLAS project.


chATLAS_Chains

This package implements and benchmarks various Retrieval Augmented Generation (RAG) chains for use in the chATLAS project.

Installation

From PyPI

pip install chATLAS-Chains

From source

We recommend using uv:

cd chATLAS_Chains
uv sync

Environment variables

These are required for the following use cases:

  1. Using an OpenAI LLM
export CHATLAS_OPENAI_KEY="your api key"
  2. Using LLMs via the Groq API
export CHATLAS_GROQ_BASE_URL="http://cs-513-ml003:3000"
export CHATLAS_GROQ_KEY="your groq api key"

Note: the API address is local to the CERN network. If you are not at CERN, you can forward it like so:

ssh -L 3000:cs-513-ml003:3000 "$LXPLUS_USERNAME"@lxplus.cern.ch
export CHATLAS_GROQ_BASE_URL="http://localhost:3000"
  3. Using LLMs via CERN's LiteLLM API. Here is the repo and some setup instructions for reference.
export CHATLAS_CHAINS_LITELLM_KEY="your litellm key"
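
Before building a chain, it can help to check which of the variables above are actually set. The following is a minimal sketch using only the standard library. The variable names come from this README, but the grouping into "providers" and the helper names configured_providers and missing_variables are illustrative assumptions, not part of chATLAS_Chains.

```python
import os

# Variable names are taken from this README. The provider labels and the
# grouping below are illustrative assumptions, not part of the package.
REQUIRED_VARS = {
    "openai": ["CHATLAS_OPENAI_KEY"],
    "groq": ["CHATLAS_GROQ_BASE_URL", "CHATLAS_GROQ_KEY"],
    "litellm": ["CHATLAS_CHAINS_LITELLM_KEY"],
}


def configured_providers(env=None):
    """Return the providers whose variables are all set and non-empty."""
    env = os.environ if env is None else env
    return [
        provider
        for provider, names in REQUIRED_VARS.items()
        if all(env.get(name) for name in names)
    ]


def missing_variables(provider, env=None):
    """Return the variables that are still unset for one provider."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS[provider] if not env.get(name)]
```

Calling configured_providers() with no argument reads the real environment; passing a plain dict makes the check easy to exercise in isolation.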

Supported Chains

More details here

  • chains.basic.basic_retrieval_chain
  • chains.advanced.advanced_rag

Model Configuration in Chains

Supported chain constructors now accept a typed chat_model_kwargs argument for model options (for example: temperature, max_tokens, service_provider, api_key, base_url, proxy).

from chATLAS_Chains.chains.basic import basic_retrieval_chain

chain = basic_retrieval_chain(
    prompt=...,
    vectorstore=...,
    model_name="gpt-4o-mini",
    chat_model_kwargs={"temperature": 0.1, "max_tokens": 512},
)
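
One way to keep model options consistent across several chains is to define shared defaults and override them per chain. The sketch below is a hypothetical stand-in: the key names come from the list above, but the ChatModelKwargs class, the value types, and the merge_model_kwargs helper are assumptions made for illustration and may not match the package's actual typed definition.

```python
from typing import TypedDict


class ChatModelKwargs(TypedDict, total=False):
    """Hypothetical stand-in for the typed options; every key is optional.

    Key names follow this README; the value types are assumptions.
    """

    temperature: float
    max_tokens: int
    service_provider: str
    api_key: str
    base_url: str
    proxy: str


def merge_model_kwargs(
    defaults: ChatModelKwargs, overrides: ChatModelKwargs
) -> ChatModelKwargs:
    """Return a new dict where values in overrides win over defaults."""
    merged: ChatModelKwargs = {**defaults, **overrides}
    return merged


# Shared defaults, with a per-chain override of the temperature only.
shared = ChatModelKwargs(temperature=0.1, max_tokens=512)
deterministic = merge_model_kwargs(shared, {"temperature": 0.0})
```

The merged dictionary could then be passed as chat_model_kwargs, assuming the constructor accepts a plain dict as in the example above.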

Forwarding vectorstore connections

If not on the CERN network, you can forward the connection to the postgres servers with:

ssh -N \
  -L 6624:dbod-chatlas.cern.ch:6624 \
  -L 6606:dbod-chatlas-cds.cern.ch:6606 \
  "$LXPLUS_USERNAME"@lxplus.cern.ch
export CHATLAS_PORT_FORWARDING=1

You can then use the helper function get_vectorstore.
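
If a connection fails, a quick first check is whether the forwarded ports are listening locally. The following is a minimal sketch using only the standard library. The ports are the ones from the ssh command above, while the helper names port_is_open and tunnel_status are illustrative and not part of chATLAS_Chains.

```python
import socket

# Local ports opened by the ssh tunnel described in this README.
TUNNEL_PORTS = {"dbod-chatlas": 6624, "dbod-chatlas-cds": 6606}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution errors.
        return False


def tunnel_status(host="localhost"):
    """Map each tunnel name to whether its local port accepts connections."""
    return {name: port_is_open(host, port) for name, port in TUNNEL_PORTS.items()}
```

A False entry only shows that nothing is listening on that local port. It says nothing about database credentials, which are checked later when the connection is made.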

Testing Environment Variables

Some tests are DB-backed integration tests (tests/test_chains.py, tests/test_conversational.py, tests/test_search.py). If the DB/test environment is not configured, these tests are skipped by tests/conftest.py.

tests/conftest.py now uses explicit controls:

  • CHATLAS_PORT_FORWARDING: enable localhost DB tunnels (1, true, True)
  • CHATLAS_DB_PASSWORD
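
The controls above can be read with a few lines of standard-library Python. This is a sketch of the general idea, not the contents of tests/conftest.py: the accepted values for CHATLAS_PORT_FORWARDING come from the list above, while the rule that a missing CHATLAS_DB_PASSWORD leads to a skip, and the helper names, are assumptions.

```python
import os

# Values listed in this README as enabling localhost DB tunnels.
TRUTHY_VALUES = {"1", "true", "True"}


def port_forwarding_enabled(env=None):
    """Return True if CHATLAS_PORT_FORWARDING holds one of the accepted values."""
    env = os.environ if env is None else env
    return env.get("CHATLAS_PORT_FORWARDING", "") in TRUTHY_VALUES


def db_skip_reason(env=None):
    """Return a skip message, or None if DB-backed tests could run.

    Assumed rule: the tests need a non-empty CHATLAS_DB_PASSWORD.
    """
    env = os.environ if env is None else env
    if not env.get("CHATLAS_DB_PASSWORD"):
        return "CHATLAS_DB_PASSWORD is not set"
    return None
```

In a pytest fixture, a non-None reason could be handed to pytest.skip(reason) so that DB-backed tests are reported as skipped rather than failed.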

Local Example (with DB tunnels)

export CHATLAS_DB_PASSWORD="..."
export CHATLAS_PORT_FORWARDING=1
unset GITLAB_PAT

uv run pytest -q

Postgres

If you want to run a local Postgres server, you need to install PostgreSQL, which also provides the psql client. Instructions for macOS using Homebrew follow:

Software install

brew install postgresql
brew services start postgresql
brew install pgvector
brew unlink pgvector && brew link pgvector

Create a user

psql -h localhost -U postgres
ALTER USER postgres WITH PASSWORD 'Set_your_password_here';
CREATE EXTENSION IF NOT EXISTS vector;

CHANGELOG

0.1.7

Support for CERN-hosted LiteLLM models

Multi-turn conversational RAG with (local) conversation history

Bugfixes

0.1.6

Fix bug in reciprocal_rank_fusion which caused it to silently return only one document

Add fallback_models optional argument to advanced_rag

0.1.5

Fix missing retry_config argument in advanced_rag caused by early PyPI upload

0.1.4

Support for Groq-hosted models

Some new functions that go beyond the "basic RAG" workflow:

  • Reciprocal Rank Fusion: chATLAS_Chains.documents.rrf.reciprocal_rank_fusion
  • Document reranking via the Pinecone API: chATLAS_Chains.documents.rerank.rerank_documents
  • Query rewriting step: chATLAS_Chains.query.query_rewriting.rewrite_query

These are all usable via the new chain chATLAS_Chains.chains.advanced.advanced_rag
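
For readers unfamiliar with reciprocal rank fusion, the general technique can be sketched in a few lines. This is the textbook formulation, in which each document scores the sum of 1 / (k + rank) over the rankings it appears in. It is not the package's implementation, and the function name rrf_sketch, the use of plain string ids, and the default k=60 are assumptions made for illustration.

```python
from collections import defaultdict


def rrf_sketch(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. Every document seen in any list is returned.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties keep first-seen order (stable sort).
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


fused = rrf_sketch([["a", "b", "c"], ["b", "c", "a"]])
```

Here "b" ranks first because it sits near the top of both lists, and all three documents are present in the fused result.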

Added unit tests to gitlab CI/CD pipeline

0.1.3

Fixing imports

Changed output format of basic_retrieval_chain (docs key is now a list of Document objects, rather than a dict)

Unit tests for basic_retrieval_chain

0.1.2

Unit tests

First Langgraph chain

0.1.1

Initial Release


📄 License

chATLAS_Chains is released under the Apache v2.0 license.


Made with ❤️ by the ATLAS Collaboration

For questions and support, please contact

