I use this project a lot for workshops. It contains utilities for splitters and tokenizers, plus a Weaviate client that I reuse often.

Project description

RAG4P - Retrieval Augmented Generation for Python

Welcome to the repository for our project RAG4P.org. This project is a Python implementation of the Retrieval Augmented Generation framework. It is simple to use and understand, yet powerful enough to extend for your own projects.

Setting up your environment

Python

We encourage you to use a Python environment manager. Poetry makes it easy to work with multiple Python versions and packages, and lets you switch versions per project. Read the Poetry documentation to learn how to set up your environment. If Poetry is not installed yet, follow the Poetry installation page for your platform.

Setting the right version of Python for the project

poetry env use 3.10

Install dependencies

poetry install

Run the project

poetry run python rag4p/app_step1_chunking_strategy.py

Without Poetry

Set up your venv

python3 -m venv venv
source venv/bin/activate

Install dependencies

pip install -r poetry-requirements.txt

Loading API keys

We try to keep access to Large Language Models and vector stores to a minimum. You do not need an LLM or a vector store to learn about the elements of the Retrieval Augmented Generation framework, except for the generation part. In the workshop we use the LLM of OpenAI, which is not freely accessible: it requires an API key. If you don't have your own key, we will provide one.

Please use this key for the workshop only and limit the number of requests, otherwise we may be blocked for exceeding our limits. The API key is obtained from an encrypted remote file. You can also use your own key if you have one.

Environment variables

The easiest way to load the API key is to set an environment variable for each required key. In Python we prefer the file .env.properties in the root of the project with the following properties:

OPENAI_API_KEY=sk-...
WEAVIATE_API_KEY=...
WEAVIATE_URL=...
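As an illustration of how such a file can end up in the environment, here is a minimal sketch using only the standard library. It is not the loader that RAG4P ships; the function names `parse_properties` and `load_properties` are hypothetical, and the file format is assumed to be plain KEY=VALUE lines with optional # comments.

```python
import os
from pathlib import Path


def parse_properties(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines, # comments and lines without '='."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_properties(path: str = ".env.properties") -> dict[str, str]:
    """Read the file and copy its entries into os.environ.

    Variables that are already set in the environment are left unchanged.
    """
    values = parse_properties(Path(path).read_text(encoding="utf-8"))
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values
```

After calling load_properties() from the project root, the keys can be read with os.environ["OPENAI_API_KEY"] and so on.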

If you do not have your own key, you can load ours. The key is stored in a remote location. You need the .env.properties file in the root of the project with the following line:

SECRET_KEY=...

This secret key is used to decrypt the remote file containing the API keys. We will provide the value for this key during the workshop.
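The choice between your own key and the shared workshop key can be sketched as a small decision function. This is an assumption about the order of preference, not the project's actual logic, and `key_source` is a hypothetical name. The decryption of the remote file is deliberately left out because its details are not described here.

```python
def key_source(properties: dict[str, str]) -> str:
    """Decide where the OpenAI key should come from.

    Returns "own" when OPENAI_API_KEY is present, "remote" when only
    SECRET_KEY is present (the key must then be fetched from the encrypted
    remote file), and "missing" when neither is set.
    """
    if properties.get("OPENAI_API_KEY"):
        return "own"
    if properties.get("SECRET_KEY"):
        return "remote"
    return "missing"
```

Passing dict(os.environ) as the argument applies the same check to the current environment.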

Using Ollama

Ollama is a simple way to run a Language Model on your local machine. How fast it runs depends on your machine and the chosen model. I will not go into much detail on how to install it; you can find the installation instructions on the Ollama Downloads page.

At the moment we prefer the model Phi 3. You can learn more about the model on the Ollama Models page. Many other models are available as well, and you can try them out yourself. Make sure you pull a model before you use it. You can also use Ollama for the embeddings. We advise pulling the model nomic-embed-text for this purpose.

ollama pull phi3
ollama pull nomic-embed-text
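Once both models are pulled, they can be called from Python through Ollama's local REST API. The sketch below uses only the standard library and assumes Ollama is running on its default address, http://localhost:11434, with the /api/generate and /api/embeddings endpoints. The helper names `build_request`, `generate` and `embed` are hypothetical and are not part of RAG4P.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed: Ollama's default local address


def build_request(endpoint: str, payload: dict) -> urllib.request.Request:
    """Create a JSON POST request for a local Ollama endpoint."""
    return urllib.request.Request(
        f"{OLLAMA_URL}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "phi3") -> str:
    """Ask the model for a single, non-streamed completion."""
    request = build_request(
        "/api/generate", {"model": model, "prompt": prompt, "stream": False}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Return the embedding vector for one piece of text."""
    request = build_request("/api/embeddings", {"model": model, "prompt": text})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["embedding"]
```

Both generate and embed need a running Ollama server; without one, urlopen raises a URLError.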

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rag4p-0.5.1.tar.gz (25.1 kB)

Uploaded Source

Built Distribution

rag4p-0.5.1-py3-none-any.whl (49.3 kB)

Uploaded Python 3

File details

Details for the file rag4p-0.5.1.tar.gz.

File metadata

  • Download URL: rag4p-0.5.1.tar.gz
  • Size: 25.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/23.5.0

File hashes

Hashes for rag4p-0.5.1.tar.gz
Algorithm Hash digest
SHA256 b2255b0ba2b063e7252fcb0bb096d29eadfa9d110b29c4e5f7c433e81e23348b
MD5 fee5d7099202a72a1f4ad5f9f526049b
BLAKE2b-256 6d720d9866f6658a22953f10584e5d71842c9c54125515c4bfcfda85aacae831

File details

Details for the file rag4p-0.5.1-py3-none-any.whl.

File metadata

  • Download URL: rag4p-0.5.1-py3-none-any.whl
  • Size: 49.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/23.5.0

File hashes

Hashes for rag4p-0.5.1-py3-none-any.whl
Algorithm Hash digest
SHA256 da17e25c5cdbcb5793f3dc1838d8db8443fc5e409b6238b74b8796047f9f3984
MD5 c29b0e3d1b6fc8bcaf8f3aff6d57615d
BLAKE2b-256 c465939d16c83fe9601e0dc6cce32edc56b6e2971ff7730134dec26719b869e8
