RAG CLI

A project to show good CLI practices with a fully fledged RAG system.

Python version PyPI version GNU GPL

Installation

pip install rag-cli

Features

  • CLI tooling for RAG
  • Embedder (Ollama)
  • Vector store (Qdrant)

Usage

Docker

If you don't have running instances of Qdrant and Ollama, you can start both with the provided docker-compose file.

docker-compose up --build -d

This will start Ollama on http://localhost:11434 and Qdrant on http://localhost:6333.
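The containers may need a few seconds before they accept connections. A minimal sketch of a readiness check follows, assuming curl is installed and that both services answer a plain HTTP GET on the URLs above. The helper name wait_for_url is hypothetical and not part of rag-cli.

```shell
# Hypothetical helper (not part of rag-cli): poll a URL until it answers
# with a successful HTTP status, or give up after a number of attempts.
# Usage: wait_for_url <url> [attempts]
wait_for_url() {
  wfu_url="$1"
  wfu_attempts="${2:-30}"
  wfu_i=1
  while [ "$wfu_i" -le "$wfu_attempts" ]; do
    # --fail makes curl exit non-zero on HTTP error statuses (4xx/5xx).
    if curl --silent --fail --output /dev/null --max-time 2 "$wfu_url"; then
      return 0
    fi
    sleep 1
    wfu_i=$((wfu_i + 1))
  done
  return 1
}
```

For example, wait_for_url http://localhost:11434 && wait_for_url http://localhost:6333 returns successfully once both services respond.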

Development

This project uses a dev container, which is the easiest way to set up a consistent development environment. Dev containers provide all the necessary tools, dependencies, and configuration, so you can focus on coding right away.

Using Dev Containers

To get started:

  1. Open the project in Visual Studio Code.
  2. Open the Command Palette (Ctrl+Shift+P on Windows/Linux, Cmd+Shift+P on Mac) and run the command Dev Containers: Reopen in Container (named Remote-Containers: Reopen in Container in older versions of the extension).
  3. VS Code will build and start the dev container, providing access to the project's codebase and dependencies.

Other editors may have similar functionality but this project is optimised for Visual Studio Code.

Embedder

Before running this command, make sure you have a running instance of Ollama and the nomic-embed-text:v1.5 model is available:

ollama pull nomic-embed-text:v1.5
rag-cli embed --ollama-url http://localhost:11434 --file <INPUT_FILE>

You can alternatively use stdin to pass the text:

cat <INPUT_FILE> | rag-cli embed --ollama-url http://localhost:11434

Vector store

rag-cli vector-store \
--qdrant-url http://localhost:6333 \
--collection-name <COLLECTION_NAME> \
--data '{<JSON_DATA>}' \
--embedding <EMBEDDING_FILE>

You can alternatively use stdin to pass embeddings:

cat <INPUT_FILE> | \
rag-cli vector-store \
--qdrant-url http://localhost:6333 \
--collection-name <COLLECTION_NAME> \
--data '{<JSON_DATA>}'

RAG Chat

rag-cli rag \
--ollama-embedding-url http://localhost:11434 \
--ollama-chat-url http://localhost:11435 \
--qdrant-url http://localhost:6333 \
--collection-name nomic-embed-text-v1.5 \
--top-k 5 \
--min-similarity 0.5 \
--file <INPUT_FILE>

You can alternatively use stdin to pass the text:

cat <INPUT_FILE> | \
rag-cli rag \
--ollama-embedding-url http://localhost:11434 \
--ollama-chat-url http://localhost:11435 \
--qdrant-url http://localhost:6333 \
--collection-name nomic-embed-text-v1.5 \
--top-k 5 \
--min-similarity 0.5

End-to-end Pipeline For Storing Embeddings

Here is an example of an end-to-end pipeline for storing embeddings. It takes the following steps:

  • Get a random Wikipedia article
  • Embed the article
  • Store the embedding in Qdrant

Before running the pipeline make sure you have the following installed:

sudo apt-get update && sudo apt-get install parallel jq curl

Also make sure that the data/articles and data/embeddings directories exist:

mkdir -p data/articles data/embeddings

Then run the pipeline:

bash scripts/run_pipeline.sh

Parallel Pipeline

The script scripts/run_pipeline.sh can be run in parallel with GNU Parallel to speed up the process.

parallel -j 5 -n0 bash scripts/run_pipeline.sh ::: {0..10}
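On systems without GNU Parallel, a similar effect can be sketched with xargs. This assumes an xargs that supports -P (the GNU and BSD versions do, although POSIX does not require it). The helper name run_times is hypothetical.

```shell
# Hypothetical helper: run a command <count> times with at most
# <max_jobs> copies running at once.
# Usage: run_times <count> <max_jobs> <command> [args...]
run_times() {
  rt_count="$1"
  rt_max_jobs="$2"
  shift 2
  # Each line from seq triggers one run. -I{} consumes the number, so the
  # command runs without it unless its arguments contain {}.
  seq 1 "$rt_count" | xargs -P "$rt_max_jobs" -I{} "$@"
}
```

run_times 11 5 bash scripts/run_pipeline.sh mirrors the GNU Parallel call above, which runs the script once for each of the 11 values in {0..10}.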

Examples

Get 10 Random Wikipedia Articles

parallel -n0 -j 10 '
curl -L -s "https://en.wikipedia.org/api/rest_v1/page/random/summary" | \
jq -r ".title, .description, .extract" | \
tee data/articles/$(cat /proc/sys/kernel/random/uuid).txt
' ::: {1..10}
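The file name above is read from /proc/sys/kernel/random/uuid, which only exists on Linux. A minimal sketch of a fallback follows, assuming uuidgen is available on other systems (it ships with macOS and with util-linux). new_id is a hypothetical helper name.

```shell
# Hypothetical helper: print a fresh UUID, preferring the Linux kernel
# source used above and falling back to uuidgen where /proc is missing.
new_id() {
  if [ -r /proc/sys/kernel/random/uuid ]; then
    cat /proc/sys/kernel/random/uuid
  elif command -v uuidgen >/dev/null 2>&1; then
    # Some uuidgen versions print upper case; normalise to lower case.
    uuidgen | tr '[:upper:]' '[:lower:]'
  else
    echo "no UUID source found" >&2
    return 1
  fi
}
```

In a plain loop the target becomes "data/articles/$(new_id).txt". To call the helper from inside a GNU Parallel command string in bash, export it first with export -f new_id.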

Run Embedder On All Articles

parallel '
rag-cli embed --ollama-url http://localhost:11434 --file {1} 2>> output.log | \
jq ".embedding" | \
tee data/embeddings/$(basename {1} .txt) 1> /dev/null
' ::: $(find data/articles/*.txt)
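Because tee creates its output file even when the embed step produces no output, a failed run can leave an empty file in data/embeddings. A small check, sketched here with a hypothetical helper named missing_embeddings, lists the articles that still lack a non-empty embedding file. It assumes the naming scheme used above (the article name without its .txt suffix).

```shell
# Hypothetical helper: print every article that has no non-empty
# embedding file of the same base name.
# Usage: missing_embeddings <articles_dir> <embeddings_dir>
missing_embeddings() {
  for me_article in "$1"/*.txt; do
    # Skip the literal pattern when the glob matched nothing.
    [ -e "$me_article" ] || continue
    me_name=$(basename "$me_article" .txt)
    # -s is true only if the file exists and is larger than zero bytes.
    if [ ! -s "$2/$me_name" ]; then
      printf '%s\n' "$me_article"
    fi
  done
}
```

Running missing_embeddings data/articles data/embeddings after the embedder prints the files that need another pass and prints nothing when every article has an embedding.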

Store All Embeddings In Qdrant

parallel rag-cli vector-store --qdrant-url http://localhost:6333 --collection-name nomic-embed-text-v1.5 2>> output.log ::: $(find data/embeddings/*)

Run RAG Chat On A Query

echo "Who invented the blue LED?" | \
rag-cli rag \
--ollama-embedding-url http://localhost:11434 \
--ollama-chat-url http://localhost:11435 \
--qdrant-url http://localhost:6333 \
--collection-name nomic-embed-text-v1.5 \
--top-k 5 \
--min-similarity 0.5 \
2>> output.log

This example requires that articles relevant to the query have already been embedded and stored in Qdrant. The single-article pipeline in the next section does this for the query above.

End-to-end Pipeline For A Single Article

wikipedia_data=$(curl -L -s "https://en.wikipedia.org/api/rest_v1/page/summary/Shuji_Nakamura") && \
payload_data=$(jq "{title: .title, description: .description, extract: .extract}" <(echo "$wikipedia_data")) && \
text_to_embed=$(jq -r ".title, .description, .extract" <(echo "$wikipedia_data")) && \
echo "$text_to_embed" | \
rag-cli embed --ollama-url http://localhost:11434 | \
jq -r ".embedding" | \
rag-cli vector-store \
  --qdrant-url http://localhost:6333 \
  --collection-name nomic-embed-text-v1.5 \
  --data "$payload_data"
