A package for creating retrieval-augmented generation apps

Maintainers

jdraines

Project description

Ragmatic

Ragmatic is a low-code tool for creating RAG applications.

Installation

To install Ragmatic, clone this repo and use pip:

git clone https://github.com/jdraines/ragmatic.git
pip install -e "./ragmatic[hugging-face]"

Try it out

Note: To run these examples, you will need an OpenAI API key. We will set that key as an environment variable.

The examples/ directory contains a couple of examples to get you started. The first one to look at is examples/winnie_the_pooh.

First, cd into that directory:

cd examples/winnie_the_pooh

Ragmatic splits a RAG application into two main parts: a pipeline that generates embeddings from text, and a RAG agent that uses those embeddings to answer questions.
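
To make the two-part split concrete, here is a minimal, self-contained sketch in plain Python. It does not use Ragmatic's own code: the toy word-count `embed` function, the JSON file store, and the names `run_pipeline` and `retrieve` are assumptions made for illustration only. A real RAG agent would also pass the retrieved text to an LLM, which is omitted here.

```python
import json
import math
import tempfile
from collections import Counter
from pathlib import Path


def embed(text: str) -> dict[str, float]:
    """Toy embedding: L2-normalized word counts (a stand-in for a real encoder)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def run_pipeline(documents: dict[str, str], store_path: Path) -> None:
    """Part 1: encode every document and persist the embeddings."""
    embeddings = {name: embed(text) for name, text in documents.items()}
    store_path.write_text(json.dumps(embeddings))


def retrieve(query: str, store_path: Path) -> str:
    """Part 2: load the stored embeddings and return the best-matching document name."""
    embeddings = json.loads(store_path.read_text())
    query_vec = embed(query)

    def score(doc_vec: dict[str, float]) -> float:
        # Dot product of two normalized sparse vectors.
        return sum(w * doc_vec.get(word, 0.0) for word, w in query_vec.items())

    return max(embeddings, key=lambda name: score(embeddings[name]))


# Hypothetical two-document corpus, purely for demonstration.
docs = {
    "chapter_a": "pooh visits owl and finds the tail",
    "chapter_b": "piglet meets a heffalump",
}
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "embeddings.json"
    run_pipeline(docs, store)          # the embedding pipeline
    best = retrieve("who finds the tail", store)  # the query side
```

The point of the split is that the first part runs once per corpus, while the second part can run many times against the saved embeddings.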

We'll start by running the pipeline with the following command:

ragmatic run-pipeline -v local_documents_path=./book

This command will generate embeddings for the text in the book directory. The embeddings will be saved in a newly created ./data/ directory.

To run the RAG query itself, you will need to set the OPENAI_API_KEY environment variable to your OpenAI API key. For this example, you can do that with the following command:

export OPENAI_API_KEY=your_openai_api_key
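
If you drive these examples from a Python script, it can help to fail early when the key is missing. The helper below is a small sketch using only the standard library; `require_env` is a hypothetical name and is not part of Ragmatic.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the query")
    return value


# Report the status instead of crashing, so this is safe to run anywhere.
try:
    require_env("OPENAI_API_KEY")
    print("OPENAI_API_KEY is set")
except RuntimeError as err:
    print(err)
```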

Now you can run the RAG agent with the following command:

ragmatic rag-query \
    -v local_documents_path=./book \
    -v n_nearest=3 \
    --query "Who found Eeyore's tail? Can you quote the text that describes the event?"

A few things to note about the command above. We've set local_documents_path to the book directory again. We've also set n_nearest to 3. This is a fairly low number, but our corpus has only 11 documents, and we want to demonstrate that the RAG agent can still find relevant documents even with so few. Finally, we've set the query parameter to the question we want to ask the RAG agent.
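
As a rough illustration of what a setting like n_nearest controls, the sketch below ranks documents by cosine similarity to a query vector and keeps the top few. This README does not say which similarity measure Ragmatic uses, so cosine similarity, the function names, and the tiny 2-dimensional vectors are all assumptions.

```python
import heapq
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query_vec: list[float], doc_vecs: dict[str, list[float]], n_nearest: int) -> list[str]:
    """Return the names of the n_nearest most similar documents, best first."""
    return heapq.nlargest(
        n_nearest,
        doc_vecs,
        key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
    )


# Hypothetical embeddings for a four-document corpus.
doc_vecs = {
    "doc_1": [1.0, 0.0],
    "doc_2": [0.9, 0.1],
    "doc_3": [0.0, 1.0],
    "doc_4": [-1.0, 0.0],
}
top = nearest([1.0, 0.0], doc_vecs, n_nearest=3)
```

Asking for more neighbors than the corpus contains simply returns every document, which is why a small value makes sense for an 11-document corpus.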

You should get a response that's something like:

Winnie-the-Pooh is the one who found Eeyore's tail. The text that describes the event is:

```txt
"Owl," said Pooh solemnly, "you made a mistake. Somebody did want it."

"Who?"

"Eeyore. My dear friend Eeyore. He was—he was fond of it."

"Fond of it?"

"Attached to it," said Winnie-the-Pooh sadly.
...
So with these words he unhooked it, and carried it back to Eeyore; and when Christopher Robin had nailed it on in its right place again, Eeyore frisked about the forest, waving his tail so happily that Winnie-the-Pooh came over all funny, and had to hurry home for a little snack of something to sustain him. And, wiping his mouth half an hour afterwards, he sang to himself proudly:
Who found the Tail?
"I," said Pooh,
"At a quarter to two
(Only it was quarter to eleven really),
I found the Tail!"
```

Digging deeper

In the Winnie the Pooh example, we ran our RAG agent using the default preset. (You can choose a preset by setting the --preset parameter to one of the presets in the src/ragmatic/cli/configuration/presets/preset_factory.py file. The local_docs preset is the default, and there is currently one other, pycode.)

Ragmatic offers far more configuration, however, if you use a config file. You can see an example of a project that uses one in the examples/hello_printer directory. The config file in that directory is example-config.yaml.

The Ragmatic configuration file is a YAML file that has the following top-level keys:

project_name: ""
components: {}
pipelines: {}
rag_query_command: {}

The project_name key is a string that is the name of the project. The other keys deserve some attention.
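
The top-level layout can be checked with a few lines of Python once the YAML has been loaded into a dict. This is a sketch, not Ragmatic's own validation: the function name is hypothetical, and it assumes the three other keys hold mappings (as the `{}` placeholders above suggest) and that every key is optional.

```python
from typing import Any

# Expected type of each top-level key, taken from the skeleton above.
EXPECTED_TYPES: dict[str, type] = {
    "project_name": str,
    "components": dict,
    "pipelines": dict,
    "rag_query_command": dict,
}


def check_top_level(config: dict[str, Any]) -> list[str]:
    """Return a list of problems with the top-level layout (empty if none)."""
    problems = []
    for key in config:
        if key not in EXPECTED_TYPES:
            problems.append(f"unknown top-level key: {key}")
    for key, expected in EXPECTED_TYPES.items():
        if key in config and not isinstance(config[key], expected):
            problems.append(f"{key} should be a {expected.__name__}")
    return problems


# A config dict shaped like the skeleton above (as if already parsed from YAML).
config = {"project_name": "demo", "components": {}, "pipelines": {}, "rag_query_command": {}}
problems = check_top_level(config)
```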

components

Ragmatic allows you to configure a number of reusable components. By declaring the components once in the components section, you can use them in multiple pipelines (or multiple times in the same pipeline). The component types you can configure are:

document_sources - literally, the sources of the documents
storage - storage abstractions that allow you to save and load data between pipeline steps, and for longer-term persistence as well
llms - adapters for different language models. (Currently openai and anthropic are supported.)
summarizers - a component that allows you to generate new documents from existing ones (e.g. "describe this code file") using an LLM
encoders - a component that allows you to generate embeddings from text
rag_agents - a component that puts together the RAG query process. In this component, you reference a previously defined LLM, storage adapter, and encoder.
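
The "declare once, reference by name" idea behind rag_agents can be sketched as a lookup over the components section. Everything below is illustrative: the component names, the `type` field, the `llm`/`storage`/`encoder` reference fields, and `resolve_agent` are assumptions and may not match Ragmatic's actual schema (see examples/hello_printer/example-config.yaml for the real one).

```python
from typing import Any

# Hypothetical components section, as a dict parsed from YAML.
components: dict[str, dict[str, Any]] = {
    "llms": {"my_llm": {"type": "openai"}},
    "storage": {"my_store": {"type": "local"}},
    "encoders": {"my_encoder": {"type": "hugging-face"}},
    "rag_agents": {
        "my_agent": {"llm": "my_llm", "storage": "my_store", "encoder": "my_encoder"}
    },
}

# Which section each reference field of a rag agent points into (assumed names).
REFERENCE_SECTIONS = {"llm": "llms", "storage": "storage", "encoder": "encoders"}


def resolve_agent(components: dict[str, dict[str, Any]], agent_name: str) -> dict[str, Any]:
    """Replace each name an agent references with the component declared under that name."""
    agent = components["rag_agents"][agent_name]
    resolved = {}
    for field_name, section in REFERENCE_SECTIONS.items():
        ref = agent[field_name]
        if ref not in components.get(section, {}):
            raise KeyError(f"rag agent {agent_name!r} references unknown {field_name} {ref!r}")
        resolved[field_name] = components[section][ref]
    return resolved


resolved = resolve_agent(components, "my_agent")
```

Because the agent only stores names, the same encoder or storage declaration can be shared by several agents and pipelines.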

pipelines

Ragmatic allows you to configure multiple pipelines and then invoke them with the ragmatic run-pipeline [PIPELINE_NAME] command. A pipeline is a series of Actions that are executed in order. The Actions that are currently available are:

summarize - use an LLM to summarize documents
encode - use an encoder model to generate embeddings from text
rag - use a RAG agent to answer questions about the documents
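
A pipeline as "a series of Actions executed in order" can be pictured as a loop over named steps that each transform a shared state. The sketch below is not Ragmatic's implementation: the state dict, the `run_steps` function, and the toy `summarize` and `encode` bodies (truncation and length counts instead of an LLM and an encoder model) are assumptions for illustration.

```python
from typing import Callable

State = dict
Action = Callable[[State], State]


def summarize(state: State) -> State:
    """Toy stand-in for an LLM summary: keep the first three words of each document."""
    summaries = {name: " ".join(text.split()[:3]) for name, text in state["documents"].items()}
    return {**state, "summaries": summaries}


def encode(state: State) -> State:
    """Toy stand-in for an encoder: embed summaries if present, else the raw documents."""
    source = state.get("summaries", state["documents"])
    embeddings = {
        name: [float(len(text.split())), float(len(text))] for name, text in source.items()
    }
    return {**state, "embeddings": embeddings}


ACTIONS: dict[str, Action] = {"summarize": summarize, "encode": encode}


def run_steps(steps: list[str], state: State) -> State:
    """Execute the named actions in order, feeding each one the previous result."""
    for step in steps:
        state = ACTIONS[step](state)
    return state


result = run_steps(
    ["summarize", "encode"],
    {"documents": {"hello.py": "prints a friendly greeting to the user"}},
)
```

Ordering matters here: running encode after summarize embeds the summaries rather than the raw documents.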

rag_query_command

This key allows you to configure the ragmatic rag-query command by specifying which RAG agent and which documents to use.

Running our hello_printer example

If you look at the configuration, you may notice that this RAG task is geared towards answering questions about a codebase. In order to do this efficiently, we need to first use an LLM to summarize the code files and then use an encoder to generate embeddings from the summaries. This is what the pipeline in the example-config.yaml file does. That pipeline is named ingest-python-codebase in the config file.

To run the pipeline:

cd into the examples/hello_printer directory
ensure that you have the OPENAI_API_KEY environment variable set, and
use the following command:

ragmatic run-pipeline ingest-python-codebase --config ./example-config.yaml

This will generate embeddings for the code files in the ./src directory. The embeddings will be saved in the ./data/ directory.

Now, to run a RAG query, use the following command:

ragmatic rag-query --config ./example-config.yaml \
    --query "What appears to be the main purpose of this code?"

What's next?

I'm hoping to add a FastAPI app that will serve the RAG agent.
I'd like to write DataSource and Storage component adapters that will allow the use of a warehouse for documents and a key-value store like Redis for embeddings.
To be viable at scale, Ragmatic will need to be stateful in order to allow DataSources to provide incremental updates to the pipeline.

Release history

0.2.0b1 (pre-release) - Aug 25, 2024
0.2.0b0 (pre-release) - Aug 25, 2024
0.1.0b1 (pre-release, this version) - Aug 17, 2024
0.1.0b0 (pre-release) - Aug 17, 2024

Download files

Source Distribution

ragmatic-0.1.0b1.tar.gz (28.9 kB)

Uploaded Aug 17, 2024 Source

Built Distribution

ragmatic-0.1.0b1-py3-none-any.whl (40.8 kB)

Uploaded Aug 17, 2024 Python 3

File details

Details for the file ragmatic-0.1.0b1.tar.gz.

File metadata

Download URL: ragmatic-0.1.0b1.tar.gz
Upload date: Aug 17, 2024
Size: 28.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/5.1.0 CPython/3.12.5

File hashes

Hashes for ragmatic-0.1.0b1.tar.gz
Algorithm	Hash digest
SHA256	`c8fccb94c22b8390de4e67d99480a34689f812abd1355e41e3d296c5c17863ba`
MD5	`1fb744756eea5a82a46a819d370e68f0`
BLAKE2b-256	`54b07415ed3b3979f466399ff5f3e022392a719c4004c62b750b7c3a5cb52f63`
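
If you download the source distribution manually, you can compare it against the SHA256 digest listed above. The sketch below uses only Python's standard hashlib module; the function names are illustrative, and the script assumes the file sits in the current directory under its published name.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for ragmatic-0.1.0b1.tar.gz.
EXPECTED_SDIST_SHA256 = "c8fccb94c22b8390de4e67d99480a34689f812abd1355e41e3d296c5c17863ba"


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with an expected hex digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Only checks the download if it is actually present.
sdist = Path("ragmatic-0.1.0b1.tar.gz")
if sdist.exists():
    print("hash OK" if matches(sdist, EXPECTED_SDIST_SHA256) else "hash MISMATCH")
```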

File details

Details for the file ragmatic-0.1.0b1-py3-none-any.whl.

File metadata

Download URL: ragmatic-0.1.0b1-py3-none-any.whl
Upload date: Aug 17, 2024
Size: 40.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/5.1.0 CPython/3.12.5

File hashes

Hashes for ragmatic-0.1.0b1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8850b2b6979cee0cbf98a40dbbf41ee01db44a727451976bac7308f06583446c`
MD5	`e027244ea540c8661d354d0ab6223608`
BLAKE2b-256	`b790341ee14a12db03e0911095b39b7099a17a0a4d666d991b0ccb94282afce4`
