llm-workflow

llm-workflow is a lightweight framework for working with LLMs. It contains model classes (e.g. for OpenAI) as well as classes for constructing 'workflows' and 'agents'. The model classes track message history (memory), making it easy to have multi-turn conversations with LLMs.

Here is a quick example showing how to use the OpenAIChat class. More complex examples are shown below and in the examples folder.

from llm_workflow.openai import OpenAIChat

model = OpenAIChat()
model("What is the capital of France?")

The capital of France is Paris.

print(f"Total Cost:            ${model.cost:.5f}")
print(f"Total Tokens:          {model.total_tokens:,}")
print(f"Total Prompt Tokens:   {model.input_tokens:,}")
print(f"Total Response Tokens: {model.response_tokens:,}")

Total Cost:            $0.00004
Total Tokens:          31
Total Prompt Tokens:   24
Total Response Tokens: 7
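Because the model object tracks the conversation's message history, follow-up prompts can refer back to earlier exchanges. For example (an illustrative follow-up; the exact response will vary):

model("How many people live there?")  # a follow-up that relies on the tracked history of the Paris exchange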


A workflow is an object that executes a sequence of tasks. Each task is a callable that can optionally track history. The output of one task serves as the input to the next task in the workflow. Pretty simple.

The purpose of this library is to offer a simple pattern for developing LLM workflows. First, it reduces the need for users to write repetitive/boilerplate code. Second, by establishing a standardized interface for tasks (e.g. specifying how a task tracks history), a workflow can serve as a means of aggregating information from all tasks, such as token usage, costs, and more. Additionally, this approach enables us to examine each step within the workflow and within a specific task, making the workflow transparent and facilitating debugging and comprehension.
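In essence, a workflow boils down to something like the following (a rough conceptual sketch, not the library's actual implementation):

def run_tasks(tasks: list, value):
    # pass the output of each task as the input to the next task
    for task in tasks:
        value = task(value)
    return value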


Here's an example of a simple "prompt enhancer". The user provides a prompt; the first OpenAI model enhances the user's prompt and the second OpenAI model provides a response based on the enhanced prompt. This example is described in greater detail in the Examples section below and the corresponding notebook.

Here's the user's original prompt:

prompt = "create a function to mask all emails from a string value"

The prompt variable is passed to the workflow object, which passes it to the first task (the prompt_template function).

prompt_enhancer = OpenAIChat(...)
chat_assistant = OpenAIChat(...)

def prompt_template(user_prompt: str) -> str:
    return "Improve the user's request, below, by expanding the request " \
        "to describe the relevant python best practices and documentation " \
        f"requirements that should be followed:\n\n```{user_prompt}```"

def prompt_extract_code(_) -> str:
    # `_` signals that we are ignoring the input (from the previous task)
    return "Return only the primary code of interest from the previous answer, "\
        "including docstrings, but without any text/response."

workflow = Workflow(tasks=[
    prompt_template,      # modifies the user's prompt
    prompt_enhancer,      # returns an improved version of the user's prompt
    chat_assistant,       # returns the chat response based on the improved prompt
    prompt_extract_code,  # prompt to ask the model to extract only the relevant code
    chat_assistant,       # returns only the relevant code from the model's last response
])
response = workflow(prompt)

print(response)                         # ```python\n def mask_email_addresses(string): ...
print(workflow.sum('cost'))             # 0.0034
print(workflow.sum('total_tokens'))     # 1961
print(workflow.sum('input_tokens'))     # 1104
print(workflow.sum('response_tokens'))  # 857
print(workflow.history())               # list of Record objects containing prompt/response/usage

See Examples section below for full output and explanation.


Note: Since the goal of this library is to provide a simple pattern for creating workflows and tracking history, some of the classes provided in this library (e.g. OpenAIEmbedding or ChromaDocumentIndex) are nothing more than simple wrappers that implement the interface necessary to track history in a consistent way, allowing the workflow to aggregate the history across all tasks. This makes it easy for people to create their own wrappers and workflows.
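For instance, a custom task can be any callable; the sketch below keeps its own record of inputs/outputs. How the library actually expects a task to expose its history (e.g. a history() method returning Record objects) is an assumption here, not a documented contract:

class UppercaseTask:
    """A hypothetical custom task: a callable that keeps track of its own inputs/outputs."""

    def __init__(self):
        self._history = []

    def __call__(self, value: str) -> str:
        result = value.upper()
        self._history.append({'input': value, 'output': result})
        return result

    def history(self) -> list[dict]:
        return self._history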

See examples below.


Installing

Note: This package is tested on Python versions 3.10 and 3.11

pip install llm-workflow

API Keys

  • Any code in this library that uses OpenAI assumes that the OPENAI_API_KEY environment variable is set to a valid OpenAI API key. This includes many of the notebooks in the examples directory.
  • The llm_workflow.utilities.StackOverflowSearch class assumes that the STACK_OVERFLOW_KEY environment variable is set. To use that class, you must create an account and app at Stack Apps and use the key that is generated (not the secret) to set the environment variable.

There are two ways to set these environment variables. The first is to set them directly in Python:

import os
os.environ['OPENAI_API_KEY'] = 'sk-...'

Or you can create a file named .env (in your project/notebook directory) that contains the key(s) and value(s) in the format below, and use the dotenv library and load_dotenv function to load the keys/values as environment variables.

OPENAI_API_KEY=sk-...
STACK_OVERFLOW_KEY=...

from dotenv import load_dotenv
load_dotenv()  # sets any environment variables from the .env file

Examples

Example 1

Here's the full example from the snippet above (alternatively, see this notebook). Note that this example is not meant to demonstrate prompt-engineering best practices; it's simply meant to show how workflows work in this library.

  • first task: defines a prompt-template that takes the user's prompt and creates a new prompt asking a chat model to improve it (within the context of creating python code)
  • second task: the prompt_enhancer model takes the modified prompt and returns an improved version of it
  • third task: the chat_assistant model takes the response from the previous model (i.e. the improved prompt) and returns its response
  • fourth task: ignores the response from the chat model; creates a new prompt asking the chat model to extract the relevant code from the previous response
  • fifth task: the chat_assistant model, which internally maintains the history of messages, returns only the relevant code from its previous response

from llm_workflow.base import Workflow
from llm_workflow.openai import OpenAIChat

prompt_enhancer = OpenAIChat()
# different model/object, therefore different message history (i.e. conversation)
chat_assistant = OpenAIChat()

def prompt_template(user_prompt: str) -> str:
    return "Improve the user's request, below, by expanding the request " \
        "to describe the relevant python best practices and documentation " \
        f"requirements that should be followed:\n\n```{user_prompt}```"

def prompt_extract_code(_) -> str:
    # `_` signals that we are ignoring the input (from the previous task)
    return "Return only the primary code of interest from the previous answer, "\
        "including docstrings, but without any text/response."

# The only requirement for the list of tasks is that each item/task is a
# callable where the output of one task matches the input of the next task.
# The input to the workflow is passed to the first task;
# the output of the last task is returned by the workflow.
workflow = Workflow(tasks=[
    prompt_template,      # modifies the user's prompt
    prompt_enhancer,      # returns an improved version of the user's prompt
    chat_assistant,       # returns the chat response based on the improved prompt
    prompt_extract_code,  # prompt to ask the model to extract only the relevant code
    chat_assistant,       # returns only the function from the model's last response
])
prompt = "create a function to mask all emails from a string value"
response = workflow(prompt)
print(response)

The output of the workflow (response):

import re

def mask_email_addresses(string):
    """
    Mask email addresses within a given string.

    Args:
        string (str): The input string to be processed.

    Returns:
        str: The modified string with masked email addresses.

    Raises:
        ValueError: If the input is not a string.

    Dependencies:
        The function requires the 're' module to use regular expressions.
    """
    if not isinstance(string, str):
        raise ValueError("Input must be a string")

    pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    masked_string = re.sub(pattern, '[email protected]', string)

    return masked_string

Total costs/tokens for all activity in the workflow:

print(f"Cost:             ${workflow.sum('cost'):.4f}")
print(f"Total Tokens:      {workflow.sum('total_tokens'):,}")
print(f"Prompt Tokens:     {workflow.sum('input_tokens'):,}")
print(f"Response Tokens:   {workflow.sum('response_tokens'):,}")

Output:

Cost:            $0.0034
Total Tokens:     1,961
Prompt Tokens:    1,104
Response Tokens:  857

We can view the history of the workflow (i.e. the aggregated history across all tasks) with the workflow.history() method.

In this example, the only class that tracks history is OpenAIChat. Therefore, both the prompt_enhancer and chat_assistant objects will contain history. workflow.history() will return a list of three ExchangeRecord objects. The first record corresponds to our request to the prompt_enhancer, and the second and third records correspond to our chat_assistant requests. An ExchangeRecord represents a single exchange/transaction with an LLM, encompassing an input (prompt) and its corresponding output (response), along with other properties like cost and total_tokens.
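For example, you can iterate over the aggregated records (the prompt, response, cost, and total_tokens attributes referenced below come from the description above):

for record in workflow.history():
    print(f"prompt:   {record.prompt[:50]!r}")
    print(f"response: {record.response[:50]!r}")
    print(f"cost: ${record.cost:.4f} | total tokens: {record.total_tokens:,}")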

We can view the response we received from the prompt_enhancer model by looking at the first record's response property (or the second record's prompt property since the workflow passes the output of prompt_enhancer as the input to the chat_assistant):

print(workflow.history()[0].response)

Output:

Create a Python function that adheres to best practices and follows proper documentation guidelines to mask all email addresses within a given string value. The function should take a string as input and return the modified string with masked email addresses.

To ensure code readability and maintainability, follow these best practices:

1. Use meaningful function and variable names that accurately describe their purpose.
2. Break down the problem into smaller, reusable functions if necessary.
3. Write clear and concise code with proper indentation and comments.
4. Handle exceptions and errors gracefully by using try-except blocks.
5. Use regular expressions to identify and mask email addresses within the string.
6. Avoid using global variables and prefer passing arguments to functions.
7. Write unit tests to verify the correctness of the function.

In terms of documentation, follow these guidelines:

1. Provide a clear and concise function description, including its purpose and expected behavior.
2. Specify the input parameters and their types, along with any default values.
3. Document the return value and its type.
4. Mention any exceptions that the function may raise and how to handle them.
5. Include examples of how to use the function with different inputs and expected outputs.
6. Document any dependencies or external libraries required for the function to work.

By following these best practices and documentation guidelines, you will create a well-structured and maintainable Python function to mask email addresses within a string value.

We could also view the original response from the chat_assistant model.

print(workflow.history()[1].response)

Output:

Sure! Here's an example of a Python function that adheres to best practices and follows proper documentation guidelines to mask email addresses within a given string value:

import re

def mask_email_addresses(string):
    """
    Mask email addresses within a given string.

    Args:
        string (str): The input string to be processed.

    Returns:
        str: The modified string with masked email addresses.

    Raises:
        ValueError: If the input is not a string.

    Examples:
        >>> mask_email_addresses("Contact us at john.doe@example.com")
        'Contact us at [email protected]'

        >>> mask_email_addresses("Email me at john.doe@example.com or jane.doe@example.com")
        'Email me at [email protected] or [email protected]'

        >>> mask_email_addresses("No email addresses here")
        'No email addresses here'

    Dependencies:
        The function requires the 're' module to use regular expressions.
    """
    if not isinstance(string, str):
        raise ValueError("Input must be a string")

    # Regular expression pattern to match email addresses
    pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

    # Replace email addresses with masked version
    masked_string = re.sub(pattern, '[email protected]', string)

    return masked_string

In this example, the mask_email_addresses function takes a string as input and returns the modified string with masked email addresses. It uses the re module to match email addresses using a regular expression pattern. The function raises a ValueError if the input is not a string.

The function is properly documented with a clear and concise description, input parameters with their types, return value and its type, exceptions that may be raised, examples of usage, and any dependencies required.

To use this function, you can call it with a string as an argument and it will return the modified string with masked email addresses.

The final response returned by the chat_assistant (and by the workflow object) contains only the mask_email_addresses function. The response object should match the response value in the last record (workflow.history()[-1].response).

assert response == workflow.history()[-1].response  # passes

Example 2

Here's an example of using a workflow to perform the following series of tasks:

  • Ask a question.
  • Perform a web search based on the question.
  • Scrape the top_n web pages from the search results.
  • Split the web pages into chunks (so that we can search for the most relevant chunks).
  • Save the chunks to a document index (i.e. vector database).
  • Create a prompt that includes the original question and the most relevant chunks.
  • Send the prompt to the chat model.

Again, the key concept of a workflow is simply that the output of one task is the input of the next task. So, in the code below, you can replace any step with your own implementation as long as the input/output matches the task you replace.

Something that may not be immediately obvious is the usage of the Value object, below. It serves as a convenient caching mechanism within the workflow. The Value object is callable: when called with a value, it caches and returns that value; when called without a value, it returns the cached value. In the example below, the Value object is used to cache the original question, pass it to the web search (i.e. the duckduckgo_search object), and subsequently reintroduce the question into the workflow (i.e. into the prompt_template object).
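For example (mirroring the behavior just described):

from llm_workflow.base import Value

question = Value()
question("What is ChatGPT?")  # caches the value and returns "What is ChatGPT?"
question()                    # called without a value; returns the cached "What is ChatGPT?"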

See this notebook for an in-depth explanation.

import re

from llm_workflow.base import Document, Workflow, Value
from llm_workflow.openai import OpenAIEmbedding, OpenAIChat
from llm_workflow.utilities import DuckDuckGoSearch, scrape_url, split_documents
from llm_workflow.indexes import ChromaDocumentIndex
from llm_workflow.prompt_templates import DocSearchTemplate

duckduckgo_search = DuckDuckGoSearch(top_n=3)
embeddings_model = OpenAIEmbedding(model_name='text-embedding-ada-002')
document_index = ChromaDocumentIndex(embeddings_model=embeddings_model, n_results=3)
prompt_template = DocSearchTemplate(doc_index=document_index, n_docs=3)
chat_model = OpenAIChat()

def scrape_urls(search_results):
    """
    For each url (i.e. `href` in `search_results`):
    - extract the text
    - replace runs of whitespace (including new-lines) with single spaces
    - create a Document object
    """
    return [
        Document(content=re.sub(r'\s+', ' ', scrape_url(x['href'])))
        for x in search_results
    ]

question = Value()  # Value is a caching/reinjection mechanism; see note above

# each task is a callable where the output of one task is the input to the next task
workflow = Workflow(tasks=[
    question,
    duckduckgo_search,
    scrape_urls,
    split_documents,
    document_index,
    question,
    prompt_template,
    chat_model,
])
workflow("What is ChatGPT?")

Response:

ChatGPT is an AI chatbot that is driven by AI technology and is a natural language processing tool. It allows users to have human-like conversations and can assist with tasks such as composing emails, essays, and code. It is built on a family of large language models known as GPT-3 and has now been upgraded to GPT-4 models. ChatGPT can understand and generate human-like answers to text prompts because it has been trained on large amounts of data.

We can also track costs:

print(f"Cost:            ${workflow.sum('cost'):.4f}")
print(f"Total Tokens:     {workflow.sum('total_tokens'):,}")
print(f"Prompt Tokens:    {workflow.sum('input_tokens'):,}")
print(f"Response Tokens:  {workflow.sum('response_tokens'):,}")
# to get the number of embedding tokens, sum `total_tokens` across only EmbeddingRecord objects
print(f"Embedding Tokens: {workflow.sum('total_tokens', types=EmbeddingRecord):,}")

Output:

Cost:            $0.0024
Total Tokens:     16,108
Prompt Tokens:    407
Response Tokens:  97
Embedding Tokens: 15,604

Additionally, we can track the history of the workflow with the workflow.history() method. See this notebook for an example.

Compare LLM Models

You can compare the responses of various LLM models with the following code (see examples/compare.ipynb for the output, which is rendered as HTML).

import os
from IPython.display import HTML, display
from llm_workflow.openai import OpenAIChat
from llm_workflow.hugging_face import HuggingFaceEndpointChat
from llm_workflow.compare import CompareModels, ModelDefinition

# Comparing GPT-3.5 Turbo, GPT-4, and Mistral 7B
model_definitions = [
    ModelDefinition(
        create=lambda: OpenAIChat(model_name='gpt-3.5-turbo-0125'),
        description='GPT-3.5 Turbo',
    ),
    ModelDefinition(
        create=lambda: OpenAIChat(model_name='gpt-4-1106-preview'),
        description='GPT-4 Turbo',
    ),
    ModelDefinition(
        create=lambda: HuggingFaceEndpointChat(
            endpoint_url=os.getenv('HUGGING_FACE_ENDPOINT_MISTRAL_7B'),
        ),
        description='Mistral 7B (HF Endpoint)',
    ),
]
prompts = [
    # scenario 1
    [
        # initial prompt
        "Write a function that returns the sum of two numbers.",
        # follow up question
        "Write a few assertion statements validating the function.",
    ],
    # scenario 2
    [
        # initial prompt
        "Generate a function that masks email addresses.",
        # follow up question
        "Write a few assertion statements validating the function.",
    ]
]
comparison = CompareModels(prompts=prompts, model_definitions=model_definitions)
comparison()
display(HTML(comparison.to_html()))

Notebooks


Possible Features

  • Add a metadata property to OpenAIChat which gets passed to the ExchangeRecord objects that are created (so that the user can pass metadata)
  • Explore adding additional 'agents'
  • Create a PDF loader
  • Create additional prompt-templates

Contributing

Contributions to this project are welcome. Please follow the coding standards, add appropriate unit tests, and ensure that all linting and tests pass before submitting pull requests.

Coding Standards

  • Coding standards should follow PEP 8 (Style Guide for Python Code).
  • Document all files, classes, and functions following the existing documentation style.

Docker

See the Makefile for all available commands.

To build the Docker container:

make docker_build

To run the terminal inside the Docker container:

make docker_zsh

To run the unit tests (including linting and doctests) from the command line inside the Docker container:

make tests

To run the unit tests (including linting and doctests) from the command line outside the Docker container:

make docker_tests

Pre-Check-in

Unit Tests

The unit tests in this project are all found in the /tests directory.

In the terminal, in the project directory, run the following command to run linting and unit-tests:

make tests

The .env file must contain the following key/values:

  • OPENAI_API_KEY
  • STACK_OVERFLOW_KEY
  • HUGGING_FACE_API_KEY
  • HUGGING_FACE_ENDPOINT_UNIT_TESTS, which should contain the endpoint URL for a Llama 2 endpoint (or a model that expects the same message formatting, e.g. [INST], <<SYS>>, etc.)
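For example, the .env file might look like the following (all values are placeholders):

OPENAI_API_KEY=sk-...
STACK_OVERFLOW_KEY=...
HUGGING_FACE_API_KEY=...
HUGGING_FACE_ENDPOINT_UNIT_TESTS=https://...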
