Tools for LLM prompt testing and experimentation

Project description

PromptTools

Welcome to prompttools, created by Hegel AI! This repo offers a set of free, open-source tools for testing and experimenting with prompts. The core idea is to enable developers to evaluate prompts using familiar interfaces like code and notebooks.

Quickstart

To install prompttools, you can use pip:

pip install prompttools

You can run a simple prompttools example with the following command:

DEBUG=1 python examples/prompttests/example.py

To run the example outside of DEBUG mode, you'll need to bring your own OpenAI API key. This is because prompttools makes a call to OpenAI from your machine. For example:

OPENAI_API_KEY=sk-... python examples/prompttests/example.py

You can see the full example here.

Using `prompttools`

There are primarily two ways you can use prompttools in your LLM workflow:

- Run experiments in notebooks.
- Write unit tests and integrate them into your CI/CD workflow via GitHub Actions.

Notebooks

There are a few different ways to run an experiment in a notebook.

The simplest way is to define an experimentation harness and an evaluation function:

from typing import Dict

# Import path assumed from the project's example notebooks; adjust it if your installed version differs.
from prompttools.harness import PromptTemplateExperimentationHarness


def eval_fn(prompt: str, results: Dict, metadata: Dict) -> float:
    # Your logic here, or use a built-in one such as `prompttools.utils.similarity`.
    pass


# Each template is combined with each user input.
prompt_templates = [
    "Answer the following question: {{input}}",
    "Respond the following query: {{input}}",
]

user_inputs = [
    {"input": "Who was the first president?"},
    {"input": "Who was the first president of India?"},
]

harness = PromptTemplateExperimentationHarness(
    "text-davinci-003", prompt_templates, user_inputs
)

harness.run()
harness.evaluate("metric_name", eval_fn)
harness.visualize()  # The results will be displayed as a table in your notebook
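As a rough illustration of what an evaluation function with the signature above might look like, here is a minimal sketch that scores a response by keyword match. It assumes `results` is shaped like an OpenAI completion response (`{"choices": [{"text": ...}]}`); that shape, and the helper name `make_keyword_eval_fn`, are assumptions made for this example and not part of prompttools.

```python
from typing import Callable, Dict


def make_keyword_eval_fn(expected: str) -> Callable[[str, Dict, Dict], float]:
    """Build an eval function that checks completions for an expected keyword.

    Hypothetical helper for illustration only; it is not part of prompttools.
    """
    needle = expected.lower()

    def eval_fn(prompt: str, results: Dict, metadata: Dict) -> float:
        # Assumption: `results` mirrors an OpenAI completion response,
        # i.e. {"choices": [{"text": "..."}, ...]}.
        texts = [choice.get("text", "") for choice in results.get("choices", [])]
        if not texts:
            return 0.0
        # Score is the fraction of completions that mention the keyword.
        hits = sum(1 for text in texts if needle in text.lower())
        return hits / len(texts)

    return eval_fn


eval_fn = make_keyword_eval_fn("washington")
fake_results = {"choices": [{"text": "George Washington"}, {"text": "I am not sure."}]}
score = eval_fn("Who was the first president?", fake_results, {})
print(score)
```

Because the function only reads a dictionary, it can be exercised with hand-written responses like `fake_results` before any API call is made.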

If you want to compare different models, see the ModelComparison example.

For an example of a built-in evaluation function, see this example of semantic similarity comparison.

You can also manually enter feedback to evaluate prompts; see HumanFeedback.ipynb.

Note: Above we used an ExperimentationHarness. Under the hood, that harness uses an Experiment to construct and make API calls to LLMs. The harness is responsible for managing higher-level abstractions, like prompt templates or system prompts. To see how experiments work at a low level, see this example.
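To make the harness idea more concrete, the sketch below shows one way a set of prompt templates and user inputs could be expanded into the individual prompts that get sent to a model. It is a simplified stand-in that uses plain string replacement for the `{{input}}` placeholders; the function names `render` and `expand_prompts` are hypothetical, and the library's actual rendering may differ.

```python
from itertools import product
from typing import Dict, List


def render(template: str, variables: Dict[str, str]) -> str:
    """Fill {{name}} placeholders using plain string replacement."""
    rendered = template
    for name, value in variables.items():
        rendered = rendered.replace("{{" + name + "}}", value)
    return rendered


def expand_prompts(templates: List[str], inputs: List[Dict[str, str]]) -> List[str]:
    """Render every template against every input (a full cross product)."""
    return [render(template, variables) for template, variables in product(templates, inputs)]


prompt_templates = [
    "Answer the following question: {{input}}",
    "Respond the following query: {{input}}",
]
user_inputs = [
    {"input": "Who was the first president?"},
    {"input": "Who was the first president of India?"},
]

prompts = expand_prompts(prompt_templates, user_inputs)
for prompt in prompts:
    print(prompt)
```

With two templates and two inputs this produces four prompts, which is why the number of API calls grows with the product of the two lists.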

Unit Tests

Unit tests in prompttools are called prompttests. They use the @prompttest decorator to transform an evaluation function into an efficient unit test. The prompttest framework executes and evaluates experiments so you can test prompts over time. You can see an example test here and an example of that test being used as a GitHub Action here.
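The general pattern of turning an evaluation function into a pass/fail test can be sketched as follows. This is not the `@prompttest` API: the decorator `threshold_test`, its parameters, and the canned outputs are all hypothetical, and the responses are hard-coded so that the sketch runs without an API key.

```python
from functools import wraps
from statistics import mean
from typing import Callable, Dict, List


def threshold_test(outputs: List[Dict], threshold: float) -> Callable:
    """Hypothetical decorator: wrap an eval function in a pass/fail check."""

    def decorator(eval_fn: Callable[[str, Dict, Dict], float]) -> Callable[[], float]:
        @wraps(eval_fn)
        def run_test() -> float:
            # Score every recorded output, then compare the average to the threshold.
            scores = [
                eval_fn(item["prompt"], item["results"], item.get("metadata", {}))
                for item in outputs
            ]
            average = mean(scores)
            assert average >= threshold, f"average score {average:.2f} is below {threshold}"
            return average

        return run_test

    return decorator


# Canned responses standing in for real model output (assumed OpenAI-style shape).
CANNED_OUTPUTS = [
    {"prompt": "Who was the first president?",
     "results": {"choices": [{"text": "George Washington"}]}},
    {"prompt": "Who was the first president of India?",
     "results": {"choices": [{"text": "Rajendra Prasad"}]}},
]


@threshold_test(CANNED_OUTPUTS, threshold=0.5)
def non_empty_answer(prompt: str, results: Dict, metadata: Dict) -> float:
    """Score 1.0 when the first completion is non-empty, else 0.0."""
    choices = results.get("choices", [])
    return 1.0 if choices and choices[0].get("text", "").strip() else 0.0


print(non_empty_answer())
```

A wrapper like this raises `AssertionError` when the average score drops below the threshold, which is what lets a test runner in CI flag a prompt regression.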

Persisting Results

To persist the results of your tests and experiments, one option is to enable HegelScribe (also developed by us at Hegel AI). It logs all the inferences from your LLM, along with metadata and custom metrics, for you to view on your private dashboard. We have a few early adopters right now, and we can further discuss your use cases, pain points, and how it may be useful for you.

Installation

To install prompttools using pip:

pip install prompttools

To install from source, first clone this GitHub repo to your local machine, then, from the repo, run:

pip install .

You can then proceed to run our examples.
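To confirm which version ended up installed, a small check using the standard library's `importlib.metadata` may help. This is a minimal sketch; the helper name `installed_version` is ours, not part of prompttools.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if prompttools is installed, otherwise None.
print(installed_version("prompttools"))
```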

Frequently Asked Questions (FAQs)

Will this library forward my LLM calls to a server before sending them to OpenAI, Anthropic, etc.?
- No, the source code will be executed on your machine. Any call to LLM APIs will be directly executed from your machine without any forwarding.

Contributing

We welcome PRs and suggestions! Don't hesitate to open a PR/issue or to reach out to us via email.

Usage and Feedback

We will be delighted to work with early adopters to shape our designs. Please reach out to us via email if you're interested in using this tooling for your project or have any feedback.

License

We will be gradually releasing more components to the open-source community. The current license can be found in the LICENSE file. If you have any concerns, please contact us and we will be happy to work with you.

Release history

Version	Release date
0.0.46	Mar 15, 2024
0.0.45	Dec 27, 2023
0.0.44	Dec 22, 2023
0.0.43	Nov 9, 2023
0.0.41	Nov 3, 2023
0.0.40	Nov 3, 2023
0.0.39	Nov 3, 2023
0.0.38	Oct 26, 2023
0.0.37	Oct 21, 2023
0.0.36	Oct 5, 2023
0.0.35	Oct 3, 2023
0.0.34	Aug 29, 2023
0.0.33	Aug 17, 2023
0.0.32	Aug 8, 2023
0.0.31	Aug 7, 2023
0.0.30	Aug 4, 2023
0.0.29	Aug 4, 2023
0.0.28	Aug 4, 2023
0.0.27	Aug 4, 2023
0.0.26	Aug 4, 2023
0.0.25	Aug 4, 2023
0.0.24	Aug 2, 2023
0.0.23	Aug 2, 2023
0.0.22	Aug 1, 2023
0.0.21	Aug 1, 2023
0.0.20	Aug 1, 2023
0.0.19	Jul 31, 2023
0.0.18	Jul 31, 2023
0.0.17	Jul 31, 2023
0.0.16	Jul 27, 2023
0.0.15	Jul 27, 2023
0.0.14	Jul 19, 2023
0.0.13	Jul 18, 2023
0.0.12	Jul 18, 2023
0.0.11	Jul 9, 2023
0.0.10	Jul 9, 2023
0.0.9	Jul 9, 2023
0.0.8	Jul 9, 2023
0.0.7 (this version)	Jul 8, 2023
0.0.6	Jul 8, 2023
0.0.5	Jul 6, 2023
0.0.4	Jul 6, 2023
0.0.3	Jul 6, 2023
0.0.2	Jul 6, 2023
0.0.1	Jun 25, 2023

Download files

Source Distribution

prompttools-0.0.7.tar.gz (21.9 kB)

Uploaded Jul 8, 2023 Source

Built Distribution

prompttools-0.0.7-py3-none-any.whl (30.5 kB)

Uploaded Jul 8, 2023 Python 3

Hashes for prompttools-0.0.7.tar.gz
Algorithm	Hash digest
SHA256	`55dbffeefd5a3f84e8b60d0b8c9e32b76874aebe7d11cc437da7f565dda2d98c`
MD5	`79df11fbdf5e630540622f02900f8e46`
BLAKE2b-256	`bebe472231a5b20b4379a6587f610164393e40b6c0b8092eaf1362cd35711a60`

Hashes for prompttools-0.0.7-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5d2e1519c0421673c90026adfd329ccbc7b8b5b202919634509bb2cd03f0756a`
MD5	`42ec59e5f82bcd85b60bdcbf35f67f36`
BLAKE2b-256	`ca2df4b62018bec7498580a6d747b7494e5b6322f21c6ebd62261080ff364714`
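
The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using Python's standard `hashlib`; the local file path is an assumption about where the download was saved, and the helper names `sha256_of` and `matches` are ours.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# SHA256 digest published for the 0.0.7 source distribution.
EXPECTED_SDIST_SHA256 = "55dbffeefd5a3f84e8b60d0b8c9e32b76874aebe7d11cc437da7f565dda2d98c"

# Assumed download location; change this to wherever the file was saved.
sdist = Path("prompttools-0.0.7.tar.gz")
if sdist.exists():
    print("sdist digest OK:", matches(sdist, EXPECTED_SDIST_SHA256))
```

pip can also enforce digests itself through its hash-checking mode when a requirements file pins them with `--hash`.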
