Tools for LLM prompt testing and experimentation



🔧 Test and experiment with prompts, LLMs, and vector databases. 🔨


Welcome to prompttools, created by Hegel AI! This repo offers a set of open-source, self-hostable tools for experimenting with, testing, and evaluating LLMs, vector databases, and prompts. The core idea is to enable developers to evaluate them using familiar interfaces like code, notebooks, and a local playground.

In just a few lines of code, you can test your prompts and parameters across different models (whether you are using OpenAI, Anthropic, or LLaMA models). You can even evaluate the retrieval accuracy of vector databases.

from prompttools.experiment import OpenAIChatExperiment

messages = [
    [{"role": "user", "content": "Tell me a joke."}],
    [{"role": "user", "content": "Is 17077 a prime number?"}],
]

models = ["gpt-3.5-turbo", "gpt-4"]
temperatures = [0.0]
openai_experiment = OpenAIChatExperiment(models, messages, temperature=temperatures)
openai_experiment.run()        # execute every combination of model and message list
openai_experiment.visualize()  # display the responses in a table
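
After running, you can also score the responses. The following is a minimal sketch of that step, assuming the Experiment.evaluate method and the similarity utility behave as described in the prompttools documentation; the metric label and expected answers are illustrative:

from prompttools.utils import similarity

# Attach a score column comparing each response to an expected answer.
# "similar_to_expected" is just the name of the new column.
openai_experiment.evaluate(
    "similar_to_expected",
    similarity.semantic_similarity,
    expected=["HAHAHA", "17077 is a prime number"],
)
openai_experiment.visualize()  # the table now includes the score column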


To stay in touch with us about issues and future updates, join the Discord.


Installation

To install prompttools, you can use pip:

pip install prompttools

You can run a simple example of prompttools locally with the following commands:

git clone https://github.com/hegelai/prompttools.git
cd prompttools && jupyter notebook examples/notebooks/OpenAIChatExperiment.ipynb

You can also run the notebook in Google Colab.


Playground

If you want to interact with prompttools using our playground interface, you can launch it with the following commands.

First, install prompttools:

pip install prompttools

Then, clone the git repo and launch the Streamlit app:

git clone https://github.com/hegelai/prompttools.git
cd prompttools && streamlit run prompttools/playground/playground.py

You can also access a hosted version of the playground on the Streamlit Community Cloud.

Note: The hosted version does not support LlamaCpp.


Documentation

Our documentation website contains the full API reference and more detailed descriptions of individual components. Check it out!

Supported Integrations

Here is a list of APIs that we support with our experiments:


  • OpenAI (Completion, ChatCompletion, Fine-tuned models) - Supported
  • LLaMA.Cpp (LLaMA 1, LLaMA 2) - Supported
  • HuggingFace (Hub API, Inference Endpoints) - Supported
  • Anthropic - Supported
  • Mistral AI - Supported
  • Google Gemini - Supported
  • Google PaLM (legacy) - Supported
  • Google Vertex AI - Supported
  • Azure OpenAI Service - Supported
  • Replicate - Supported
  • Ollama - In Progress

Vector Databases and Data Utility

  • Chroma - Supported
  • Weaviate - Supported
  • Qdrant - Supported
  • LanceDB - Supported
  • Milvus - Exploratory
  • Pinecone - Supported
  • Epsilla - In Progress


Frameworks

  • LangChain - Supported
  • MindsDB - Supported
  • LlamaIndex - Exploratory

Computer Vision

  • Stable Diffusion - Supported
  • Replicate's hosted Stable Diffusion - Supported

If there is an API you'd like to see supported, please open an issue or a PR to add it. Feel free to discuss it in our Discord channel as well.

Frequently Asked Questions (FAQs)

  1. Will this library forward my LLM calls to a server before sending them to providers like OpenAI and Anthropic?

    • No. The source code is executed on your machine, and any calls to LLM APIs are made directly from your machine without any forwarding.
  2. Does prompttools store my API keys or LLM inputs and outputs to a server?

    • No. All of that data stays on your local machine. We do not collect any PII (personally identifiable information).
  3. How do I persist my results?

    • To persist the results of your tests and experiments, you can export your Experiment with the methods to_csv, to_json, to_lora_json, or to_mongo_db (see the sketch after this list). We are building more persistence features, and we would be happy to discuss your use cases, pain points, and which export options would be useful for you.
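
For example, here is a minimal sketch of exporting the quickstart experiment from above, assuming these export methods accept an output path like their pandas counterparts; the file names are illustrative:

# Persist the experiment's results for later analysis.
openai_experiment.to_csv("results.csv")
openai_experiment.to_json("results.json")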


Usage Tracking

Because our API is changing rapidly, some errors may be caused by bugs on our side or by out-of-date documentation. To improve the user experience, we collect data from normal package usage that helps us understand the errors that are raised. This data is sent to Sentry, a third-party error-tracking service commonly used in open-source software. It only logs this library's own actions.

You can easily opt out by defining an environment variable called SENTRY_OPT_OUT.
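
For example, a minimal sketch of opting out before the library is imported; the value "1" is illustrative, on the assumption that only the variable's presence is checked:

import os

# Disable Sentry error reporting; set this before importing prompttools.
os.environ["SENTRY_OPT_OUT"] = "1"

import prompttools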


Contributing

We welcome PRs and suggestions! Don't hesitate to open a PR/issue or to reach out to us via email. Please have a look at our contribution guide and "Help Wanted" issues to get started!

Usage and Feedback

We will be delighted to work with early adopters to shape our designs. Please reach out to us via email if you're interested in using this tooling for your project or have any feedback.


License

We will be gradually releasing more components to the open-source community. The current license can be found in the LICENSE file. If there is any concern, please contact us and we will be happy to work with you.
