# promptstability

Package for generating Prompt Stability Scores (PSS). PSS estimates the stability of outcomes resulting from variations in language model prompt specifications. The accompanying paper outlines the technique, and replication material is available alongside it.
## Requirements

- Python 3.8 to 3.10 (Python 3.11 and above are not supported due to dependency limitations)
- Other dependencies are installed automatically via `pip`
## Installation

Install this library using `pip`:

```bash
pip install promptstability
```
## Example Usage

Here we provide instructions for using promptstability with OpenAI and Ollama.

```python
import os

import pandas as pd

from promptstability.core import PromptStabilityAnalysis, get_api_key, load_example_data

# Load data (news articles)
df = load_example_data()
print(df.head())

example_data = list(df['body'].values)  # Take a subsample

# Define the prompt texts
original_text = 'The following are some news articles about the economy.'
prompt_postfix = 'Respond 0 for positive news, or 1 for negative news. Guess if you do not know. Respond nothing else.'
```
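The `# Take a subsample` comment above hints that you may not want to annotate the full corpus (each document costs one API call per iteration). A minimal sketch of drawing a random subsample before analysis, using a small stand-in list for `df['body'].values` (the sample size of 3 is arbitrary):

```python
import random

random.seed(42)  # make the subsample reproducible

# Stand-in for df['body'].values from the example data
example_data = ['article one', 'article two', 'article three',
                'article four', 'article five']

# Draw 3 of the 5 documents without replacement
subsample = random.sample(example_data, k=3)
print(len(subsample))  # 3
```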
### a) OpenAI Example (e.g., gpt-4o-mini)

```python
from openai import OpenAI

# Initialize the OpenAI client.
# First set the OPENAI_API_KEY environment variable.
APIKEY = get_api_key('openai')
client = OpenAI(api_key=APIKEY)
OPENAI_MODEL = 'gpt-4o-mini'

# Define the OpenAI annotation function
def annotate_openai(text, prompt, temperature=0.1):
    try:
        response = client.chat.completions.create(
            model=OPENAI_MODEL,
            temperature=temperature,
            messages=[
                {"role": "system", "content": prompt},
                {"role": "user", "content": text}
            ]
        )
    except Exception as e:
        print(f"OpenAI exception: {e}")
        raise
    return ''.join(choice.message.content for choice in response.choices)
```
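`PromptStabilityAnalysis` only requires a callable that accepts `(text, prompt, temperature)` and returns a string label. For dry runs without API calls, a deterministic stub with the same signature can stand in; the keyword rule below is purely illustrative and not part of the package:

```python
def annotate_stub(text, prompt, temperature=0.1):
    """Deterministic stand-in for a model call: mimics the 0/1 sentiment
    coding from the prompt above using a crude keyword rule."""
    negative_words = {'recession', 'layoffs', 'inflation'}
    return '1' if any(word in text.lower() for word in negative_words) else '0'

print(annotate_stub('Layoffs hit the tech sector', 'unused prompt'))      # '1'
print(annotate_stub('Markets rally on strong earnings', 'unused prompt')) # '0'
```

Passing such a stub as `annotation_function` lets you exercise the analysis pipeline end to end before spending tokens on a real model.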
```python
# Instantiate the analysis class using the OpenAI annotation function.
# (Note on warnings: the Pegasus paraphraser comes with an automated
# warning about model weights, which you can ignore.)
psa_openai = PromptStabilityAnalysis(annotation_function=annotate_openai, data=example_data)

# Run intra-prompt stability analysis using the method `intra_pss`
print("Running OpenAI intra-prompt analysis...")
ka_openai_intra, annotated_openai_intra = psa_openai.intra_pss(
    original_text,
    prompt_postfix,
    iterations=5,  # minimal iterations
    plot=True,
    save_path='news_intra.png',
    save_csv="news_intra.csv"
)
print("OpenAI intra-prompt KA scores:", ka_openai_intra)
```
```python
# Run inter-prompt stability analysis using the method `inter_pss`
print("Running OpenAI inter-prompt analysis...")
temperatures = [0.1, 0.5, 2.0]  # in practice, you would set more temperatures than this
ka_openai_inter, annotated_openai_inter = psa_openai.inter_pss(
    original_text,
    prompt_postfix,
    nr_variations=3,
    temperatures=temperatures,
    iterations=1,
    plot=True,
    save_path='news_inter.png',
    save_csv="news_inter.csv"
)
print("OpenAI inter-prompt KA scores:", ka_openai_inter)
```
### b) Ollama Example (e.g., a local deepseek-r1:8b)

```python
import ollama

# Make sure that your Ollama server is running locally and that 'deepseek-r1:8b' is available.
OLLAMA_MODEL = 'deepseek-r1:8b'

# Define the Ollama annotation function
def annotate_ollama(text, prompt, temperature=0.1):
    try:
        response = ollama.chat(
            model=OLLAMA_MODEL,
            messages=[
                {"role": "system", "content": prompt},
                {"role": "user", "content": text}
            ],
            options={'temperature': temperature}  # pass the temperature through to the model
        )
    except Exception as e:
        print(f"Ollama exception: {e}")
        raise
    return response['message']['content']
```
```python
# Instantiate the analysis class using the Ollama annotation function.
# (Note on warnings: the Pegasus paraphraser comes with an automated
# warning about model weights, which you can ignore.)
psa_ollama = PromptStabilityAnalysis(annotation_function=annotate_ollama, data=example_data)

# Run intra-prompt stability analysis using the method `intra_pss`
print("Running Ollama intra-prompt analysis...")
ka_ollama_intra, annotated_ollama_intra = psa_ollama.intra_pss(
    original_text,
    prompt_postfix,
    iterations=5,
    plot=False
)
print("Ollama intra-prompt KA scores:", ka_ollama_intra)
```
```python
# Run inter-prompt stability analysis using the method `inter_pss`
print("Running Ollama inter-prompt analysis...")
temperatures = [0.1, 2.0, 5.0]  # or whichever temperatures you want to test
ka_ollama_inter, annotated_ollama_inter = psa_ollama.inter_pss(
    original_text,
    prompt_postfix,
    nr_variations=3,
    temperatures=temperatures,
    iterations=1,
    plot=False
)
print("Ollama inter-prompt KA scores:", ka_ollama_inter)
```
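The KA scores printed above are Krippendorff's alpha values, the agreement statistic that PSS is built on. As a rough sketch of the underlying computation, here is a standalone implementation for nominal labels; this is our own illustration, not the package's internal code:

```python
from collections import Counter, defaultdict

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.
    `units` is a list of per-item rating lists; items with fewer than
    two ratings are not pairable and are skipped."""
    coincidence = defaultdict(float)  # coincidence matrix o_ck
    totals = Counter()                # marginal counts n_c
    n = 0.0                           # total number of pairable values
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue
        for i, c in enumerate(ratings):
            for j, k in enumerate(ratings):
                if i != j:
                    coincidence[(c, k)] += 1.0 / (m - 1)
            totals[c] += 1.0
        n += m
    if n <= 1:
        return 1.0
    observed = sum(v for (c, k), v in coincidence.items() if c != k) / n
    expected = sum(totals[c] * totals[k]
                   for c in totals for k in totals if c != k) / (n * (n - 1))
    return 1.0 if expected == 0 else 1.0 - observed / expected

# Two annotation runs that agree on 3 of 4 items
print(krippendorff_alpha_nominal([['0', '0'], ['1', '1'], ['0', '0'], ['0', '1']]))
```

The package computes these scores for you; the sketch is only meant to make the returned numbers interpretable (1.0 is perfect agreement, values near 0 indicate chance-level agreement).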
## API Documentation

Our full API reference documentation is hosted on Read the Docs and includes detailed information on all modules, classes, and functions. It is automatically updated whenever changes are pushed to the repository.
## Development

To contribute to this library, first check out the code. Then create a new virtual environment:

```bash
cd promptstability
python -m venv venv
source venv/bin/activate
```

Now install the dependencies and test dependencies:

```bash
pip install -e '.[test]'
```

To run the tests:

```bash
pytest
```