Skip to main content

A tool that creates multi-prompt datasets from single-prompt datasets using templates

Project description

PromptSuite Logo

PromptSuite: A Universal Toolkit for Multi-Prompt Evaluation

A tool that creates multi-prompt datasets from single-prompt datasets using templates with variation specifications.

Overview

PromptSuite transforms your single-prompt datasets into rich multi-prompt datasets by applying various types of variations specified in your templates. It supports HuggingFace-compatible datasets and provides both a command-line interface and a modern web UI.

📚 Documentation

Installation

From PyPI (Recommended)

pip install promptsuite

From GitHub (Latest)

pip install git+https://github.com/eliyahabba/PromptSuite.git

From Source

git clone https://github.com/eliyahabba/PromptSuite.git
cd PromptSuite
pip install -e .

Quick Start

Command Line Interface

promptsuite --template '{"instruction": "{instruction}: {text}", "text": ["paraphrase_with_llm"], "gold": "label"}' \
               --data data.csv --max-variations-per-row 50

Streamlit Interface

Launch the modern Streamlit interface for an intuitive experience:

# If installed via pip
promptsuite-ui

# From project root (development)
python src/promptsuite/ui/main.py

# Alternative: using the runner script
python scripts/run_ui.py

The web UI provides:

  • 📁 Step 1: Upload data or use sample datasets
  • 🔧 Step 2: Build templates with smart suggestions
  • Step 3: Generate variations with real-time progress and export results

Python API

from promptsuite import PromptSuite
import pandas as pd

# Initialize
ps = PromptSuite()

# Load data
data = [{"question": "What is 2+2?", "answer": "4"}]
ps.load_dataframe(pd.DataFrame(data))

# Configure template
template = {
  'instruction': 'Please answer the following questions.',
  'prompt format': 'Q: {question}\nA: {answer}',
  'question': ['typos and noise'],
}
ps.set_template(template)

# Generate variations
ps.configure(max_rows=2, variations_per_field=3)
variations = ps.generate(verbose=True)

# Export results
ps.export("output.json", format="json")

📚 Core Concepts

Templates

Templates control how prompts are structured and varied:

Key Description Example
instruction System prompt (optional) {placeholders} 'You are a helpful assistant. Answer the following questions about {subject}.''
prompt format Main template with {placeholders} 'Q: {question}\nA: {answer}'
gold Correct answer field 'answer' or {'field': 'answer', 'type': 'index'}
few_shot Few-shot configuration {'count': 2, 'format': 'shared_ordered_random_n', 'split': 'train'}

Variation Types

Type Description Requires API Key
paraphrase_with_llm AI-powered rephrasing
context Adds background context
format_structure Changes separators, casing, field connectors
typos and noise Injects typos, capitalization changes, spacing, character swaps, and punctuation noise
shuffle Reorders list items
enumerate Adds numbering (1. 2. 3.)

This template demonstrates how to use all the main keys for maximum flexibility and clarity. You can import these keys from promptsuite.core.template_keys to avoid typos and ensure consistency.

Template Format

Templates use Python f-string syntax with custom variation annotations:

"{instruction:semantic}: {few_shot}\n Question: {question:paraphrase_with_llm}\n Options: {options:non-semantic}"

System Prompt

  • instruction: (optional) A general instruction that appears at the top of every prompt, before any few-shot or main question. You can use placeholders (e.g., {subject}) that will be filled from the data for each row.
  • prompt format: The per-example template, usually containing the main question and placeholders for fields.

Supported Variation Types

  • paraphrase_with_llm - Paraphrasing variations (LLM-based)
  • format_structure - Semantic-preserving format changes (e.g., separators, casing, field connectors)
  • typos and noise - Injects typos, capitalization changes, spacing, character swaps, and punctuation noise
  • context - Context-based variations
  • shuffle - Shuffle options/elements (for multiple choice)
  • enumerate - Enumerate list fields (e.g., 1. 2. 3. 4., A. B. C. D., roman numerals, etc.) You can combine these augmenters in your template for richer prompt variations.

Template Format

Templates use a dictionary format with specific keys for different components:

template = {
  "instruction": "You are a helpful assistant. Please answer the following questions.",
  "instruction variations": ["paraphrase_with_llm"],
  "prompt format": "Q: {question}\nOptions: {options}\nA: {answer}",
  "prompt format variations": ["format structure"],
  "question": ["shuffle", "typos and noise"],
  "options": ["enumerate"],
  "gold": {
    'field': 'answer',
    'type': 'index',
    'options_field': 'options'
  },
  "few_shot": {
    'count': 2,
    'format': 'shared_ordered_random_n',
    'split': 'train'
  }
}

API Reference

PromptSuite Class

class PromptSuite:
    def __init__(self):
        """Initialize PromptSuite."""
        
    def load_dataframe(self, df: pd.DataFrame) -> None:
        """Load data from pandas DataFrame."""
        
    def load_csv(self, filepath: str, **kwargs) -> None:
        """Load data from CSV file."""
        
    def load_dataset(self, dataset_name: str, split: str = "train", **kwargs) -> None:
        """Load data from HuggingFace datasets."""
        
    def set_template(self, template_dict: Dict[str, Any]) -> None:
        """Set template configuration."""
        
    def configure(self, **kwargs) -> None:
        """Configure generation parameters."""
        
    def generate(self, verbose: bool = False) -> List[Dict[str, Any]]:
        """Generate prompt variations."""
        
    def export(self, filepath: str, format: str = "json") -> None:
        """Export variations to file."""

Examples

Sentiment Analysis

import pandas as pd
from promptsuite import PromptSuite

data = pd.DataFrame({
  'text': ['I love this movie!', 'This book is terrible.'],
  'label': ['positive', 'negative']
})

template = {
  'instruction': 'Classify the sentiment',
  'instruction_variations': ['paraphrase_with_llm'],
  'prompt format': f"Text: {text}\nSentiment: {label}",
  'text': ['typos and noise'],
}

ps = PromptSuite()
ps.load_dataframe(data)
ps.set_template(template)
ps.configure(
  variations_per_field=3,
  max_variations_per_row=2,
  random_seed=42,
  api_platform="TogetherAI",
  model_name="meta-llama/Llama-3.3-70B-Instruct-Turbo-Free"
)
variations = ps.generate(verbose=True)

Question Answering with Few-shot

template = {
  'instruction': 'Answer the question:\nQuestion: {question}\nAnswer: {answer}',
  'instruction_variations': ['paraphrase_with_llm'],
  'question': ['semantic'],
  'gold': 'answer',
  'few_shot': {
    'count': 2,
    'format': 'shared_ordered_random_n',
    'split': 'train'
  }
}

ps = PromptSuite()
ps.load_dataframe(qa_data)
ps.set_template(template)
ps.configure(
  variations_per_field=2,
  api_platform="TogetherAI",
  model_name="meta-llama/Llama-3.3-70B-Instruct-Turbo-Free"
)
variations = ps.generate(verbose=True)

Multiple Choice with Few-shot

import pandas as pd
from promptsuite import PromptSuite

data = pd.DataFrame({
    'question': [
        'What is the largest planet in our solar system?',
        'Which chemical element has the symbol O?',
        'What is the fastest land animal?',
        'What is the smallest prime number?',
        'Which continent is known as the "Dark Continent"?'
    ],
    'options': [
        'Earth, Jupiter, Mars, Venus',
        'Oxygen, Gold, Silver, Iron',
        'Lion, Cheetah, Horse, Leopard',
        '1, 2, 3, 0',
        'Asia, Africa, Europe, Australia'
    ],
    'answer': [1, 0, 1, 1, 1],
    'subject': ['Astronomy', 'Chemistry', 'Biology', 'Mathematics', 'Geography']
})

template = {
    'prompt format': 'Question: {question}\nOptions: {options}\nAnswer:',
    'prompt format variations': ['format structure'],
    'options': ['shuffle', 'enumerate'],
    'gold': {
        'field': 'answer',
        'type': 'index',
        'options_field': 'options'
    },
    'few_shot': {
        'count': 2,
        'format': 'shared_ordered_random_n',
        'split': 'train'
    }
}

ps = PromptSuite()
ps.load_dataframe(data)
ps.set_template(template)
ps.configure(max_rows=5, variations_per_field=1)
variations = ps.generate(verbose=True)
for v in variations:
    print(v['prompt'])

Example Output Format

A typical output from ps.generate() or the exported JSON file looks like this (for a multiple choice template):

[
  {
    "prompt": "Answer the following multiple choice question:\nQuestion: What is 2+2?\nOptions: 3, 4, 5, 6\nAnswer:",
    "original_row_index": 1,
    "variation_count": 1,
    "template_config": {
      "instruction": "Answer the following multiple choice question:\nQuestion: {question}\nOptions: {options}\nAnswer: {answer}",
      "options": ["shuffle"],
      "gold": {
        "field": "answer",
        "type": "index",
        "options_field": "options"
      },
      "few_shot": {
        "count": 1,
        "format": "shared_ordered_random_n",
        "split": "train"
      }
    },
    "field_values": {
      "options": "3, 4, 5, 6"
    },
    "gold_updates": {
      "answer": "1"
    },
    "conversation": [
      {
        "role": "user",
        "content": "Answer the following multiple choice question:\nQuestion: What is 2+2?\nOptions: 3, 4, 5, 6\nAnswer:"
      },
      {
        "role": "assistant",
        "content": "1"
      },
      {
        "role": "user",
        "content": "Answer the following multiple choice question:\nQuestion: What is the capital of France?\nOptions: London, Berlin, Paris, Madrid\nAnswer:"
      }
    ]
  }
]

📖 Detailed Guide

Data Loading

# CSV
ps.load_csv('data.csv')

# JSON
ps.load_json('data.json')

# HuggingFace
ps.load_dataset('squad', split='train[:100]')

# DataFrame
ps.load_dataframe(df)

Generation Options

ps.configure(
    max_rows=10,                    # How many data rows to use
    variations_per_field=3,         # Variations per field (default: 3)
    max_variations_per_row=50,      # Cap on total variations per row
    random_seed=42,                 # For reproducibility
    api_platform="TogetherAI",      # or "OpenAI"
    model_name="meta-llama/Llama-3.3-70B-Instruct-Turbo-Free"
)

Export Formats

# JSON - Full data with metadata
ps.export("output.json", format="json")

# CSV - Flattened for spreadsheets
ps.export("output.csv", format="csv")

# TXT - Plain prompts only
ps.export("output.txt", format="txt")

Web UI Interface

PromptSuite 2.0 includes a modern, interactive web interface built with Streamlit.

The UI guides you through a simple 3-step workflow:

  1. Upload Data: Load your dataset (CSV/JSON) or use built-in samples. Preview and validate your data before continuing.
  2. Build Template: Create or select a prompt template, with smart suggestions based on your data. See a live preview of your template.
  3. Generate & Export: Configure generation settings, run the variation process, and export your results in various formats.

The Streamlit UI is the easiest way to explore, test, and generate prompt variations visually.

🔧 Advanced Features

Performance Optimization

PromptSuite automatically optimizes performance by pre-generating variations for shared fields:

  • Instruction variations (instruction variations) are generated once and reused across all data rows
  • Prompt format variations (prompt format variations) are generated once and reused across all data rows

This optimization is especially important for LLM-based augmenters like paraphrase_with_llm that would otherwise run the same API calls repeatedly for identical text.

Gold Field Configuration

Simple format (for text answers):

'gold': 'answer'  # Just the column name

Advanced format (for index-based answers):

'gold': {
    'field': 'answer',
    'type': 'index',        # Answer is an index
    'options_field': 'options'  # Column with the options
}

Few-Shot Configuration

Few-shot examples can be configured with different sampling strategies:

Format Description Use Case
shared_ordered_first_n Always uses the first N examples from available data (deterministic, shared for all rows) When you want consistent, predictable examples
shared_ordered_random_n Always uses the same N random examples (with fixed seed, shared for all rows) When you want random but consistent examples across all rows
shared_unordered_random_n Always uses the same N random examples but shuffles their order for each row When you want consistent examples but varied order to reduce position bias
random_per_row Randomly samples different examples for each row (using row index as seed) When you want variety and different examples per question

Example:

"few_shot": {
    "count": 2,                    # Number of examples to use
    "format": "shared_ordered_random_n",   # Sampling strategy
    "split": "train"               # Use only training data for examples
}

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

License

MIT License - see LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

promptsuite-3.0.2.tar.gz (1.7 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

promptsuite-3.0.2-py3-none-any.whl (120.4 kB view details)

Uploaded Python 3

File details

Details for the file promptsuite-3.0.2.tar.gz.

File metadata

  • Download URL: promptsuite-3.0.2.tar.gz
  • Upload date:
  • Size: 1.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for promptsuite-3.0.2.tar.gz
Algorithm Hash digest
SHA256 bc6cc00fce41b84baf1361f38523ca0f2f90b5fcf8cad11fd081e343aa792690
MD5 e70040d39e5f4799a17707a0659c3958
BLAKE2b-256 f0d73d0e6301d9c0d9c225f51787c3a33e263b3ffa9de49f789daefaedc93fb5

See more details on using hashes here.

Provenance

The following attestation bundles were made for promptsuite-3.0.2.tar.gz:

Publisher: publish.yml on eliyahabba/PromptSuite

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file promptsuite-3.0.2-py3-none-any.whl.

File metadata

  • Download URL: promptsuite-3.0.2-py3-none-any.whl
  • Upload date:
  • Size: 120.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for promptsuite-3.0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 cf8ad5dbf65121b41feeedc9fbd3f8639a914c59db63bda26dd08400e46ce6c0
MD5 c48e1010d3fbb41855dd9e3b4f3c0787
BLAKE2b-256 621ac09bc59741e1a34fdc6975e8eca61e12953796bc6913804904d97c84cef3

See more details on using hashes here.

Provenance

The following attestation bundles were made for promptsuite-3.0.2-py3-none-any.whl:

Publisher: publish.yml on eliyahabba/PromptSuite

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page