
Persona-aligned evaluation toolkit for auditing conversational AI authenticity, safety, and stability.


Alignmenter

Persona-aligned evaluation for conversational AI

📚 Documentation • Quickstart • Quick Start Guide • CLI Reference • License


Overview

Alignmenter is a lightweight, production-ready evaluation toolkit for auditing conversational AI systems across three core dimensions:

  • 🎨 Authenticity: Does the assistant stay on-brand?
  • 🛡️ Safety: Does it avoid harmful or policy-violating outputs?
  • ⚖️ Stability: Are responses consistent across sessions?

Unlike generic LLM evaluation frameworks, Alignmenter is purpose-built for persona alignment: ensuring your AI assistant speaks with your unique voice while staying safe and stable.

Quickstart

Installation

# Clone the repository
git clone https://github.com/justinGrosvenor/alignmenter.git
cd alignmenter

# Create virtual environment
python -m venv env
source env/bin/activate  # On Windows: env\Scripts\activate

# Install with dev + safety extras
pip install -e ".[dev,safety]"

Install from PyPI

pip install "alignmenter[safety]"
alignmenter init
alignmenter run --config configs/run.yaml --embedding sentence-transformer:all-MiniLM-L6-v2

Note: The safety extra includes transformers for the offline safety classifier (ProtectAI/distilled-safety-roberta). Without it, Alignmenter falls back to a lightweight heuristic classifier. See docs/offline_safety.md for details.

Run Your First Evaluation

# Set API key (for embedding and judge models)
export OPENAI_API_KEY="your-key-here"

# Run evaluation on demo dataset (regenerates transcripts via the provider)
alignmenter run \
  --model openai:gpt-4o-mini \
  --dataset datasets/demo_conversations.jsonl \
  --persona configs/persona/default.yaml

# Reuse existing transcripts without hitting the provider
alignmenter run --config configs/run.yaml

# View interactive HTML report
alignmenter report --last

# Sanitize a dataset in-place or to a new file
alignmenter dataset sanitize datasets/demo_conversations.jsonl --out datasets/demo_sanitized.jsonl

# Generate fresh transcripts (requires provider access)
alignmenter run --config configs/run.yaml --generate-transcripts

Output:

Loading dataset: 60 turns across 10 sessions
✓ Brand voice score: 0.82 (range: 0.78-0.86)
✓ Safety score: 0.97
✓ Consistency score: 0.94
Report written to: reports/demo/2025-11-03T00-14-01_alignmenter_run/index.html
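Datasets are plain JSONL with one conversation turn per line. The field names below (`session_id`, `role`, `text`) are illustrative assumptions for this sketch, not necessarily Alignmenter's canonical schema; see the demo files in `datasets/` for the real format.

```python
import json

# Illustrative JSONL records: one turn per line, grouped by session.
# Field names are assumptions for this sketch, not the shipped schema.
raw = """\
{"session_id": "s1", "role": "user", "text": "Hi, can you summarize our Q3 results?"}
{"session_id": "s1", "role": "assistant", "text": "Our baseline analysis shows a 15% lift."}
{"session_id": "s2", "role": "user", "text": "What's the weather like?"}
"""

def load_sessions(lines):
    """Group turns by session_id, preserving turn order."""
    sessions = {}
    for line in lines.splitlines():
        if not line.strip():
            continue
        turn = json.loads(line)
        sessions.setdefault(turn["session_id"], []).append(turn)
    return sessions

sessions = load_sessions(raw)
print(len(sessions))        # 2 sessions
print(len(sessions["s1"]))  # 2 turns
```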

Features

🎯 Three-Dimensional Scoring

Authenticity

  • Embedding similarity: Measures semantic alignment with persona examples
  • Trait model: Logistic regression on linguistic features (trained via calibration)
  • Lexicon matching: Enforces preferred/avoided vocabulary
  • Bootstrap CI: Statistical confidence intervals for reliability
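The three authenticity signals above are blended into one score. A minimal sketch of that blend, with illustrative weights rather than Alignmenter's calibrated defaults:

```python
def authenticity_score(embedding_sim, trait_prob, lexicon_score,
                       weights=(0.5, 0.3, 0.2)):
    """Weighted blend of the three authenticity signals.

    All inputs are assumed to lie in [0, 1]; the weights here are
    illustrative, not Alignmenter's calibrated values.
    """
    w_embed, w_trait, w_lex = weights
    return w_embed * embedding_sim + w_trait * trait_prob + w_lex * lexicon_score

score = authenticity_score(embedding_sim=0.84, trait_prob=0.72, lexicon_score=0.9)
print(round(score, 3))  # 0.816
```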

Safety

  • Keyword classifier: Fast pattern matching for common violations
  • LLM judge: GPT-4 as a safety oracle with budget controls
  • Offline classifier: ProtectAI's distilled-safety-roberta (no API calls)
  • Fused scoring: Weighted ensemble of rule-based + model-based signals
  • Adversarial testing: Built-in safety traps in demo datasets
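Fused scoring can be sketched as a weighted ensemble over the rule-based and model-based signals. The weights and the renormalization rule below are assumptions for illustration, not the shipped fusion logic:

```python
def fused_safety(keyword_flagged, model_unsafe_prob, judge_unsafe_prob=None,
                 weights=(0.3, 0.4, 0.3)):
    """Combine rule-based and model-based signals into one safety score.

    Returns a value in [0, 1] where 1.0 is fully safe. The weighting
    scheme is a sketch; Alignmenter's actual fusion may differ.
    """
    w_kw, w_model, w_judge = weights
    unsafe = w_kw * (1.0 if keyword_flagged else 0.0) + w_model * model_unsafe_prob
    if judge_unsafe_prob is None:
        # Renormalize when the LLM judge was skipped (e.g. budget exhausted).
        unsafe /= (w_kw + w_model)
    else:
        unsafe += w_judge * judge_unsafe_prob
    return 1.0 - unsafe

print(round(fused_safety(False, 0.05, 0.02), 3))  # 0.974
```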

Stability

  • Cosine variance: Detects semantic drift across conversation turns
  • Session clustering: Identifies divergent response patterns
  • Temporal analysis: Tracks consistency over time
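The cosine-variance idea can be sketched as: embed each response, take consecutive-turn cosine similarities, and measure their variance. This illustrates the concept, not Alignmenter's exact metric:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def drift_variance(embeddings):
    """Variance of consecutive-turn cosine similarities.

    High variance suggests responses drift semantically within a
    session. A sketch of the idea, not Alignmenter's implementation.
    """
    sims = [cosine(a, b) for a, b in zip(embeddings, embeddings[1:])]
    mean = sum(sims) / len(sims)
    return sum((s - mean) ** 2 for s in sims) / len(sims)

# Three toy response embeddings: the last one veers off-topic.
turns = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.1], [0.0, 1.0, 0.0]]
print(drift_variance(turns) > 0)  # True: similarities are uneven
```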

📊 Rich Reporting

  • Interactive HTML: Grade-based report cards with charts (Chart.js)
  • JSON export: Machine-readable results for CI/CD pipelines
  • CSV downloads: Per-metric exports for spreadsheet analysis
  • Turn-level explorer: Drill down into individual responses

🔧 Production-Ready

  • Multi-provider support: OpenAI, Anthropic, local (vLLM, Ollama)
  • Budget guardrails: Halt runs at 90% of judge API budget
  • Cost projection: Estimate expenses before execution
  • Reproducibility: Logs Python version, model, seed, timestamps
  • PII sanitization: Built-in scrubbing for production data
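The halt-at-90% rule can be sketched as a small spend tracker. The class and the per-call cost below are hypothetical; the real guardrail lives in Alignmenter's runner and may track cost differently:

```python
class JudgeBudget:
    """Halt judge calls once spend reaches 90% of the configured budget.

    A sketch of the guardrail described above, not the shipped code.
    """
    def __init__(self, budget_usd, halt_fraction=0.9):
        self.budget = budget_usd
        self.halt_at = budget_usd * halt_fraction
        self.spent = 0.0

    def charge(self, cost_usd):
        self.spent += cost_usd

    @property
    def exhausted(self):
        return self.spent >= self.halt_at

budget = JudgeBudget(budget_usd=10.0)
for _ in range(100):
    if budget.exhausted:
        break
    budget.charge(0.25)  # hypothetical per-call judge cost
print(budget.spent)  # 9.0: halted at 90% of the $10 budget
```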

🚀 Developer Experience

  • CLI-first: Simple commands for evaluation, calibration, reporting
  • YAML configuration: Declarative persona packs and run configs
  • Python API: Programmatic access for custom workflows
  • Comprehensive tests: 69+ unit tests with pytest
  • Type safety: Full type hints throughout

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        Alignmenter CLI                          │
│  alignmenter run / report / calibrate / bootstrap / sanitize    │
└─────────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                          Runner                                 │
│  Orchestrates evaluation: load data → score → report            │
└─────────────────────────────────────────────────────────────────┘
                               │
            ┌──────────────────┼──────────────────┐
            ▼                  ▼                  ▼
    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
    │ Authenticity │    │    Safety    │    │  Stability   │
    │    Scorer    │    │    Scorer    │    │    Scorer    │
    └──────────────┘    └──────────────┘    └──────────────┘
            │                  │                  │
            │                  │                  │
    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
    │  Embeddings  │    │  LLM Judge   │    │   Cosine     │
    │  Trait Model │    │  Keywords    │    │  Variance    │
    │   Lexicon    │    │  Fusion      │    │  Clustering  │
    └──────────────┘    └──────────────┘    └──────────────┘
                               │
                               ▼
            ┌───────────────────────────────────────┐
            │           Reporting Layer             │
            │  HTML / JSON / CSV / Interactive UI   │
            └───────────────────────────────────────┘

Key Components

| Component | Purpose                 | Key Files                   |
|-----------|-------------------------|-----------------------------|
| CLI       | Command-line interface  | src/alignmenter/cli.py      |
| Runner    | Orchestration engine    | src/alignmenter/runner.py   |
| Scorers   | Metric computation      | src/alignmenter/scorers/    |
| Providers | LLM/embedding backends  | src/alignmenter/providers/  |
| Reporters | Output generation       | src/alignmenter/reporting/  |
| Datasets  | JSONL conversation data | datasets/                   |
| Personas  | Brand voice definitions | configs/persona/            |

📚 Documentation

Full documentation available at docs.alignmenter.com

Case Studies

  • Wendy's Twitter Voice - End-to-end calibration example using the included case-study assets. (Available when running from the source repo; not included in the PyPI wheel.)

Usage Examples

Evaluate Multiple Models

# Compare GPT-4 vs Claude
alignmenter run \
  --model openai:gpt-4 \
  --compare anthropic:claude-3-5-sonnet-20241022 \
  --dataset datasets/demo_conversations.jsonl \
  --persona configs/persona/default.yaml

Custom Judge and Embeddings

# Use Claude as safety judge, local embeddings
alignmenter run \
  --model openai:gpt-4o-mini \
  --judge anthropic:claude-3-5-sonnet-20241022 \
  --embedding sentence-transformer:all-MiniLM-L6-v2 \
  --dataset datasets/demo_conversations.jsonl \
  --persona configs/persona/default.yaml

Bootstrap Synthetic Dataset

# Generate 50 conversations with adversarial traps
alignmenter bootstrap-dataset \
  --out datasets/my_test.jsonl \
  --sessions 50 \
  --safety-trap-ratio 0.15 \
  --brand-trap-ratio 0.20 \
  --seed 42

Calibrate Persona Traits

# Train trait model from labeled data
alignmenter calibrate-persona \
  --persona-path configs/persona/mybot.yaml \
  --dataset annotations.jsonl \
  --out configs/persona/mybot.traits.json \
  --epochs 300
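Under the hood, calibration trains a logistic regression on linguistic features. A pure-Python sketch of the idea with a toy bag-of-words featurizer; Alignmenter's actual feature set and trainer are more elaborate:

```python
import math

def featurize(text, vocabulary):
    """Bag-of-words counts over a fixed vocabulary (a toy feature set)."""
    tokens = text.lower().split()
    return [tokens.count(term) for term in vocabulary]

def train_logistic(X, y, epochs=300, lr=0.5):
    """Plain gradient-descent logistic regression, standing in for the
    calibration step above."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

vocab = ["baseline", "signal", "lol", "vibes"]
texts = ["our baseline signal is strong", "baseline analysis complete",
         "lol good vibes only", "vibes are off lol"]
labels = [1, 1, 0, 0]  # 1 = on-brand, 0 = off-brand
X = [featurize(t, vocab) for t in texts]
w, b = train_logistic(X, labels)

# Score an unseen on-brand sentence.
z = sum(wj * xj for wj, xj in zip(w, featurize("baseline signal check", vocab))) + b
print(1.0 / (1.0 + math.exp(-z)) > 0.5)  # True: scored on-brand
```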

Sanitize Production Data

# Remove PII before evaluation
alignmenter dataset sanitize prod_logs.jsonl \
  --out datasets/sanitized.jsonl \
  --no-use-hashing

Persona Configuration

Define your brand voice in YAML:

# configs/persona/mybot.yaml
id: mybot
name: "MyBot Assistant"
description: "Professional, evidence-driven, technical"

voice:
  tone: ["professional", "precise", "measured"]
  formality: "business_casual"

  # Preferred vocabulary
  lexicon:
    preferred:
      - "baseline"
      - "signal"
      - "alignment"
      - "evidence-based"
    avoided:
      - "lol"
      - "bro"
      - "hype"
      - "vibes"

# Example on-brand responses (for embedding similarity)
examples:
  - "Our baseline analysis indicates a 15% improvement in alignment metrics."
  - "The signal-to-noise ratio suggests this approach is viable."
  - "Let's establish a clear baseline before proceeding."

# Trait model weights (generated by calibration)
traits:
  weights: [0.12, -0.34, 0.08, ...]  # Learned from annotations
  vocabulary: ["baseline", "signal", ...]
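Lexicon matching rewards preferred terms and penalizes avoided ones. One possible scoring rule, as a hedged sketch: the `penalty` weight, clamping, and neutral fallback are assumptions, not Alignmenter's implementation.

```python
def lexicon_score(text, preferred, avoided, penalty=2.0):
    """Score vocabulary adherence in [0, 1]; 0.5 is neutral.

    Rewards preferred-term hits, penalizes avoided-term hits. The
    exact rule is illustrative, not the shipped metric.
    """
    tokens = text.lower().split()
    hits = sum(tokens.count(term) for term in preferred)
    misses = sum(tokens.count(term) for term in avoided)
    total = hits + misses
    if total == 0:
        return 0.5  # neutral: no lexicon terms present
    raw = hits - penalty * misses
    return max(0.0, min(1.0, 0.5 + raw / (2 * total)))

preferred = ["baseline", "signal", "alignment"]
avoided = ["lol", "bro", "hype", "vibes"]
print(lexicon_score("our baseline signal shows strong alignment",
                    preferred, avoided))  # 1.0
print(lexicon_score("lol the vibes are off bro", preferred, avoided))  # 0.0
```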

API Usage

from alignmenter.runner import Runner
from alignmenter.config import RunConfig

# Load configuration
config = RunConfig.from_yaml("configs/run/my_eval.yaml")

# Execute evaluation
runner = Runner(config)
results = runner.execute()

# Access scores
print(f"Authenticity: {results['scores']['authenticity']['mean']:.3f}")
print(f"Safety: {results['scores']['safety']['fused_judge']:.3f}")
print(f"Stability: {results['scores']['stability']['session_variance']:.3f}")

# Generate reports
from alignmenter.reporting import HTMLReporter, JSONReporter

html_reporter = HTMLReporter()
html_reporter.write(
    run_dir=results["run_dir"],
    summary=results["summary"],
    scores=results["scores"],
    sessions=results["sessions"],
)
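A common follow-up is gating a pipeline on these scores. A small sketch, assuming you have flattened the per-dimension scores into a dict; the threshold values are illustrative, not recommendations:

```python
# Gate a CI pipeline on evaluation scores. Thresholds are illustrative;
# pick values suited to your persona and risk tolerance.
THRESHOLDS = {"authenticity": 0.75, "safety": 0.95, "stability": 0.90}

def gate(scores, thresholds=THRESHOLDS):
    """Return the list of dimensions that fell below their threshold."""
    failures = []
    for dim, floor in thresholds.items():
        if scores[dim] < floor:
            failures.append(f"{dim}: {scores[dim]:.2f} < {floor:.2f}")
    return failures

# Example using the headline numbers from the demo run above.
failures = gate({"authenticity": 0.82, "safety": 0.97, "stability": 0.94})
print(failures)  # []: all dimensions pass
```

Exiting nonzero when `failures` is non-empty makes the evaluation a hard check in CI rather than an advisory report.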

CI/CD Integration

# .github/workflows/eval.yml
name: Persona Evaluation

on: [push, pull_request]

jobs:
  evaluate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Install Alignmenter
        run: pip install alignmenter

      - name: Run Evaluation
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          alignmenter run \
            --model openai:gpt-4o-mini \
            --dataset datasets/ci_test.jsonl \
            --persona configs/persona/default.yaml \
            --judge-budget 100

      - name: Upload Report
        uses: actions/upload-artifact@v3
        with:
          name: evaluation-report
          path: reports/

Development

Running Tests

# All tests
pytest

# With coverage
pytest --cov=src/alignmenter --cov-report=html

# Specific test file
pytest tests/test_scorers.py -v

Code Quality

# Type checking
mypy src/

# Linting
ruff check src/

# Formatting
black src/ tests/

Local Development

# Install in editable mode with dev dependencies
pip install -e .[dev]

# Run from source
python -m alignmenter.cli run --help

# Generate report from last run
make report-last

Roadmap

Completed ✅

  • Three-dimensional scoring (authenticity, safety, stability)
  • Multi-provider support (OpenAI, Anthropic, local models)
  • HTML report cards with interactive charts
  • Offline safety classifier (distilled-safety-roberta)
  • LLM judges for qualitative analysis
  • Budget guardrails and cost tracking
  • PII sanitization tools
  • Calibration workflow and diagnostics

In Progress 🚧

  • Multi-language support (non-English personas)
  • Batch processing optimizations
  • Additional embedding providers

Future Considerations 💭

  • Synthetic test case generation
  • Custom metric plugins
  • Advanced trait models (neural networks)

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Areas we'd love help with:

  • Additional persona packs (different brand voices)
  • Language support beyond English
  • Integration with other LLM providers
  • Performance optimizations for large datasets

License

Apache License 2.0 - see LICENSE for details.

Citation

If you use Alignmenter in research, please cite:

@software{alignmenter2025,
  title={Alignmenter: A Framework for Persona-Aligned Conversational AI Evaluation},
  author={Alignmenter Contributors},
  year={2025},
  url={https://github.com/justinGrosvenor/alignmenter},
  license={Apache-2.0}
}


Made with ❤️ by the Alignmenter team
