Provenance tracking and compliance audit layer for AI-generated content.

ai-stamp

ai-stamp is a Python package for AI output provenance, audit trails, and compliance support. It wraps LLM calls, records what was generated, scans for PII, stores tamper-evident metadata, and helps verify later whether generated content has changed.

Use it when an application needs a lightweight audit layer around AI-generated content, such as summaries, assistant responses, document drafts, support answers, reports, or other LLM outputs.

What It Does

  • Wraps OpenAI, Anthropic, a generic JSON HTTP endpoint, or a Python callable.
  • Records app, feature, user, model, status, token counts, latency, and time.
  • Hashes prompts and responses for drift and tamper detection.
  • Signs provenance records with HMAC.
  • Verifies current text against stored provenance.
  • Detects common PII in prompts and responses.
  • Supports policy rules that can warn or block risky generations.
  • Stores audit records in SQLite or PostgreSQL.
  • Exports audit reports as JSON or CSV.
  • Provides CLI commands through both aistamp and provenance.
  • Ships type information and packaged Alembic migrations.

PII detection is best-effort only and is not a substitute for certified DLP, privacy review, or legal/compliance tooling. Stored PII evidence snippets keep the surrounding context but redact the sensitive spans that were recognized.

Install

pip install ai-stamp

For PostgreSQL support:

pip install "ai-stamp[postgres]"

For optional spaCy-based PERSON/ORG detection:

pip install "ai-stamp[nlp]"

For development:

pip install -e ".[dev,postgres,nlp]"

Quick Start

from aistamp import Config, ProvenanceClient, SQLiteBackend

config = Config(secret_key="replace-with-at-least-32-secret-characters")

backend = SQLiteBackend("sqlite:///./aistamp.db")
backend.create_tables()  # local development only

client = ProvenanceClient(
    lambda prompt: f"Generated answer for: {prompt}",
    config=config,
    app_id="my_app",
    feature_id="summarizer",
    user_id="user_42",
    backend=backend,
)

response = client.chat("Summarize this text", model="local-model")
print(response)

Each call stores a provenance record containing hashes, timing, model metadata, PII findings, policy results, and an HMAC signature.
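
To make the hashing and signing step concrete, here is a minimal standard-library sketch of how a record of this kind could be hashed and signed. The field names, the SHA-256 choice, the canonical-JSON serialization, and the helper names sha256_hex and sign_record are assumptions made for illustration; they are not taken from ai-stamp's implementation.

```python
import hashlib
import hmac
import json


def sha256_hex(text: str) -> str:
    """Return the hex SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sign_record(record: dict, secret_key: str) -> str:
    """Sign a record with HMAC-SHA256 (hypothetical layout, not ai-stamp's)."""
    # Canonical JSON: sorted keys and fixed separators, so the same fields
    # always serialize to the same bytes.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret_key.encode("utf-8"), payload, hashlib.sha256).hexdigest()


secret = "replace-with-at-least-32-secret-characters"
record = {
    "app_id": "my_app",
    "feature_id": "summarizer",
    "model": "local-model",
    "prompt_hash": sha256_hex("Summarize this text"),
    "response_hash": sha256_hex("Generated answer for: Summarize this text"),
}
signature = sign_record(record, secret)

# Changing any signed field produces a different signature.
tampered = {**record, "model": "other-model"}
print(hmac.compare_digest(signature, sign_record(record, secret)))    # True
print(hmac.compare_digest(signature, sign_record(tampered, secret)))  # False
```

Storing hashes rather than raw text keeps the record small, and hmac.compare_digest avoids leaking timing information when signatures are compared.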

Configuration

Config can be created directly, loaded from environment variables, or loaded from YAML.

Field          Environment variable    Default                   Notes
secret_key     AISTAMP_SECRET_KEY      required                  HMAC signing key, at least 32 characters
database_url   AISTAMP_DATABASE_URL    sqlite:///./aistamp.db    SQLAlchemy database URL
log_level      AISTAMP_LOG_LEVEL       INFO                      DEBUG, INFO, WARNING, ERROR

Example YAML:

secret_key: "a-secret-key-that-is-at-least-32-characters"
database_url: "sqlite:///./aistamp.db"
log_level: "INFO"
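
As an illustration of the environment-variable mapping in the table above, the following sketch reads the three variables with the documented defaults and enforces the 32-character minimum. The helper read_settings is hypothetical and is not part of ai-stamp; the package's own environment loader may validate differently.

```python
import os
from typing import Mapping

DEFAULT_DATABASE_URL = "sqlite:///./aistamp.db"
DEFAULT_LOG_LEVEL = "INFO"
VALID_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}


def read_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Collect ai-stamp settings from environment variables (hypothetical helper)."""
    secret_key = environ.get("AISTAMP_SECRET_KEY", "")
    if len(secret_key) < 32:
        raise ValueError("AISTAMP_SECRET_KEY must be set and at least 32 characters long")

    log_level = environ.get("AISTAMP_LOG_LEVEL", DEFAULT_LOG_LEVEL).upper()
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"Unsupported log level: {log_level}")

    return {
        "secret_key": secret_key,
        "database_url": environ.get("AISTAMP_DATABASE_URL", DEFAULT_DATABASE_URL),
        "log_level": log_level,
    }


# Example with an explicit mapping instead of the real process environment.
settings = read_settings({"AISTAMP_SECRET_KEY": "k" * 32})
print(settings["database_url"])  # sqlite:///./aistamp.db
print(settings["log_level"])     # INFO
```

Failing fast on a short or missing key is a deliberate choice in this sketch: a weak HMAC key would undermine every signature made with it.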

Provider Wrapping

OpenAI and Anthropic SDK-style clients are detected automatically when their common response shapes are returned. A plain Python callable can also be used for local models, tests, or custom generation code.
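
The sketch below shows one way shape-based detection can work: it pulls the text out of an OpenAI-style chat completion (choices[0].message.content), an Anthropic-style message (content[0].text), or a plain string. The function extract_text is a hypothetical stand-in written for this example; how ai-stamp itself inspects responses is not documented here.

```python
from types import SimpleNamespace


def extract_text(response) -> str:
    """Return the generated text from a few common response shapes (illustrative)."""
    if isinstance(response, str):
        # Plain callables can simply return a string.
        return response

    # OpenAI-style chat completion: response.choices[0].message.content
    choices = getattr(response, "choices", None)
    if choices:
        return choices[0].message.content

    # Anthropic-style message: response.content is a list of blocks with .text
    content = getattr(response, "content", None)
    if isinstance(content, list) and content and hasattr(content[0], "text"):
        return content[0].text

    raise TypeError(f"Unrecognized response shape: {type(response).__name__}")


# Stand-in objects that mimic the two SDK response shapes.
openai_like = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="hello from a chat completion"))]
)
anthropic_like = SimpleNamespace(content=[SimpleNamespace(text="hello from a message")])

print(extract_text(openai_like))
print(extract_text(anthropic_like))
print(extract_text("hello from a callable"))
```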

For arbitrary JSON-over-HTTP providers, use GenericHTTPClient:

from aistamp import Config, GenericHTTPClient, ProvenanceClient

http_llm = GenericHTTPClient(
    "https://example.internal/generate",
    headers={"Authorization": "Bearer token"},
)

client = ProvenanceClient(
    http_llm,
    config=Config(secret_key="replace-with-at-least-32-secret-characters"),
    app_id="my_app",
    feature_id="assistant",
    user_id="user_42",
)

text = client.chat("Hello", model="internal-model")

The endpoint receives a JSON body with prompt and model fields. Its JSON response must contain either a text or a response field holding the generated text. prompt_tokens and response_tokens are optional integer fields.
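
For local testing, a stub endpoint that satisfies this request/response contract can be sketched with the standard library alone. Everything here is an assumption made for illustration: the POST method, the port, the echo behavior, the whitespace-based token counts, and the names build_reply, GenerateHandler, and serve.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_reply(request_body: bytes) -> bytes:
    """Turn a {"prompt", "model"} JSON request into a JSON reply with "text"."""
    payload = json.loads(request_body)
    prompt, model = payload["prompt"], payload["model"]
    text = f"[{model}] echo: {prompt}"
    reply = {
        "text": text,
        # Optional integer fields; whitespace word counts stand in for real tokens.
        "prompt_tokens": len(prompt.split()),
        "response_tokens": len(text.split()),
    }
    return json.dumps(reply).encode("utf-8")


class GenerateHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        body = build_reply(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the stub endpoint until interrupted."""
    HTTPServer(("127.0.0.1", port), GenerateHandler).serve_forever()
```

Calling serve() starts the stub on 127.0.0.1:8080; keeping build_reply separate from the handler makes the contract testable without opening a socket.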

Async Usage

import asyncio

from aistamp import AsyncProvenanceClient, AsyncSQLiteBackend, Config

async def generate(prompt: str) -> str:
    return f"Generated answer for: {prompt}"

async def main() -> None:
    # await is only valid inside a coroutine, so the setup lives in main().
    backend = AsyncSQLiteBackend("sqlite+aiosqlite:///./aistamp.db")
    await backend.create_tables()

    client = AsyncProvenanceClient(
        generate,
        config=Config(secret_key="replace-with-at-least-32-secret-characters"),
        app_id="my_app",
        feature_id="assistant",
        user_id="user_42",
        backend=backend,
    )

    response = await client.chat("Write a short summary", model="local-async")
    print(response)

asyncio.run(main())

PII Detection

Built-in detection covers:

  • Email addresses
  • US phone numbers
  • US SSNs
  • Luhn-valid 16-digit card numbers
  • API keys
  • IPv4 addresses
  • IPv6 addresses
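
The "Luhn-valid" qualifier on card numbers refers to the Luhn checksum, which filters out most random 16-digit strings. The sketch below implements the standard algorithm; the candidate regex and the helper names are assumptions for illustration and are not ai-stamp's actual detector.

```python
import re

# Hypothetical candidate pattern: exactly 16 consecutive digits.
CARD_CANDIDATE = re.compile(r"\b\d{16}\b")


def luhn_valid(number: str) -> bool:
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return 16-digit runs in text that also pass the Luhn check."""
    return [m.group() for m in CARD_CANDIDATE.finditer(text) if luhn_valid(m.group())]


print(luhn_valid("4111111111111111"))  # True (a well-known test number)
print(luhn_valid("1234567812345678"))  # False
print(find_card_numbers("paid with 4111111111111111, ref 1234567812345678"))
# ['4111111111111111']
```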

Custom PII patterns can be loaded from YAML:

patterns:
  - name: employee_id
    regex: "\\bEMP-[0-9]{6}\\b"
    pii_type: CUSTOM
    severity: MEDIUM

Load the patterns and pass them to scan_text:

from aistamp import load_patterns_from_yaml, scan_text

patterns = load_patterns_from_yaml("pii-patterns.yaml")
result = scan_text("Employee EMP-123456 requested access", extra_patterns=patterns)
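
To check a custom regex before adding it to the YAML file, it can be exercised directly with Python's re module. Note that "\\b" inside a double-quoted YAML string becomes the regex token \b (a word boundary). The redaction placeholder format below is an assumption for illustration, not the format ai-stamp stores.

```python
import re

# Same expression as the YAML example, written as a Python raw string.
EMPLOYEE_ID = re.compile(r"\bEMP-[0-9]{6}\b")

text = "Employee EMP-123456 requested access"

print(EMPLOYEE_ID.findall(text))                    # ['EMP-123456']
print(EMPLOYEE_ID.findall("Ticket EMP-1234567"))    # [] (seven digits, no boundary after six)

# Hypothetical redaction: replace each match with a placeholder.
print(EMPLOYEE_ID.sub("[CUSTOM]", text))            # Employee [CUSTOM] requested access
```

The trailing \b matters: without it, the pattern would match the first six digits of a longer number.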

Policy Rules

Policy rules can warn or block records based on supported conditions such as model tier and PII severity.

model_tiers:
  approved:
    - gpt-4o
    - internal-safe-model

rules:
  - name: block_high_pii
    action: BLOCK
    reason: "High-severity PII is not allowed"
    conditions:
      pii_severity: HIGH

  - name: warn_unapproved_model
    action: WARN
    reason: "Model is not in the approved tier"
    conditions:
      model_tier: unapproved

Load the rules into a PolicyEngine:

from aistamp import PolicyEngine

policy = PolicyEngine.from_yaml("policy.yaml")
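
To show how rules like these could be evaluated, here is a small standalone sketch. It is not the PolicyEngine implementation: the exact-match comparison on pii_severity, the "unapproved" fallback tier for models missing from every list, the ALLOW decision name, and the BLOCK-over-WARN precedence are all assumptions.

```python
from typing import Optional

MODEL_TIERS = {"approved": ["gpt-4o", "internal-safe-model"]}

RULES = [
    {"name": "block_high_pii", "action": "BLOCK", "conditions": {"pii_severity": "HIGH"}},
    {"name": "warn_unapproved_model", "action": "WARN", "conditions": {"model_tier": "unapproved"}},
]


def tier_for(model: str) -> str:
    """Return the tier that lists the model, or 'unapproved' if none does."""
    for tier, models in MODEL_TIERS.items():
        if model in models:
            return tier
    return "unapproved"


def evaluate(model: str, pii_severity: Optional[str]) -> tuple[str, list[str]]:
    """Return (decision, names of matched rules) for one generation."""
    facts = {"model_tier": tier_for(model), "pii_severity": pii_severity}
    # A rule matches when every one of its conditions equals the observed fact.
    matched = [
        rule for rule in RULES
        if all(facts.get(key) == value for key, value in rule["conditions"].items())
    ]
    actions = {rule["action"] for rule in matched}
    # The strictest matched action wins.
    decision = "BLOCK" if "BLOCK" in actions else "WARN" if "WARN" in actions else "ALLOW"
    return decision, [rule["name"] for rule in matched]


print(evaluate("gpt-4o", None))         # ('ALLOW', [])
print(evaluate("gpt-4o", "HIGH"))       # ('BLOCK', ['block_high_pii'])
print(evaluate("other-model", "LOW"))   # ('WARN', ['warn_unapproved_model'])
```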

Database Setup

SQLite is suitable for local development and simple deployments:

from aistamp import SQLiteBackend

backend = SQLiteBackend("sqlite:///./aistamp.db")
backend.create_tables()

For deployed databases, use the packaged Alembic migrations:

macOS / Linux

export AISTAMP_SECRET_KEY="a-secret-key-that-is-at-least-32-characters"
export AISTAMP_DATABASE_URL="postgresql://user:pass@host/db"
aistamp migrate

Windows (Command Prompt)

set AISTAMP_SECRET_KEY=a-secret-key-that-is-at-least-32-characters
set AISTAMP_DATABASE_URL=postgresql://user:pass@host/db
aistamp migrate

Windows (PowerShell)

$env:AISTAMP_SECRET_KEY="a-secret-key-that-is-at-least-32-characters"
$env:AISTAMP_DATABASE_URL="postgresql://user:pass@host/db"
aistamp migrate

Use SQLiteBackend.create_tables() only in development or tests.

CLI

Both aistamp and provenance run the same CLI.

provenance config-check
provenance audit --content-id abc123
provenance verify --content-id abc123 --text "current text here"
provenance report --from 2024-01-01 --to 2024-06-30
provenance report --model gpt-4o --pii-severity HIGH --format json
provenance report --policy-decision BLOCK --format csv
provenance scan --file output.txt
provenance migrate

A date-only --to value includes that entire UTC calendar day, so --to 2024-06-30 covers records up to the end of 30 June 2024 UTC.
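
One common way to implement an inclusive date-only upper bound is to convert it to an exclusive bound at the following UTC midnight. The sketch below illustrates that technique; the helper names are hypothetical and the CLI's internal logic may differ.

```python
from datetime import date, datetime, time, timedelta, timezone


def exclusive_upper_bound(to_date: str) -> datetime:
    """Midnight UTC at the start of the day after to_date."""
    day = date.fromisoformat(to_date)
    return datetime.combine(day + timedelta(days=1), time.min, tzinfo=timezone.utc)


def in_range(timestamp: datetime, from_date: str, to_date: str) -> bool:
    """True if timestamp falls within [from_date 00:00 UTC, day after to_date 00:00 UTC)."""
    start = datetime.combine(date.fromisoformat(from_date), time.min, tzinfo=timezone.utc)
    return start <= timestamp < exclusive_upper_bound(to_date)


last_second = datetime(2024, 6, 30, 23, 59, 59, tzinfo=timezone.utc)
next_day = datetime(2024, 7, 1, 0, 0, 0, tzinfo=timezone.utc)

print(in_range(last_second, "2024-01-01", "2024-06-30"))  # True
print(in_range(next_day, "2024-01-01", "2024-06-30"))     # False
```

Using a half-open interval avoids having to pick a "last instant of the day" such as 23:59:59.999999.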

Audit And Verification

To check whether the current text still matches the stored provenance record:

from aistamp import Config, SQLiteBackend, verify_record

config = Config(secret_key="replace-with-at-least-32-secret-characters")
backend = SQLiteBackend("sqlite:///./aistamp.db")

result = verify_record(
    content_id="abc123",
    current_text="current text here",
    backend=backend,
    secret_key=config.secret_key,
)

print(result.verified)
print(result.drift_detected)
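
Conceptually, a check like this has two parts: confirming that the stored record itself has not been altered, and comparing the current text against the recorded hash. The sketch below separates those two questions using only the standard library. The record layout, the CheckResult fields, and the helper names are assumptions; they do not describe what verify_record does internally or how its verified and drift_detected fields are defined.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass


@dataclass
class CheckResult:
    signature_ok: bool    # the stored record matches its HMAC signature
    drift_detected: bool  # the current text differs from the recorded text


def _sign(record: dict, secret_key: str) -> str:
    """HMAC-SHA256 over canonical JSON (hypothetical record format)."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret_key.encode("utf-8"), payload, hashlib.sha256).hexdigest()


def check_text(record: dict, signature: str, current_text: str, secret_key: str) -> CheckResult:
    """Check record integrity first, then compare the text hash."""
    signature_ok = hmac.compare_digest(signature, _sign(record, secret_key))
    current_hash = hashlib.sha256(current_text.encode("utf-8")).hexdigest()
    return CheckResult(signature_ok, current_hash != record["response_hash"])


secret = "replace-with-at-least-32-secret-characters"
stored = {
    "content_id": "abc123",
    "response_hash": hashlib.sha256(b"original text").hexdigest(),
}
stored_signature = _sign(stored, secret)

print(check_text(stored, stored_signature, "original text", secret))
# CheckResult(signature_ok=True, drift_detected=False)
print(check_text(stored, stored_signature, "current text here", secret))
# CheckResult(signature_ok=True, drift_detected=True)
```

Keeping the two results separate distinguishes "the content was edited after generation" from "the audit record itself was modified".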

Package Contents

The distribution includes:

  • aistamp Python package
  • py.typed marker for type checkers
  • CLI entry points: aistamp, provenance
  • Alembic migration files
  • MIT license
  • Tests in the source distribution

Development And Verification

pytest
ruff check aistamp tests
mypy aistamp
python -m build
python -m twine check dist/*

PostgreSQL integration tests require a live test database:

macOS / Linux

export AISTAMP_TEST_POSTGRES_URL="postgresql://user:pass@localhost:5432/aistamp_test"
pytest tests/test_store_postgres.py -rs

Windows (Command Prompt)

set AISTAMP_TEST_POSTGRES_URL=postgresql://user:pass@localhost:5432/aistamp_test
pytest tests/test_store_postgres.py -rs

Windows (PowerShell)

$env:AISTAMP_TEST_POSTGRES_URL="postgresql://user:pass@localhost:5432/aistamp_test"
pytest tests/test_store_postgres.py -rs

The PostgreSQL extra installs psycopg2-binary and asyncpg.

Status

ai-stamp is currently marked Alpha. It is intended for developers who need a small provenance and audit layer around AI-generated content. Review privacy, security, retention, and compliance requirements before using it in regulated production workflows.

License

MIT
