Black-box LLM fingerprinting system for model identification

LLM Fingerprinting System

A black-box fingerprinting system that identifies the model family behind an LLM (GPT, LLaMA, Mistral, etc.) by analysing response patterns across 31 carefully selected prompts. It can also identify fine-tuned models, tracing them back to their base model.

Note: Check config.py to see all identifiable model families.

A pre-trained classifier is bundled with the package in the model/ directory.

How It Works

Fingerprinting runs in three sequential layers:

- 31 prompts across 3 layers (discriminative → behavioral → stylistic):
  - Discriminative (11): Identity, knowledge cutoff, architecture, reasoning (the most separating power)
  - Behavioral (7): Safety boundaries, jailbreak resistance, honesty, policy handling
  - Stylistic (13): Formatting, creativity, constraint following, default voice
- Feature extraction per response: 384-dim sentence embeddings + 12 linguistic features + 6 behavioral features = 402 dims per layer, 1206 dims total
- Embedding rebalancing: Per-layer PCA compresses the 384-dim embeddings to 64 dims, giving (64 + 12 + 6) × 3 = 246 dims of working space
- Ensemble classification: Random Forest (45%) + SVM (45%) + MLP (10%)
- Two-stage identification: The ensemble predicts the model family, then a template classifier predicts the specific model version
- Early stopping: After each layer the classifier checks its confidence; if it exceeds the threshold (default 0.95), the remaining layers are skipped, saving API calls.
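The dimension arithmetic, the weighted vote, and the early-stopping rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not the package's implementation: the function names, the choice of weighted soft voting over class probabilities, and the toy probabilities are all assumptions. Only the weights, the dimensions, and the 0.95 default threshold come from the description above.

```python
# Illustrative sketch only: names and the soft-voting scheme are assumptions,
# not the package's actual API.

EMBED_DIM, LINGUISTIC, BEHAVIORAL, LAYERS = 384, 12, 6, 3
PCA_DIM = 64

raw_dims = (EMBED_DIM + LINGUISTIC + BEHAVIORAL) * LAYERS    # 402 per layer
working_dims = (PCA_DIM + LINGUISTIC + BEHAVIORAL) * LAYERS  # 82 per layer

WEIGHTS = {"random_forest": 0.45, "svm": 0.45, "mlp": 0.10}


def ensemble_vote(probs_by_model):
    """Weighted average of per-classifier class probabilities."""
    families = next(iter(probs_by_model.values())).keys()
    return {
        fam: sum(WEIGHTS[name] * probs[fam] for name, probs in probs_by_model.items())
        for fam in families
    }


def identify(layer_outputs, threshold=0.95):
    """Walk the layers in order; stop once the top confidence exceeds the threshold.

    layer_outputs holds one dict of per-classifier probabilities per layer.
    Returns (family, confidence, layers_used).
    """
    family, confidence, used = None, 0.0, 0
    for used, probs_by_model in enumerate(layer_outputs, start=1):
        combined = ensemble_vote(probs_by_model)
        family = max(combined, key=combined.get)
        confidence = combined[family]
        if confidence > threshold:
            break  # remaining layers (and their API calls) are skipped
    return family, confidence, used


# Toy probabilities for a confident first layer (made-up values).
layer = {
    "random_forest": {"gpt": 0.98, "llama": 0.02},
    "svm": {"gpt": 0.97, "llama": 0.03},
    "mlp": {"gpt": 0.90, "llama": 0.10},
}
print(raw_dims, working_dims)
print(identify([layer, layer, layer]))
```

With these toy values the combined confidence for `gpt` is 0.45 × 0.98 + 0.45 × 0.97 + 0.10 × 0.90 = 0.9675, so the loop stops after the first layer.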

Supported Backends

Backend	Description	API Key Required
`ollama`	Local Ollama instance	❌ No
`ollama-cloud`	Ollama Cloud API	✅ `OLLAMA_CLOUD_API_KEY`
`openai`	OpenAI API (or compatible)	✅ `OPENAI_API_KEY`
`gemini`	Gemini API	✅ `GEMINI_API_KEY`
`deepseek`	DeepSeek API	✅ `DEEPSEEK_API_KEY`
`custom`	Any HTTP-based LLM API	✅ Optional

About the Custom Backend

The custom backend is the most flexible option — use it with:

- Proprietary LLM APIs not natively supported
- Self-hosted LLMs behind HTTP endpoints
- API proxies and gateways
- Any HTTP-based LLM service

All you need is an HTTP request template file. See examples in ./example/.

Installation

From PyPI

# Core package
pip install llm-fingerprinter

# With OpenAI support (the quotes stop shells such as zsh from globbing the brackets)
pip install "llm-fingerprinter[openai]"

# With Gemini support
pip install "llm-fingerprinter[gemini]"

# With all backends
pip install "llm-fingerprinter[all]"

Quick Start

1. Identify a Model (Pre-trained Classifier)

# Local Ollama
llm-fingerprinter identify -b ollama --model llama3.2

# OpenAI
export OPENAI_API_KEY="your-key"
llm-fingerprinter identify -b openai --model gpt-4o-mini

# Custom endpoint
llm-fingerprinter identify -b custom -r ./custom_request.txt

2. Train Your Own Classifier

# Step 1: Generate training fingerprints for each family
#         Temperature is automatically varied across simulations for diversity
llm-fingerprinter simulate -b ollama --model llama3.2 --family llama --num-sims 5
llm-fingerprinter simulate -b openai --model gpt-4o-mini --family gpt --num-sims 5

# Step 2: Train the ensemble classifier
llm-fingerprinter train

# Step 3: Build template classifiers (for two-stage identification)
llm-fingerprinter build-templates
llm-fingerprinter build-model-templates

# Step 4: Identify unknown models
llm-fingerprinter identify -b ollama --model some-unknown-model
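When several families need to be simulated, steps 1 to 3 can be scripted. The sketch below only builds the command lines shown above and prints them; the `TARGETS` list and the `training_plan` helper are hypothetical names introduced for illustration, and only subcommands and flags that appear in this README are used.

```python
import shlex

# Hypothetical plan: (backend, model, family) triples to simulate.
TARGETS = [
    ("ollama", "llama3.2", "llama"),
    ("openai", "gpt-4o-mini", "gpt"),
]


def training_plan(targets, num_sims=5):
    """Return the CLI invocations for steps 1-3, in the order they should run."""
    plan = [
        ["llm-fingerprinter", "simulate", "-b", backend, "--model", model,
         "--family", family, "--num-sims", str(num_sims)]
        for backend, model, family in targets
    ]
    plan.append(["llm-fingerprinter", "train"])
    plan.append(["llm-fingerprinter", "build-templates"])
    plan.append(["llm-fingerprinter", "build-model-templates"])
    return plan


for cmd in training_plan(TARGETS):
    # Dry run: swap print for subprocess.run(cmd, check=True) to execute.
    print(shlex.join(cmd))
```

Keeping each command as an argument list avoids shell quoting problems if a model name contains unusual characters.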

`build-templates` — Build Family Template Classifier

Compute per-family mean vectors from training fingerprints for the open-set template classifier. Run after train.

llm-fingerprinter build-templates

The template classifier uses cosine distance to nearest mean — it doesn't require retraining when adding new families.
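A nearest-mean classifier of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: the function names, the 3-dimensional toy vectors, and the `max_distance` cutoff used for the open-set "unknown" answer are all hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def build_templates(fingerprints_by_family):
    """One mean vector (template) per family."""
    return {fam: mean_vector(vecs) for fam, vecs in fingerprints_by_family.items()}


def classify(vector, templates, max_distance=0.3):
    """Nearest template by cosine distance; "unknown" if none is close enough.

    max_distance is an assumed open-set cutoff, not a documented default.
    """
    family, dist = min(
        ((fam, cosine_distance(vector, tmpl)) for fam, tmpl in templates.items()),
        key=lambda pair: pair[1],
    )
    return (family, dist) if dist <= max_distance else ("unknown", dist)


# Toy fingerprints (made-up values).
templates = build_templates({
    "gpt": [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0]],
    "llama": [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]],
})
print(classify([0.8, 0.1, 0.1], templates))

# Adding a family only adds one more mean vector; existing templates are untouched.
templates["mistral"] = mean_vector([[0.0, 0.1, 1.0], [0.1, 0.0, 0.9]])
print(classify([0.05, 0.05, 1.0], templates))
```

Because each family is represented by a single mean vector, a new family is one more dictionary entry, which is why no retraining is needed.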

`build-model-templates` — Build Model-Level Templates

Build templates at the specific model version level (e.g. gpt-4o-mini vs gpt-4.1) for two-stage identification.

llm-fingerprinter build-model-templates

Requires fingerprints that contain model_name in their metadata (all fingerprints generated with simulate on this version do).

`add-family` — Add a New Family Without Retraining

Add a new model family to the template classifier from a few fingerprint samples, without retraining the full ensemble.

llm-fingerprinter add-family --model deepseek-chat --family deepseek --num-fps 3 -b deepseek

Recommended minimum: 3 fingerprints for a reliable mean template.

Environment Variables

Variable	Backend	Description
`OLLAMA_CLOUD_API_KEY`	ollama-cloud	Ollama Cloud API key
`OPENAI_API_KEY`	openai	OpenAI API key
`GEMINI_API_KEY`	gemini	Gemini API key
`DEEPSEEK_API_KEY`	deepseek	DeepSeek API key
`LOG_LEVEL`	all	Logging level (`DEBUG`, `INFO`, `WARNING`)
`LLM_FINGERPRINTER_DATA`	all	Override data directory (fingerprints, model, logs)
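Before a long run it can help to confirm that the variables in the table are set. The sketch below is a hypothetical preflight check, not part of the package: `missing_key` and `preflight` are names introduced here, and a value of `None` simply means the variable is unset and the package falls back to whatever default it uses.

```python
import os

# Backend -> API key variable, taken from the table above.
API_KEY_VARS = {
    "ollama-cloud": "OLLAMA_CLOUD_API_KEY",
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
}


def missing_key(backend, env=os.environ):
    """Return the name of the unset API key variable for a backend, or None."""
    var = API_KEY_VARS.get(backend)  # None for backends without a listed key, e.g. ollama
    if var is None or env.get(var):
        return None
    return var


def preflight(backend, env=os.environ):
    """Summarise the settings a run would pick up from the environment."""
    return {
        "backend": backend,
        "missing_key": missing_key(backend, env),
        "log_level": env.get("LOG_LEVEL"),              # None: package default applies
        "data_dir": env.get("LLM_FINGERPRINTER_DATA"),  # None: package default applies
    }


# Explicit dicts stand in for the real environment in this demo.
print(preflight("openai", {"OPENAI_API_KEY": "your-key", "LOG_LEVEL": "DEBUG"}))
print(preflight("gemini", {}))
```

Passing the environment as a parameter keeps the check easy to test without touching real credentials.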

License

MIT License

Release history

- 0.4.1 (Jul 1, 2026), this version
- 0.4.0 (Jun 30, 2026)
- 0.2.0 (Feb 19, 2026)
- 0.1.0 (Feb 19, 2026)

Download files

Source Distribution

llm_fingerprinter-0.4.0.tar.gz (2.6 MB)

- Upload date: Jun 30, 2026
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6

Hashes for llm_fingerprinter-0.4.0.tar.gz
Algorithm	Hash digest
SHA256	`8a2b68c296381895f67f8fcdc8c0a036cbd1fe632cc8e26016c4905a5dca7aa2`
MD5	`f0facacb53449330f600c78c1e0336a1`
BLAKE2b-256	`8bd7a6b1c8c6f8391a0eaff1ba36f5a51c1ee448c229b9487ca5e658c710ca98`

Built Distribution

llm_fingerprinter-0.4.0-py3-none-any.whl (2.6 MB)

- Upload date: Jun 30, 2026
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6

Hashes for llm_fingerprinter-0.4.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1963198ad6a9399032d028560876baf66e14c71062a4429591e79a34fbf45749`
MD5	`a8a31368bee1f695ba7d46b23d2ff799`
BLAKE2b-256	`0f966208d16f783df08d3f9511b366deaeafb7d9bab1587b662955952fb23484`
