
LLM distillation detection and model fingerprint audit tool - text source detection, model identity verification, and distillation analysis


ModelAudit

LLM distillation detection & model fingerprinting: text provenance, identity verification, distillation auditing

PyPI Python 3.10+ License: MIT MCP

Quick Start · Detection Methods · MCP Server · Data Pipeline Ecosystem


GitHub Topics: model-fingerprint, llm-distillation, model-audit, cli, mcp, ai-data-pipeline

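Detect the source of text data, verify the identity of the model behind an API, and audit distillation relationships between models. Black-box first, annotator-friendly.

Core Capabilities

Text/model → probe prompts → response feature extraction → fingerprint comparison → audit report

Sample Dashboard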

┌────────────────────────────────────────────────────────┐
│  Model Distillation Audit Report                       │
├─────────────────┬─────────────────┬────────────────────┤
│ Teacher: gpt-4o │ Student: my-llm │ Similarity: 0.9213 │
├─────────────────┴─────────────────┴────────────────────┤
│ ⚠️  Verdict: possible distillation relationship        │
│ 📊 Confidence: 87.5%                                   │
│ 🔍 Style match: helpful 0.82 / hedging 0.79           │
└────────────────────────────────────────────────────────┘

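Features

| Feature | Description |
|---|---|
| 🔍 Text source detection | Identify which LLM generated a batch of texts |
| Model identity verification | Verify that the model behind an API is the one it claims to be |
| 🔗 Model fingerprint comparison | Compare the behavioral similarity of two models |
| 📋 Distillation audit report | Combine the analyses into a Markdown / JSON report |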

Installation

pip install knowlyr-modelaudit

Optional dependencies:

pip install knowlyr-modelaudit[blackbox]   # black-box fingerprinting (openai, anthropic, httpx)
pip install knowlyr-modelaudit[whitebox]   # white-box fingerprinting (torch, transformers)
pip install knowlyr-modelaudit[mcp]        # MCP server
pip install knowlyr-modelaudit[all]        # all features

Quick Start

Detect Text Source (CLI)

# Detect which model generated the text data
knowlyr-modelaudit detect texts.jsonl

# Limit the number of texts and write JSON output
knowlyr-modelaudit detect texts.jsonl -n 50 -f json -o result.json

Example output (labels shown in English)

Analyzing 3 texts...

  ID | Model      | Confidence | Preview
------------------------------------------------------------
   1 |    chatgpt |     72.50% | Certainly! I'd be happy to...
   2 |    chatgpt |     65.00% | I think that's an interest...
   3 |    chatgpt |     70.00% | Sure thing! No problem at ...

Source distribution:
  chatgpt: 3 (100.0%)
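
The source-distribution summary is a tally of the per-text predictions. A minimal sketch of that tally, assuming predictions arrive as hypothetical `(model, confidence)` pairs; the tool's real output schema is not documented here, and `source_distribution` is an illustrative name, not part of the package:

```python
from collections import Counter

def source_distribution(predictions):
    """Return {model: (count, share)} for a list of (model, confidence) pairs.

    The input shape is an assumption for illustration, not the tool's schema.
    """
    counts = Counter(model for model, _confidence in predictions)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders models from most to least frequent
    return {model: (count, count / total) for model, count in counts.most_common()}

preds = [("chatgpt", 0.725), ("chatgpt", 0.65), ("chatgpt", 0.70)]
for model, (count, share) in source_distribution(preds).items():
    print(f"{model}: {count} ({share:.1%})")  # prints "chatgpt: 3 (100.0%)"
```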

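Verify Model Identity

# Verify that the API is really serving the claimed GPT-4o
knowlyr-modelaudit verify gpt-4o --provider openai

# Custom API endpoint
knowlyr-modelaudit verify my-model --provider custom --api-base http://localhost:8000

Compare Model Fingerprints

# Compare two models to check for a distillation relationship
knowlyr-modelaudit compare gpt-4o claude-sonnet --provider openai

Full Distillation Audit

# Generate an audit report
knowlyr-modelaudit audit --teacher gpt-4o --student my-model -o report.md

Example output (labels shown in English)

Auditing: gpt-4o → my-model...

Verdict: ⚠️  Possible distillation relationship
Confidence: 0.8750

The behavior patterns of teacher model gpt-4o and student model my-model are highly similar;
a distillation relationship may exist. Confidence: 87.50%

Report saved: report.md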

Python SDK

from modelaudit import AuditEngine

engine = AuditEngine()

# Detect the source of a list of texts
results = engine.detect(["Hello! I'd be happy to help..."])
for r in results:
    print(f"{r.predicted_model}: {r.confidence:.2%}")

# Compare model fingerprints (requires an API key)
result = engine.compare("gpt-4o", "my-model", method="llmmap")
print(f"Similarity: {result.similarity:.4f}")
print(f"Distillation relationship: {'yes' if result.is_derived else 'no'}")

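Detection Methods

Implemented

| Method | Type | Description | Reference |
|---|---|---|---|
| LLMmap | Black-box | Sends probe prompts and analyzes response patterns | USENIX Security 2025 |
| StyleAnalysis | Style analysis | Word frequency, syntax, and style-marker matching | |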
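
To make the idea of style-marker matching concrete, here is a minimal sketch that counts marker phrases per word and compares two texts by cosine similarity. The marker lists and the function names `style_vector` and `cosine` are hypothetical and chosen for illustration; they are not the lexicon or API used by StyleAnalysis.

```python
import math
import re

# Hypothetical marker lists, for illustration only (not the tool's lexicon).
STYLE_MARKERS = {
    "helpful": ["happy to help", "certainly", "sure thing"],
    "hedging": ["i think", "it seems", "might", "perhaps"],
}

def style_vector(text):
    """Occurrences of each marker family divided by the word count."""
    lowered = text.lower()
    n_words = max(len(re.findall(r"\w+", lowered)), 1)  # avoid dividing by zero
    return [
        sum(lowered.count(phrase) for phrase in phrases) / n_words
        for phrases in STYLE_MARKERS.values()
    ]

def cosine(a, b):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

a = style_vector("Certainly! I'd be happy to help. I think it might work.")
b = style_vector("Sure thing! Perhaps that is the cause.")
print(f"style similarity: {cosine(a, b):.3f}")
```

Substring counting is deliberately crude here; a real implementation would likely tokenize first and combine these rates with word-frequency and syntactic features.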

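Planned

| Method | Type | Description | Reference |
|---|---|---|---|
| REEF | White-box | CKA-based hidden-layer similarity comparison | ICLR 2025 Oral |
| DLI | Distillation detection | Shadow models + behavioral signatures | ICLR 2026 |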
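
For readers unfamiliar with CKA (centered kernel alignment), the sketch below computes linear CKA between two activation matrices whose rows are the same inputs and whose columns are hidden units. This is one common CKA variant written in plain Python; whether the planned REEF integration uses this exact variant is an assumption, and `linear_cka` is an illustrative name.

```python
import math

def _center(mat):
    """Subtract each column's mean (rows = samples, columns = features)."""
    n = len(mat)
    means = [sum(col) / n for col in zip(*mat)]
    return [[v - m for v, m in zip(row, means)] for row in mat]

def _cross_sq_norm(x, y):
    """Squared Frobenius norm of x^T y, summed over all column pairs."""
    total = 0.0
    for x_col in zip(*x):
        for y_col in zip(*y):
            total += sum(a * b for a, b in zip(x_col, y_col)) ** 2
    return total

def linear_cka(x, y):
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered data."""
    x, y = _center(x), _center(y)
    denom = math.sqrt(_cross_sq_norm(x, x) * _cross_sq_norm(y, y))
    return _cross_sq_norm(x, y) / denom if denom else 0.0

acts_a = [[1.0, 2.0], [3.0, 4.0], [5.0, 7.0]]
acts_b = [[2.0 * v for v in row] for row in acts_a]  # same features, rescaled
print(f"CKA: {linear_cka(acts_a, acts_b):.4f}")
```

Linear CKA is invariant to isotropic rescaling and orthogonal transformations of the features, which is why it is a natural fit for comparing hidden layers of models with different parameterizations.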

List Available Methods

knowlyr-modelaudit methods

MCP Server

Use it directly from Claude Desktop / Claude Code.

Configuration

Add the following to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "knowlyr-modelaudit": {
      "command": "uv",
      "args": ["--directory", "/path/to/model-audit", "run", "python", "-m", "modelaudit.mcp_server"]
    }
  }
}
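
If the config file already lists other MCP servers, the new entry should be merged in rather than pasted over them. A minimal sketch of such a merge, assuming a hypothetical helper named `add_mcp_server`; it is not part of the package, and the demo writes to a temporary directory instead of the real Claude Desktop config:

```python
import json
import tempfile
from pathlib import Path

def add_mcp_server(config_path, name, command, args):
    """Insert or update one entry under "mcpServers", keeping existing ones."""
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})[name] = {"command": command, "args": list(args)}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config

# Demo against a throwaway file that already contains another server entry.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude_desktop_config.json"
    demo_path.write_text('{"mcpServers": {"other": {"command": "x", "args": []}}}',
                         encoding="utf-8")
    merged = add_mcp_server(
        demo_path,
        "knowlyr-modelaudit",
        "uv",
        ["--directory", "/path/to/model-audit", "run", "python", "-m",
         "modelaudit.mcp_server"],
    )
    print(sorted(merged["mcpServers"]))  # ['knowlyr-modelaudit', 'other']
```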

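Available Tools

| Tool | Function |
|---|---|
| detect_text_source | Detect the source of text data |
| verify_model | Verify model identity |
| compare_models | Compare the fingerprints of two models |
| audit_distillation | Full distillation audit |

Usage Example

User: Help me detect which model generated this batch of texts

Claude: [calls detect_text_source]

        ## Text Source Detection Results

        | # | Predicted model | Confidence | Preview |
        |---|-----------------|------------|---------|
        | 1 | chatgpt | 72.50% | Certainly! I'd be happy... |

        ### Source Distribution
        - chatgpt: 3 (100.0%)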

Data Pipeline Ecosystem

ModelAudit is the model quality-assurance component of the Data Pipeline ecosystem:

graph LR
    Radar["🔍 Radar<br/>Discovery"] --> Recipe["📋 Recipe<br/>Reverse analysis"]
    Recipe --> Synth["🔄 Synth<br/>Data synthesis"]
    Recipe --> Label["🏷️ Label<br/>Data labeling"]
    Synth --> Check["✅ Check<br/>Data QA"]
    Label --> Check
    Check --> Audit["🔬 Audit<br/>Model audit"]
    Audit --> Hub["🎯 Hub<br/>Orchestration"]
    Hub --> Sandbox["📦 Sandbox<br/>Execution sandbox"]
    Sandbox --> Recorder["📹 Recorder<br/>Trajectory recording"]
    Recorder --> Reward["⭐ Reward<br/>Process scoring"]
    style Audit fill:#0969da,color:#fff,stroke:#0969da

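Ecosystem Projects

| Layer | Project | Description | Repository |
|---|---|---|---|
| Intelligence | AI Dataset Radar | Dataset competitive intelligence, trend analysis | GitHub |
| Analysis | DataRecipe | Reverse analysis, schema extraction, cost estimation | GitHub |
| Production | DataSynth | LLM batch synthesis, seed data expansion | GitHub |
| Production | DataLabel | Lightweight labeling tool, multi-annotator merging | GitHub |
| Quality | DataCheck | Rule validation, duplicate detection, distribution analysis | GitHub |
| Quality | ModelAudit | Distillation detection, model fingerprinting, identity verification | You are here |
| Agent | AgentSandbox | Docker execution sandbox, trajectory replay | GitHub |
| Agent | AgentRecorder | Standardized trajectory recording, multi-framework adapters | GitHub |
| Agent | AgentReward | Process-level reward, multi-dimensional rubric evaluation | GitHub |
| Orchestration | TrajectoryHub | Pipeline orchestration, dataset export | GitHub |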

End-to-End Workflow

# 1. DataRecipe: analyze the dataset and generate a schema and samples
knowlyr-datarecipe deep-analyze tencent/CL-bench -o ./output

# 2. DataSynth: batch-synthesize data from the seed data
knowlyr-datasynth generate ./output/tencent_CL-bench/ -n 1000

# 3. DataCheck: check data quality
knowlyr-datacheck validate ./output/tencent_CL-bench/

# 4. ModelAudit: detect the source of the synthetic data and verify model identity
knowlyr-modelaudit detect ./output/synthetic.jsonl
knowlyr-modelaudit verify gpt-4o --provider openai
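
The same steps can be scripted so that a failure in one stage stops the pipeline. The sketch below only rebuilds the command lines shown above and, by default, prints them without running anything. The directory naming (`tencent/CL-bench` becoming `tencent_CL-bench`) is inferred from the example paths and is an assumption, as are the helper names `workflow_commands` and `run_workflow`.

```python
import shlex
import subprocess

def workflow_commands(dataset, out_dir, n=1000):
    """Build argv lists for the workflow steps, using only flags shown above."""
    # Assumption: DataRecipe writes to <out_dir>/<dataset with "/" replaced by "_">/
    work_dir = f"{out_dir}/{dataset.replace('/', '_')}/"
    return [
        ["knowlyr-datarecipe", "deep-analyze", dataset, "-o", out_dir],
        ["knowlyr-datasynth", "generate", work_dir, "-n", str(n)],
        ["knowlyr-datacheck", "validate", work_dir],
        ["knowlyr-modelaudit", "detect", f"{out_dir}/synthetic.jsonl"],
        ["knowlyr-modelaudit", "verify", "gpt-4o", "--provider", "openai"],
    ]

def run_workflow(commands, dry_run=True):
    """Print each command; execute it too when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # raises on the first failing step

cmds = workflow_commands("tencent/CL-bench", "./output")
run_workflow(cmds)  # dry run: prints the commands without executing them
```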

Combined MCP Configuration

{
  "mcpServers": {
    "knowlyr-datarecipe": {
      "command": "uv",
      "args": ["--directory", "/path/to/data-recipe", "run", "knowlyr-datarecipe-mcp"]
    },
    "knowlyr-datacheck": {
      "command": "uv",
      "args": ["--directory", "/path/to/data-check", "run", "python", "-m", "datacheck.mcp_server"]
    },
    "knowlyr-modelaudit": {
      "command": "uv",
      "args": ["--directory", "/path/to/model-audit", "run", "python", "-m", "modelaudit.mcp_server"]
    }
  }
}

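Command Reference

| Command | Function |
|---|---|
| knowlyr-modelaudit detect <file> | Detect the source of text data |
| knowlyr-modelaudit detect <file> -n 50 | Limit the number of texts checked |
| knowlyr-modelaudit verify <model> | Verify model identity |
| knowlyr-modelaudit compare <a> <b> | Compare the fingerprints of two models |
| knowlyr-modelaudit audit --teacher <a> --student <b> | Full distillation audit |
| knowlyr-modelaudit methods | List available detection methods |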

API Usage

from modelaudit import AuditEngine, Fingerprint, ComparisonResult

# Create the engine
engine = AuditEngine()

# Detect the source of each text (texts is a list of strings)
results = engine.detect(texts)
for r in results:
    print(f"#{r.text_id} {r.predicted_model} ({r.confidence:.2%})")

# Compare fingerprints (requires an API key)
result = engine.compare("gpt-4o", "my-model", method="llmmap")
print(f"Similarity: {result.similarity:.4f}")

# Full audit
audit = engine.audit("gpt-4o", "my-model")
print(audit.verdict)       # likely_derived / independent / inconclusive
print(audit.confidence)    # 0.875

# Generate the report
from modelaudit.report import generate_report
report = generate_report(audit, "markdown")
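
The three verdict values suggest a thresholded decision on the similarity score. The sketch below shows one way such a rule could look. The cut-offs 0.85 and 0.60 and the function name `verdict_from_similarity` are assumptions chosen for illustration; the library's actual decision rule is not documented here and may use different thresholds or additional signals.

```python
def verdict_from_similarity(similarity, derived_at=0.85, independent_below=0.60):
    """Map a similarity in [0, 1] to one of the three verdict strings.

    Thresholds are hypothetical defaults, not values taken from modelaudit.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be between 0 and 1")
    if similarity >= derived_at:
        return "likely_derived"
    if similarity < independent_below:
        return "independent"
    return "inconclusive"  # the band between the two thresholds

for score in (0.9213, 0.70, 0.30):
    print(f"{score:.4f} -> {verdict_from_similarity(score)}")
```

Keeping an explicit "inconclusive" band avoids forcing a yes/no answer on borderline scores, which matters when a report may be used as evidence.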

Project Architecture

src/modelaudit/
├── engine.py         # AuditEngine main entry point
├── models.py         # Pydantic data models
├── base.py           # Fingerprinter abstract base class
├── registry.py       # Method registry
├── config.py         # Configuration
├── methods/
│   ├── llmmap.py     # LLMmap black-box fingerprinting
│   └── style.py      # Style analysis
├── probes/
│   └── prompts.py    # Probe prompt library
├── report.py         # Report generation
├── cli.py            # CLI (5 commands)
└── mcp_server.py     # MCP Server (4 tools)

License

MIT


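AI Data Pipeline Ecosystem

Ten tools cover the full AI data engineering workflow. Each supports both CLI and MCP and can be used on its own or combined into a pipeline.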

| Tool | Description | Link |
|---|---|---|
| AI Dataset Radar | Competitive intelligence for AI training datasets | GitHub |
| DataRecipe | Reverse-engineer datasets into annotation specs & cost models | GitHub |
| DataSynth | Seed-to-scale synthetic data generation | GitHub |
| DataLabel | Lightweight, serverless HTML labeling tool | GitHub |
| DataCheck | Automated quality checks & anomaly detection | GitHub |
| ModelAudit | LLM distillation detection & model fingerprinting | You are here |
| AgentSandbox | Reproducible Docker sandbox for Code Agent execution | GitHub |
| AgentRecorder | Standardized trajectory recording for Code Agents | GitHub |
| AgentReward | Process-level rubric-based reward engine | GitHub |
| TrajectoryHub | Pipeline orchestrator for Agent trajectory data | GitHub |

graph LR
    A[Radar] --> B[Recipe] --> C[Synth] --> E[Check] --> F[Audit] --> G[Hub]
    B --> D[Label] --> E
    G --> H[Sandbox] --> I[Recorder] --> J[Reward]

Model quality assurance and distillation auditing for data teams.
