A zero-config, local-first privacy layer for AI APIs with semantic-preserving de-identification.
YinShield
YinShield is a local-first privacy layer for LLM workflows.
The current release ships as:
- PyPI package: yinshield
- Local HTTP service: yinshield serve
- OpenClaw thin plugin: @serein-213/openclaw-yinshield
Status
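- Recommended release positioning: 0.1.0 alpha
- Intended use: local single-user privacy layer, developer integration validation, OpenClaw integration trials
- Most stable mode today: mode="placeholder"
- Still being refined: recovery rate and false-positive control of mode="alias" on more realistic English text distributions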
What Works Now
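- Chinese and English PII masking: Chinese names, English names, mobile numbers, US phone, Chinese national ID numbers, SSN, email, WeChat IDs, bank cards, bank account numbers, account-opening bank, landlines, license plates, passports, Unified Social Credit Codes, tax IDs, company names, addresses, birthdays, DOB, IP, VIN, EIN, medical record numbers (病历号), MRN, order numbers, courier waybill numbers (快递单号), tracking numbers, customer IDs, membership IDs, contract numbers
- Two replacement modes:
  - mode="placeholder": 张三 -> <PERSON_1>
  - mode="alias": 张三 -> 陈明
- Three strategies:
  - loose: only high-confidence entities are processed
  - balanced: the default, suited to general conversation and customer-service text
  - strict: covers more contextual entities and business identifiers
- Session consistency: the same entity keeps the same replacement across turns, and the session can be persisted to a file
- OpenAI-compatible integration:
  - ShieldedOpenAI
  - ShieldedAsyncOpenAI
  - chat.completions
  - responses
  - stream=True
  - base_url=...
- Local HTTP service:
  - POST /health
  - POST /mask
  - POST /unmask
  - POST /messages/mask
- OpenClaw integration:
  - yinshield_mask
  - yinshield_unmask
  - yinshield_shield_messages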
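To make the placeholder mode and its round trip concrete, here is a minimal standard-library sketch. It is not YinShield's implementation: `toy_mask`, `toy_unmask`, and the two regular expressions are hypothetical names and rules invented for illustration, and they only catch mainland-China mobile numbers and simple ASCII email addresses (names such as 张三 need the real detector).

```python
import re

# Hypothetical, deliberately narrow rules; the real rule set is far broader.
PATTERNS = {
    "PHONE": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"),
}


def toy_mask(text, mapping=None):
    """Replace matches with numbered placeholders; return (masked_text, mapping)."""
    mapping = dict(mapping or {})  # placeholder -> original value
    reverse = {original: ph for ph, original in mapping.items()}

    for label, pattern in PATTERNS.items():
        def replace(match, label=label):
            value = match.group(0)
            if value not in reverse:
                # Next free index for this label, so repeats reuse the same placeholder.
                index = sum(1 for ph in mapping if ph.startswith(f"<{label}_")) + 1
                placeholder = f"<{label}_{index}>"
                mapping[placeholder] = value
                reverse[value] = placeholder
            return reverse[value]

        text = pattern.sub(replace, text)
    return text, mapping


def toy_unmask(text, mapping):
    """Restore original values by substituting each placeholder back."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


raw = "手机号13812345678,备用13912345678,再次确认13812345678,邮箱zhangsan@example.com"
masked, mapping = toy_mask(raw)
print(masked)
print(toy_unmask(masked, mapping) == raw)
```

The repeated number maps to the same placeholder, which is the property the session-consistency bullet describes. Passing the returned `mapping` back into a later `toy_mask` call keeps the numbering stable across turns.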
Installation
pip install yinshield
For local release validation:
python -m unittest discover -s tests -v
python benchmarks/run_benchmark.py --dataset benchmarks/mini_realistic_dataset.json --mode placeholder --strategy strict --output benchmarks/mini_realistic_results.placeholder.json
python benchmarks/run_benchmark.py --dataset benchmarks/mini_realistic_dataset.json --mode alias --strategy strict --output benchmarks/mini_realistic_results.alias.json
node --check openclaw-plugin/src/index.js
python -m build
Release
Prepare the next release version:
python scripts/sync_release_version.py 0.1.0
python scripts/check_version_consistency.py
Full release steps are documented in RELEASE.md.
Alpha release notes:
Quick Start For OpenClaw
pip install yinshield
python -m yinshield.install_openclaw
openclaw plugins install @serein-213/openclaw-yinshield
openclaw plugins enable openclaw-yinshield
yinshield serve
python -m yinshield.install_openclaw will:
- generate an auth token
- scaffold the OpenClaw plugin config
- print the exact yinshield serve --auth-token ... command to run
Installed CLI alias:
yinshield-install-openclaw
Shell bootstrap for users who prefer a one-shot script:
bash scripts/setup-openclaw-yinshield.sh
If you later host this script, the curl-style entry can be:
curl -fsSL https://your-domain/setup-openclaw-yinshield.sh | bash
OpenClaw plugin config:
{
  "plugins": {
    "entries": {
      "openclaw-yinshield": {
        "enabled": true,
        "config": {
          "baseUrl": "http://127.0.0.1:27811",
          "mode": "placeholder",
          "authToken": "change-me"
        }
      }
    }
  }
}
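If you would rather build this config by hand than leave "change-me" in place, the following sketch generates a random token and prints a config with the same shape as the one above. It is an assumption-laden illustration, not what python -m yinshield.install_openclaw does internally: `build_openclaw_config` is a hypothetical helper, and where OpenClaw expects the file to live is not covered here.

```python
import json
import secrets


def build_openclaw_config(base_url="http://127.0.0.1:27811",
                          mode="placeholder",
                          auth_token=None):
    """Return a plugin config dict shaped like the example above."""
    # Hypothetical choice: a URL-safe random token from the standard library.
    token = auth_token or secrets.token_urlsafe(32)
    return {
        "plugins": {
            "entries": {
                "openclaw-yinshield": {
                    "enabled": True,
                    "config": {
                        "baseUrl": base_url,
                        "mode": mode,
                        "authToken": token,
                    },
                }
            }
        }
    }


config = build_openclaw_config()
print(json.dumps(config, indent=2))
```

The same token then has to be passed to yinshield serve --auth-token so that the plugin and the local service agree.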
Basic Usage
from yinshield import Shield, ShieldSession
shield = Shield(
    mode="placeholder",  # or "alias"
    strategy="balanced",  # loose | balanced | strict
)
session = ShieldSession()
raw_text = "收件人:张三,手机号13812345678,收货地址:北京市朝阳区建国路88号。"
masked_text, mapping = shield.mask(raw_text, session=session)
print(masked_text)
restored = shield.unmask(masked_text, session=session)
print(restored)
Session Persistence
from yinshield import Shield
shield = Shield(mode="alias", strategy="strict")
shield.mask("联系人:王小明,手机号13812345678。")
shield.save_session("yinshield-session.json")
another = Shield(mode="alias", strategy="strict")
another.load_session("yinshield-session.json")
masked, _ = another.mask("请再次联系王小明,手机号13812345678。")
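The idea behind session persistence can be sketched without the library: keep a mapping from original values to replacements, write it to JSON, and reload it in a fresh process. Everything below is hypothetical (`ToyAliasSession`, its alias pool, and the file layout); the actual contents of yinshield-session.json are not documented here.

```python
import json
import tempfile
from pathlib import Path


class ToyAliasSession:
    """Toy stand-in for a persisted alias table (not YinShield's format)."""

    POOL = ["陈明", "李华", "赵磊"]  # hypothetical alias pool

    def __init__(self, aliases=None):
        self.aliases = dict(aliases or {})  # original -> alias

    def alias_for(self, original):
        # Assign the next unused alias once, then always return the same one.
        if original not in self.aliases:
            n = len(self.aliases)
            self.aliases[original] = self.POOL[n] if n < len(self.POOL) else f"Person{n + 1}"
        return self.aliases[original]

    def save(self, path):
        Path(path).write_text(json.dumps(self.aliases, ensure_ascii=False), encoding="utf-8")

    @classmethod
    def load(cls, path):
        return cls(json.loads(Path(path).read_text(encoding="utf-8")))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "toy-session.json"
    first = ToyAliasSession()
    alias = first.alias_for("王小明")
    first.save(path)

    second = ToyAliasSession.load(path)
    print(second.alias_for("王小明") == alias)  # True
```

Because the reloaded table already contains 王小明, the second instance returns the same alias instead of drawing a new one, which is what keeps multi-turn conversations coherent.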
Local HTTP Service
Start the bridge:
yinshield serve
Default bind:
- host: 127.0.0.1
- port: 27811
Custom bind:
yinshield serve --host 127.0.0.1 --port 27811 --mode placeholder --strategy balanced --auth-token change-me
HTTP API:
POST /health
{}
POST /mask
{
  "text": "我叫张三,手机号13812345678",
  "mode": "placeholder",
  "session_id": "chat-1"
}
POST /unmask
{
  "text": "我叫<PERSON_1>,手机号<PHONE_1>",
  "mapping": {
    "<PERSON_1>": "张三",
    "<PHONE_1>": "13812345678"
  }
}
POST /messages/mask
{
  "messages": [
    { "role": "user", "content": "我叫张三,手机号13812345678" },
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "订单号20240324ABC123" }
      ]
    }
  ],
  "mode": "placeholder",
  "session_id": "chat-1"
}
Notes:
- HTTP service is now stateless by default.
- To reuse aliases/placeholders across turns, pass session_id.
- If --auth-token is omitted, yinshield serve generates a temporary token and prints it.
- To protect the local service, send Authorization: Bearer <token>.
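The request shape and the Authorization header described above can be assembled with the Python standard library alone. This is a sketch under assumptions: `build_request` and `call_service` are hypothetical helper names, the token is the placeholder "change-me" from the examples, and because the response schema is not documented here the sketch simply returns whatever JSON comes back.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:27811"  # default bind from the section above


def build_request(path, payload, token, base_url=BASE_URL):
    """Build a JSON POST request carrying the bearer token."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def call_service(path, payload, token):
    """Send the request to a running `yinshield serve` and parse the JSON reply."""
    with urllib.request.urlopen(build_request(path, payload, token), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network, so it is safe to inspect.
request = build_request(
    "/mask",
    {"text": "我叫张三,手机号13812345678", "mode": "placeholder", "session_id": "chat-1"},
    token="change-me",
)
print(request.get_method(), request.full_url)
```

With the service running, `call_service("/mask", {...}, token)` would perform the actual call; reusing the same session_id on later calls is what keeps replacements consistent across turns.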
OpenAI-Compatible Wrapper
from yinshield import ShieldedOpenAI
client = ShieldedOpenAI(
    api_key="YOUR_OPENAI_API_KEY",
    base_url="https://api.openai.com/v1",  # DeepSeek / OpenAI-compatible providers also work
)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "我叫张三,手机号是13812345678"}
    ],
)
print(response.choices[0].message.content)
# The request is masked automatically before it is sent, and the response is unmasked automatically
Current wrapper coverage:
- client.chat.completions.create(...)
- client.chat.completions.create(..., stream=True)
- client.responses.create(...)
- client.responses.create(..., stream=True)
- await async_client.chat.completions.create(...)
- await async_client.responses.create(...)
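Streaming adds one wrinkle worth illustrating: a placeholder such as <PERSON_1> may arrive split across two chunks, so a naive per-chunk replace would miss it. The sketch below shows one possible way to handle that by holding back any unfinished `<...` tail. It is an assumption about the general technique, not a description of how the wrapper does it, and `unmask_stream` is a hypothetical name.

```python
def unmask_stream(chunks, mapping):
    """Yield unmasked text, buffering any placeholder that is still incomplete."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # If the last "<" has no closing ">", it may be a partial placeholder.
        cut = buffer.rfind("<")
        if cut != -1 and ">" not in buffer[cut:]:
            ready, buffer = buffer[:cut], buffer[cut:]
        else:
            ready, buffer = buffer, ""
        for placeholder, original in mapping.items():
            ready = ready.replace(placeholder, original)
        if ready:
            yield ready
    # Flush whatever is left once the stream ends.
    if buffer:
        for placeholder, original in mapping.items():
            buffer = buffer.replace(placeholder, original)
        yield buffer


mapping = {"<PERSON_1>": "张三", "<PHONE_1>": "13812345678"}
chunks = ["Hi <PER", "SON_1>, call <PHONE", "_1> now"]
print("".join(unmask_stream(chunks, mapping)))
```

A trade-off of this approach is that a stray "<" that never closes delays output until the end of the stream; a production version would likely cap how much it holds back.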
Async Wrapper
import asyncio

from yinshield import ShieldedAsyncOpenAI


async def main():
    client = ShieldedAsyncOpenAI(api_key="YOUR_OPENAI_API_KEY")
    response = await client.responses.create(
        model="gpt-4.1-mini",
        input="我叫张三,手机号13812345678",
    )
    print(response.output_text)


asyncio.run(main())
CLI
python -m yinshield --mode alias --strategy strict --session-file .yinshield.json \
"收件人:张三,手机号13812345678,订单号20240324ABC123"
Run local service:
yinshield serve --session-file .yinshield-http-session.json
OpenClaw Installer
python -m yinshield.install_openclaw
Equivalent installed command:
yinshield-install-openclaw
Preview without writing files:
python -m yinshield.install_openclaw --print-only
Benchmark
Local benchmark script:
python benchmarks/run_benchmark.py --mode placeholder --strategy strict
python benchmarks/run_benchmark.py --mode alias --strategy strict
python benchmarks/run_benchmark.py --dataset benchmarks/mini_realistic_dataset.json --mode placeholder --strategy strict --output benchmarks/mini_realistic_results.placeholder.json
python benchmarks/run_benchmark.py --dataset benchmarks/mini_realistic_dataset.json --mode alias --strategy strict --output benchmarks/mini_realistic_results.alias.json
Current sample-set results:
- placeholder + strict: precision=1.0 recall=1.0 false_positive_rate=0.0 recovery_rate=1.0 semantic_proxy=0.3662
- alias + strict: precision=1.0 recall=1.0 false_positive_rate=0.0 recovery_rate=1.0 semantic_proxy=0.8182
Mini realistic-set results:
- placeholder + strict: precision=0.9765 recall=0.9765 false_positive_rate=0.0645 recovery_rate=1.0 semantic_proxy=0.321
- alias + strict: precision=0.954 recall=0.9765 false_positive_rate=0.129 recovery_rate=0.9032 semantic_proxy=0.75
The current sample set includes:
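- Chinese identity fields and business identifiers
- English names, US phone, SSN, DOB, EIN, MRN, tracking number
- Mixed Chinese-English names and addresses
- English address variants with Apt/Unit/Suite
- Negative samples to check for false positives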
The mini realistic set adds:
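- 30 small evaluation samples that are closer to a realistic distribution
- Chinese customer service / finance / healthcare / logistics
- English customer records / compliance / healthcare / logistics
- Mixed Chinese-English text
- Stricter negative samples and recovery-rate checks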
semantic_proxy is only a local format-preservation heuristic, not a downstream LLM task benchmark.
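For readers unfamiliar with the first two metrics, the sketch below computes precision and recall from entity counts using their standard definitions. The counts are illustrative only: the benchmark's real counts are not given here, and benchmarks/run_benchmark.py may define or aggregate its metrics differently (false_positive_rate, recovery_rate, and semantic_proxy in particular are not reproduced).

```python
def precision(true_positives, false_positives):
    """Share of masked spans that were real entities."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


def recall(true_positives, false_negatives):
    """Share of real entities that were actually masked."""
    total = true_positives + false_negatives
    return true_positives / total if total else 0.0


# Hypothetical counts: 83 correct detections, 4 spurious ones, 2 missed entities.
tp, fp, fn = 83, 4, 2
print(round(precision(tp, fp), 4))  # 0.954
print(round(recall(tp, fn), 4))     # 0.9765
```

Lower precision means more harmless text was masked by mistake, while lower recall means sensitive values slipped through; for a privacy layer the second failure is usually the more costly one.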
Coverage Audit
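Current rule coverage is closer to "masking of high-frequency explicit fields in Chinese and English business text, plus weak-semantic context recognition" than to general-purpose semantic NER.
Supported:
- Basic identity: Chinese names, English names, mobile numbers, US phone, Chinese national ID numbers, SSN, birthdays, DOB, email, WeChat IDs
- Address and location: Chinese residential address variants, English street addresses, English addresses with Apt/Unit/Suite
- Company and finance: company names, Unified Social Credit Codes, tax IDs, EIN, bank cards, bank account numbers, account-opening bank
- Transport and devices: license plates, passports, VIN, IPv4 addresses
- Medical and business identifiers: medical record numbers (病历号), MRN, order numbers, courier waybill numbers (快递单号), tracking numbers, customer IDs, membership IDs, contract numbers
Partially supported:
- Chinese names: strong when preceded by cues such as 我叫 / 联系人 / 收件人 / 患者 ("my name is" / "contact" / "recipient" / "patient"), still limited for names inside free-form narrative sentences
- Chinese addresses: strong for the province/city/district/road/number (省市区路号) format, still limited for colloquial forms, campus or building abbreviations, and short addresses without an administrative-region prefix
- English names and company names: fairly stable on explicit fields and some natural sentence patterns, still limited on complex long sentences, abbreviations, and cross-sentence references
- alias mode: on more realistic English company names and English addresses, recovery rate and false-positive rate are still worse than placeholder
- Company information: company names and Unified Social Credit Code / EIN are fairly stable, but legal representative, account holder name, business license number, and similar fields are not yet covered
Unsupported or still weak:
- MAC addresses / GPS coordinates / precise geolocation
- Invoice numbers, device serial numbers, organization codes, and vehicle fields other than license plates
- True semantic entity recognition, entity disambiguation, and weak-context inference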
Next
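- English entity support
- Automatic startup of the local service from OpenClaw
- More robust context recognition and entity boundaries
- Anthropic / LiteLLM / LangChain integrations
- More realistic semantic evaluation on downstream tasks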
License