Skip to main content

Automatic PII masking for OpenAI and Anthropic SDKs

Project description

Armos

PII never reaches your LLM. One line of code.

Armos wraps the OpenAI and Anthropic SDKs to automatically detect and mask personally identifiable information (PII) before it leaves your server — and restore the real values in the response. Your application code changes by exactly one word.

License: MIT Python 3.9+ PyPI version GitHub Stars


The problem

Every time your application calls an LLM, it sends raw text to a third-party server. If a user's message contains their name, Aadhaar number, email, PAN card, or credit card — that data leaves your infrastructure.

This matters for:

  • Healthcare apps — patient names, dates of birth, medical IDs
  • Fintech apps — PAN, Aadhaar, bank details
  • Customer support tools — names, emails, phone numbers, addresses
  • Any app where users type free text that gets sent to OpenAI or Anthropic

Most teams know this is a risk. Few have time to build a proper masking layer before shipping. Armos is that layer, pre-built.


How it works

How Armos works

Detection runs entirely on your machine. Presidio + spaCy analyse the text locally. No data is sent to any Armos server — there is no Armos server. The vault (token ↔ real value map) lives in your process memory, or optionally in your own Redis instance.


Why Armos over alternatives?

vs. building your own: A custom masking layer takes weeks to build correctly and months to handle edge cases. Armos is a pip install.

vs. LLM Guard: LLM Guard focuses on prompt injection and toxicity — not PII masking. Different problem.

vs. Presidio directly: Presidio detects PII but doesn't handle tokenization, vault management, or SDK integration. Armos wraps all of that.

Indian PII first-class: Aadhaar and PAN detection built in. No competitor handles Indian identifiers reliably.


Quickstart

Install

pip install armos

For Redis-backed persistence across requests:

pip install armos[redis]

Note: On first use, download the spaCy language model:

python -m spacy download en_core_web_lg

OpenAI

# Before
from openai import OpenAI
client = OpenAI()

# After — one import added, one word changed
from openai import OpenAI
from armos import ArmosOpenAI

client = ArmosOpenAI(OpenAI())

# Everything else is identical
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Summarise the case for patient John Smith, Aadhaar 2345 6789 0123"
    }]
)

# Real values are restored in the response automatically
print(response.choices[0].message.content)

Anthropic

from anthropic import Anthropic
from armos import ArmosAnthropic

client = ArmosAnthropic(Anthropic())

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Patient John Smith, DOB 12/04/1982, PAN ABCDE1234F"
    }]
)

print(message.content[0].text)  # real values restored

With Redis (persistent vault across requests)

# Token mappings survive across processes and requests
client = ArmosOpenAI(OpenAI(), store="redis", redis_url="redis://localhost:6379")
client = ArmosAnthropic(Anthropic(), store="redis", redis_url="redis://localhost:6379")

# Custom TTL (default: 24 hours)
client = ArmosOpenAI(OpenAI(), store="redis", redis_url="redis://localhost:6379", vault_ttl=3600)

Standalone (any LLM or framework)

from armos import Armos

guard = Armos()

result = guard.mask("Patient John Smith, Aadhaar 2345 6789 0123, email john@hospital.com")
print(result.text)
# → "Patient [PII:NAME:a1b2c3d4], Aadhaar [PII:AADHAAR:b2c3d4e5], email [PII:EMAIL:e5f6g7h8]"

print(result.has_pii)  # True

restored = guard.demask(result.text)
print(restored)
# → "Patient John Smith, Aadhaar 2345 6789 0123, email john@hospital.com"

What gets detected

Entity Token Example
Person name [PII:NAME:…] John Smith
Email address [PII:EMAIL:…] john@hospital.com
Phone number [PII:PHONE:…] +91 98765 43210
Aadhaar number [PII:AADHAAR:…] 2345 6789 0123
PAN card [PII:PAN:…] ABCDE1234F
Credit / debit card [PII:CARD:…] 4111 1111 1111 1111
IP address [PII:IP:…] 192.168.1.100
API keys & secrets [PII:APIKEY:…] sk-abc123… / AKIA… / ghp_…

Token design

Tokens are deterministic and normalisation-aware:

"john smith"  →  [PII:NAME:a1b2c3d4]  ← stored: "john smith"
"John Smith"  →  [PII:NAME:a1b2c3d4]  ← same token, vault unchanged
"JOHN SMITH"  →  [PII:NAME:a1b2c3d4]  ← same token, vault unchanged

All casing variants of the same name map to one token. The LLM sees one consistent entity across a conversation — not three different people. De-masking restores the first-seen value.


Vault options

Option Default Use when
In-memory Armos() Single request or single process
Redis Armos(store="redis", redis_url="redis://…") Multi-turn conversations, multiple workers, or across requests

In-memory vault is zero configuration and the default. Redis vault persists token mappings so a token created in request 1 can be de-masked in request 5.


Token overhead

Masking replaces PII values with tokens like [PII:NAME:a1b2c3d4]. These are longer than the original values, adding a small number of tokens to each request. Measured with GPT-4 tokenization (cl100k_base):

Entity type Example Original tokens Masked tokens Overhead
NAME John Smith 2 10 +8
EMAIL john@example.com 3 13 +10
AADHAAR 2345 6789 0123 8 13 +5
PAN ABCDE1234F 4 11 +7
PHONE +91 98765 43210 8 12 +4
IP 192.168.1.100 7 11 +4
Average 6 11 +5

In practice: a message with 4 PII entities adds ~20 tokens to the request, plus a one-time 13-token system hint injected when PII is detected. For a typical 200-token prompt this is a ~15% increase — negligible against LLM pricing at scale.


Performance

Detection and masking run entirely in-process with no network calls. Benchmarked on Apple M-series (50 runs, median / p95):

Armos latency benchmark

Text size Memory — p50 Memory — p95 Redis — p50 Redis — p95
Short (~20 tokens) 2.5 ms 2.7 ms 3.6 ms 3.9 ms
Medium (~60 tokens) 6.0 ms 6.4 ms 8.6 ms 9.0 ms
Long (~150 tokens) 13.3 ms 13.9 ms 19.4 ms 20.5 ms

Redis overhead is the localhost round-trip cost (~1–2 ms per vault operation). Both are negligible relative to LLM response times (typically 500 ms–5 s).


v1 limitations

  1. Streaming not supportedstream=True passes through without masking. (v1.1)
  2. Async clients not supportedAsyncOpenAI, AsyncAnthropic pass through without masking. (v1.1)
  3. OpenAI Responses API not interceptedclient.responses.create() passes through. (v1.1)
  4. Embeddings not maskedclient.embeddings.create() sends text as-is. (v1.1)
  5. Indian name accuracyen_core_web_lg is trained on English text; Indian names have lower recall than Western names. Fine-tuning planned for v2.
  6. Casing: first-seen wins — De-masking always restores the first-seen casing of an entity. Use consistent casing in your prompts for exact restoration.
  7. Token length[PII:NAME:a1b2c3d4] is 18 chars vs John (4 chars). Near context-window limits this may push content over. Rare in practice.

Contributing

Armos is open source and MIT licensed. Issues and pull requests welcome.

git clone https://github.com/armos-ai/armos-python
cd armos-python
pip install -e ".[dev,all]"
python -m spacy download en_core_web_lg
pytest tests/ -v

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

armos-0.1.9.tar.gz (547.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

armos-0.1.9-py3-none-any.whl (18.2 kB view details)

Uploaded Python 3

File details

Details for the file armos-0.1.9.tar.gz.

File metadata

  • Download URL: armos-0.1.9.tar.gz
  • Upload date:
  • Size: 547.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for armos-0.1.9.tar.gz
Algorithm Hash digest
SHA256 5feba2feba3ef8618ea824624b34be67689110f9184ebe27df1597d7627b2fa7
MD5 205f86ae035a2c3cab83455a3294acb2
BLAKE2b-256 d4d10d976b3a2f76b129b16264d519a082f1fa4c4f6f5f0f7c1941fd1f5595b4

See more details on using hashes here.

File details

Details for the file armos-0.1.9-py3-none-any.whl.

File metadata

  • Download URL: armos-0.1.9-py3-none-any.whl
  • Upload date:
  • Size: 18.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for armos-0.1.9-py3-none-any.whl
Algorithm Hash digest
SHA256 fb5f054287bc27b859f3404498b1f5ba4d745be8faaf7f8b648096c26a880113
MD5 82dfc99f33588ba77df9d17a0fca98c9
BLAKE2b-256 619116708c7a7e5a72be1d4526df7f051327fb3a0e9fb9f218d16d3d6b4ef1d5

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page