Content anonymizer/pseudonymizer — redact sensitive data before sharing with AI
privatiser-engine
Open source anonymization engine powering Privatiser. Redacts IPs, API keys, secrets, PII, and cloud identifiers from any text - replacing them with structurally valid pseudonyms so context is preserved. Fully reversible.
Everything runs locally. Nothing leaves the machine.
Available as a Python library/CLI and a browser-native JavaScript port.
What it detects
| Category | Examples | Pseudonym format |
|---|---|---|
| IP addresses | 192.168.1.100, 10.0.0.0/16 | 10.x.x.x (preserves CIDR) |
| Email addresses | admin@company.com | user-1@redacted.example.net |
| Domain names | prod-db.mycompany.com | redacted-host-1.example.net |
| MAC addresses | AA:BB:CC:DD:EE:FF | AA:BB:CC:00:00:01 |
| AWS Account IDs | 123456789012 | 100000000001 |
| AWS ARNs | arn:aws:iam::123...:role/admin | Structure preserved, values redacted |
| S3 buckets | s3://my-prod-bucket | s3://redacted-bucket-1 |
| API keys | AWS, OpenAI, Anthropic, Google, Groq, GitHub, Slack, Azure | REDACTED_SECRET_n |
| Connection strings | postgresql://user:pass@host/db | REDACTED_CONNSTR_n |
| JWT tokens | eyJhbG... | REDACTED_JWT_n |
| PEM private keys | -----BEGIN RSA PRIVATE KEY----- | REDACTED_PEM_KEY_n |
| Bearer tokens | Authorization: Bearer sk-... | REDACTED_BEARER_n |
| Generic secrets | password = "value" | Keyword preserved, value redacted |
| US phone numbers | (555) 123-4567, +1-555-123-4567 | (555) 000-0001 |
| UK phone numbers | +44 7911 123456 | +44 7700 900001 |
| Credit cards | 4111 1111 1111 1111 (Luhn validated) | 4000-0000-0000-0001 |
| US SSN | 123-45-6789 | 078-05-0001 |
| Passports | C12345678 | X00000001 |
| IBAN | DE89370400440532013000 | GB00XXXX000000000001 |
| UUIDs | 550e8400-e29b-41d4-... | 00000000-0000-4000-a000-... |
| Azure / GCP IDs | Subscription IDs, project IDs | Redacted with counter |
Skips well-known safe values: 127.0.0.1, 0.0.0.0, localhost, amazonaws.com, github.com, etc.
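The credit card row above mentions Luhn validation. As a rough illustration of what such a check involves, here is a standalone sketch of the Luhn checksum. It is not taken from the engine's source, and the function name `luhn_valid` is hypothetical.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    # Ignore spaces, dashes and any other separators.
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True
print(luhn_valid("4111 1111 1111 1112"))  # False
```

A checksum like this lets a detector ignore arbitrary 16-digit numbers that merely look like card numbers.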
Python
Install
pip install privatiser
Usage
from privatiser import Privatiser
p = Privatiser()
text = 'server = "192.168.1.100"\npassword = "secret123"'
anonymized, mapping = p.anonymize(text)
# server = "10.0.1.8"
# password = "REDACTED_SECRET_1"
restored = p.deanonymize(anonymized, mapping)
assert restored == text # perfect round-trip
Category toggles
p = Privatiser(enabled_categories={"pii": False}) # skip phone/card/SSN
Allowlist
p = Privatiser(allowlist=["localhost", "example.com"]) # never redact these
Custom patterns
from privatiser import Privatiser, register_custom
register_custom("ticket_id", r"TICKET-\d{4,6}", "REDACTED_TICKET_{n}")
p = Privatiser()
result, mapping = p.anonymize("Fix TICKET-12345")
# result: "Fix REDACTED_TICKET_1"
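Before registering a custom pattern, it can help to try the regex on sample text with the standard `re` module. The sketch below assumes the engine applies the pattern with ordinary Python regex semantics, which the README does not state explicitly; the stricter variant with word boundaries is a hypothetical alternative, not something the library requires.

```python
import re

# The pattern from the example above, and a hypothetical stricter variant
# that refuses to match when more digits follow.
loose = re.compile(r"TICKET-\d{4,6}")
strict = re.compile(r"\bTICKET-\d{4,6}\b")

sample = "Fix TICKET-12345, see TICKET-1234567 and TICKET-12"

loose_matches = loose.findall(sample)
strict_matches = strict.findall(sample)

print(loose_matches)   # ['TICKET-12345', 'TICKET-123456']
print(strict_matches)  # ['TICKET-12345']
```

Without the boundaries, a seven-digit ID is matched only up to its sixth digit, so the last digit would remain in the output after redaction.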
CLI
# From stdin
cat config.tf | privatiser anonymize
# From file, save mapping
privatiser anonymize config.tf -o clean.tf -m mapping.json
# Restore
privatiser deanonymize clean.tf -m mapping.json
# Disable categories
privatiser anonymize config.tf -d pii -d aws
JavaScript (browser / Node)
The privatiser.js file is a self-contained browser port with no dependencies. Drop it into any web project or use it in Node.
<script src="privatiser.js"></script>
const p = new Privatiser();
const { result, mapping } = p.anonymize(text);
// Restore
const restored = p.deanonymize(result, mapping);
Options
const p = new Privatiser({
enabledCategories: { pii: false }, // disable a category
allowlist: ["localhost", "example.com"], // never redact these
customWords: ["mycompany", "prod-server"], // always redact these
});
How it works
- Placeholder pass - before any pattern runs, detected values are replaced with null-byte markers (\x00PRIV_0\x00). This prevents patterns from matching inside already-redacted values.
- Pattern priority - patterns run highest-priority first (connection strings before passwords, JWTs before base64, etc.).
- Deterministic pseudonyms - the same value always gets the same pseudonym within a session, so repeated occurrences stay consistent.
- Structural preservation - pseudonyms match the format of the original (IPs look like IPs, emails look like emails) so downstream tools and AI models aren't confused.
- Restore pass - deanonymize() does a simple string replacement of pseudonyms back to originals using the mapping.
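The steps above can be sketched in a few lines of standalone Python. This is a simplified illustration under stated assumptions, not the engine's actual implementation: the `RULES` list, its two regexes, the `10.0.0.{n}` template, and the shape of the mapping (pseudonym to original) are all hypothetical.

```python
import re
from itertools import count

# Hypothetical rules, highest priority first: (name, regex, pseudonym template).
RULES = [
    ("connstr", re.compile(r"postgresql://\S+"), "REDACTED_CONNSTR_{n}"),
    ("ip", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "10.0.0.{n}"),
]


def anonymize(text):
    placeholders = {}  # null-byte marker -> pseudonym
    seen = {}          # original value -> pseudonym (deterministic reuse)
    mapping = {}       # pseudonym -> original value, used for restoring
    counters = {name: count(1) for name, _, _ in RULES}

    for name, pattern, template in RULES:
        def swap(match, name=name, template=template):
            original = match.group(0)
            if original not in seen:
                pseudonym = template.format(n=next(counters[name]))
                seen[original] = pseudonym
                mapping[pseudonym] = original
            # Insert a marker so later, lower-priority rules cannot
            # match inside a value that is already redacted.
            marker = f"\x00PRIV_{len(placeholders)}\x00"
            placeholders[marker] = seen[original]
            return marker

        text = pattern.sub(swap, text)

    # Once every rule has run, swap the markers for the pseudonyms.
    for marker, pseudonym in placeholders.items():
        text = text.replace(marker, pseudonym)
    return text, mapping


def deanonymize(text, mapping):
    # Plain string replacement, longest pseudonym first, so that a short
    # pseudonym never clobbers part of a longer one.
    for pseudonym in sorted(mapping, key=len, reverse=True):
        text = text.replace(pseudonym, mapping[pseudonym])
    return text


sample = (
    "db = postgresql://app:pw@192.168.1.100/prod\n"
    "host = 192.168.1.100\n"
    "backup = 192.168.1.100"
)
anonymized, mapping = anonymize(sample)
print(anonymized)
assert deanonymize(anonymized, mapping) == sample
```

Because the connection-string rule runs first, the IP embedded in the URL is never redacted separately, and both standalone occurrences of the IP receive the same pseudonym.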
Project structure
src/privatiser/
core.py - Privatiser class, anonymize/deanonymize logic
patterns/
secrets.py - API keys, JWTs, connection strings, PEM keys
network.py - IPs, domains, emails, MACs, URLs
pii.py - phone, credit card, SSN, passport, IBAN
aws.py - AWS account IDs, ARNs, S3 buckets
cloud.py - Azure, GCP identifiers
identifiers.py - UUIDs, generic identifiers
cli.py - Click CLI entrypoint
web/ - Flask web UI (optional)
privatiser.js - Self-contained browser/Node JS port
tests/ - pytest test suite
Contributing
See CONTRIBUTING.md. Pattern contributions are especially welcome - if you work with a format that Privatiser doesn't detect yet, opening a PR with a new pattern + tests is the fastest way to get it added.
Attribution
MIT licensed - use it freely in personal and commercial projects. If you build something with it, a "Powered by Privatiser" credit is appreciated but not required.
License
MIT - see LICENSE.
Built and maintained by @XionDot. Web tool at privatiser.net.