Local-first AI security gateway: PII redaction, token compression, and audit ledger
Airlock
Local-first AI security gateway. Redact PII, cut LLM token costs, and maintain a full audit trail — without sending a single byte to the cloud.
pip install airlock-rs # Python SDK
cargo install airlock-rs # CLI
The Problem
You have logs, support tickets, or user data you want to send to an AI model, but that data contains names, emails, phone numbers, and credit card numbers. Sending it to OpenAI or Claude as-is creates GDPR, HIPAA, and SOC 2 exposure.
Airlock sits between your data and the model. It scrubs PII, compresses the JSON to cut token costs, and writes an auditable record of what was processed — all locally, zero network calls.
Python SDK
import airlock, json
records = [
{"user": "Alice Johnson", "email": "alice@corp.com", "action": "login", "ip": "192.168.1.1"},
{"user": "Bob Smith", "email": "bob@corp.com", "action": "logout", "ip": "10.0.0.2"},
]
result = airlock.scrub(json.dumps(records), salt="my-org-secret")
print(result.pii_count) # 6
print(result.risk_score) # 75.0
print(result.reduction_pct) # 38.4
for swap in result.swaps:
    print(f"{swap['original']} → {swap['synthetic']}")
# alice@corp.com → alias_a@redacted.dev
# Alice Johnson → User_A
# 192.168.1.1 → IP_A
# bob@corp.com → alias_b@redacted.dev
# Bob Smith → User_B
# 10.0.0.2 → IP_B
# Feed the clean JSON directly to your LLM.
# openai_client is a placeholder for your own LLM client; adapt the call to your SDK.
response = openai_client.chat(messages=[{"role": "user", "content": result.json_str}])
API
# Scrub PII + compress
result = airlock.scrub(
json_str,
salt=None, # str — secret for stable cross-run aliases
db_path=None, # str — path to SQLite audit ledger
# Toggle individual entity types (all True by default):
names=True,
emails=True,
phones=True,
ssns=True,
credit_cards=True,
ip_addresses=True,
jwt_tokens=True,
aws_keys=True,
env_secrets=True,
)
result.json_str # str — scrubbed, compressed JSON
result.pii_count # int — total PII instances found
result.risk_score # float — 0–100 density score
result.reduction_pct # float — token reduction percentage
result.swaps # list[dict] — [{original, synthetic, entity_type}]
result.ledger_id # int | None — SQLite row ID if db_path was set
# Compress only (no PII detection)
result = airlock.compress(json_str)
result.json_str
result.tokens_before
result.tokens_after
result.reduction_pct
result.entry_count
CLI
# Scrub PII from a JSON or NDJSON file
airlock scrub logs.json --diff
# Stable cross-run aliases (same person → same alias across files)
airlock scrub logs.json --salt my-secret --diff > clean.json
# Compress only
airlock compress logs.json
# View audit history
airlock ledger --last 20
All flags
airlock scrub <FILE>
--salt <SALT> Cross-run stable aliases via SHA-256(salt ‖ entity_type ‖ token)
--diff Print every original → alias swap to stderr
--db <FILE> SQLite ledger path [default: airlock_ledger.db]
--output <FORMAT> pretty (default) | compact
-v / -vv / -vvv Verbosity (info / debug / trace)
airlock compress <FILE>
--output <FORMAT> pretty | compact
airlock ledger
--last <N> Show N most recent entries [default: 10]
--db <FILE> SQLite ledger path
What Gets Redacted
| PII Type | Standard | Example Input | Alias |
|---|---|---|---|
| Full name | — | Alice Johnson | User_A |
| Email | RFC 5322 | alice@corp.com | alias_a@redacted.dev |
| Phone | NANP + E.164 | 555-867-5309, +44 7911 123456 | Phone_A |
| SSN | SSA format | 123-45-6789 | SSN_A |
| Credit card | ISO/IEC 7812 (Luhn) | 4111 1111 1111 1111 | Card_A |
| IPv4 address | RFC 791 | 192.168.1.100 | IP_A |
| JWT token | — | eyJhbGci... | Token_A |
| AWS access key | — | AKIAIOSFODNN7EXAMPLE | AwsKey_A |
| Env secret | — | API_KEY=sk-abc123 → API_KEY=Secret_A | Secret_A |
All 9 types are enabled by default and individually toggleable. Aliases are consistent within a run — User_A always refers to the same person, so AI models can still reason about behavior patterns without seeing real identities.
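To make "consistent within a run" concrete, here is a minimal Python sketch of encounter-order aliasing for a single entity type. It is an illustration only, not Airlock's implementation: the simplified email pattern and the helper names `letter_suffix` and `scrub_emails` are assumptions made for this example.

```python
from __future__ import annotations

import re

# Deliberately simple email pattern for illustration; it is not RFC 5322 complete
# and is not the pattern Airlock ships.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def letter_suffix(index: int) -> str:
    """Map 0 -> 'A', 1 -> 'B', ..., 25 -> 'Z', 26 -> 'AA' (spreadsheet style)."""
    letters = ""
    index += 1
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def scrub_emails(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email with an alias assigned in encounter order."""
    aliases: dict[str, str] = {}

    def replace(match: re.Match) -> str:
        original = match.group(0)
        if original not in aliases:
            # First sighting: allocate the next free alias.
            suffix = letter_suffix(len(aliases)).lower()
            aliases[original] = f"alias_{suffix}@redacted.dev"
        # Repeat sightings reuse the alias already assigned.
        return aliases[original]

    return EMAIL_RE.sub(replace, text), aliases
```

Because the mapping lives in a dictionary for the duration of the call, every later occurrence of the same address gets the same alias, which is what lets a model follow one user across many log lines.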
Config File
Drop a .airlock.toml in your project directory to set defaults:
[scrub]
salt = "my-org-secret" # stable cross-run aliases
db = "~/.airlock/ledger.db" # shared ledger location
[redact]
ip_addresses = false # keep IPs as-is
[[rules]]
name = "EmployeeId"
pattern = "EMP-\\d{5}"
alias_prefix = "Emp" # EMP-00042 → Emp_A
CLI flags always take precedence over the config file.
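The precedence rule can be pictured as a layered merge. The sketch below is a hedged illustration in Python, not Airlock's loader: `resolve_settings` and the dictionary layout are hypothetical, and the built-in defaults shown are taken from the flag reference above. It also shows the custom rule pattern from the example config as a regular expression (the TOML string `"EMP-\\d{5}"` decodes to `EMP-\d{5}`).

```python
import re


def resolve_settings(defaults, config_file, cli_flags):
    """Merge three layers; later layers win, and unset CLI flags (None) are ignored."""
    merged = dict(defaults)
    merged.update(config_file)
    merged.update({key: value for key, value in cli_flags.items() if value is not None})
    return merged


# Hypothetical layers: built-in defaults, values read from .airlock.toml,
# and flags parsed from the command line (only --salt was passed here).
defaults = {"salt": None, "db": "airlock_ledger.db"}
config_file = {"salt": "my-org-secret", "db": "~/.airlock/ledger.db"}
cli_flags = {"salt": "prod-2026", "db": None}

settings = resolve_settings(defaults, config_file, cli_flags)

# The custom rule pattern from the example config, compiled with Python's re module.
EMPLOYEE_ID = re.compile(r"EMP-\d{5}")
```

In this sketch the salt comes from the command line while the ledger path falls back to the config file, since `--db` was not given.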
Token Compression
Repeated JSON keys are expensive for LLMs. Airlock extracts them into a single schema header:
Before (keys repeated on every row):
[
{"timestamp": "2026-01-01T10:00:00Z", "user": "User_A", "action": "login"},
{"timestamp": "2026-01-01T10:01:00Z", "user": "User_B", "action": "logout"}
]
After (keys extracted once, 43% fewer tokens):
{
"__airlock_schema": ["timestamp", "user", "action"],
"__airlock_rows": [
["2026-01-01T10:00:00Z", "User_A", "login"],
["2026-01-01T10:01:00Z", "User_B", "logout"]
],
"__airlock_meta": { "tokens_before": 120, "tokens_after": 68, "reduction_pct": "43.3" }
}
Typical savings: 20–60% on structured log data.
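The transformation itself is small. Here is a minimal Python sketch of schema extraction and its inverse, assuming flat records; the function names `extract_schema` and `restore_records` are hypothetical, and Airlock's actual compressor (including how it counts tokens for `__airlock_meta`) may differ.

```python
def extract_schema(records):
    """Hoist the keys into one schema list and turn each record into a row."""
    keys = []
    for record in records:
        for key in record:
            if key not in keys:
                keys.append(key)  # keep first-seen key order
    # Keys missing from a record become None in that row.
    rows = [[record.get(key) for key in keys] for record in records]
    return {"__airlock_schema": keys, "__airlock_rows": rows}


def restore_records(packed):
    """Rebuild the original list of dicts from schema + rows."""
    keys = packed["__airlock_schema"]
    return [dict(zip(keys, row)) for row in packed["__airlock_rows"]]
```

The saving grows with the number of rows, because each key is paid for once instead of once per record. Note that this sketch round-trips exactly only when every record has the same keys.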
Cross-Run Stable Aliases (--salt)
By default, aliases are assigned in encounter order: the first name seen becomes User_A, the second User_B. This is consistent within a run but may differ between runs.
Pass --salt <secret> to enable cross-run stability: every alias is derived from SHA-256(salt ‖ entity_type ‖ token) fed into a ChaCha8Rng. The same real identity always produces the same alias, regardless of which file is processed or what order records appear in.
airlock scrub january.json --salt prod-2026 > jan_clean.json
airlock scrub february.json --salt prod-2026 > feb_clean.json
# "Alice Johnson" → "User_GKQT" in both files
Keep your salt secret. It is the only thing preventing alias reversal.
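The idea behind seeded aliases can be sketched in a few lines of Python. This is an approximation under stated assumptions, not Airlock's algorithm: it maps digest bytes straight to letters instead of seeding a ChaCha8Rng, the field separator is an assumption, and `salted_alias` is a hypothetical name, so its output will not match what the CLI prints.

```python
import hashlib


def salted_alias(salt: str, entity_type: str, token: str,
                 prefix: str = "User", length: int = 4) -> str:
    """Derive a deterministic alias from SHA-256(salt || entity_type || token)."""
    # Assumption: join the fields with a separator byte so that
    # ("ab", "c") and ("a", "bc") do not hash to the same input.
    material = "\x1f".join([salt, entity_type, token]).encode("utf-8")
    digest = hashlib.sha256(material).digest()
    # Map the first `length` digest bytes to uppercase letters.
    # (b % 26 has a slight bias; acceptable for a sketch.)
    suffix = "".join(chr(ord("A") + b % 26) for b in digest[:length])
    return f"{prefix}_{suffix}"


jan = salted_alias("prod-2026", "name", "Alice Johnson")
feb = salted_alias("prod-2026", "name", "Alice Johnson")
```

Since the alias depends only on the salt, the entity type, and the token, processing order no longer matters. The flip side is that anyone holding the salt can hash candidate names and compare, which is why the salt has to stay secret.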
Audit Ledger
Every airlock scrub run writes a row to a local SQLite database:
╔══════╦════════════════════╦══════════╦═════════╦══════════╦══════════════════╗
║ ID ║ Timestamp ║ Entries ║ PII ║ Risk ║ Compression ║
╠══════╬════════════════════╬══════════╬═════════╬══════════╬══════════════════╣
║ 1 ║ 2026-01-15T10:00 ║ 500 ║ 84 ║ 42/100 ║ 38.4% ║
║ 2 ║ 2026-01-15T14:22 ║ 1200 ║ 203 ║ 71/100 ║ 51.2% ║
╚══════╩════════════════════╩══════════╩═════════╩══════════╩══════════════════╝
The ledger stores counts and statistics only — never the original PII values.
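A ledger of this kind needs only a handful of numeric columns. The following Python sketch uses the standard `sqlite3` module to show the shape of such a table; the table name `runs`, the column names, and the helper functions are hypothetical and do not reflect Airlock's actual schema.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: counts and statistics only, no original values or aliases.
SCHEMA = """
CREATE TABLE IF NOT EXISTS runs (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp     TEXT    NOT NULL,
    entries       INTEGER NOT NULL,
    pii_count     INTEGER NOT NULL,
    risk_score    REAL    NOT NULL,
    reduction_pct REAL    NOT NULL
)
"""


def record_run(conn, entries, pii_count, risk_score, reduction_pct):
    """Append one summary row and return its row ID."""
    conn.execute(SCHEMA)
    cursor = conn.execute(
        "INSERT INTO runs (timestamp, entries, pii_count, risk_score, reduction_pct) "
        "VALUES (?, ?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(timespec="seconds"),
         entries, pii_count, risk_score, reduction_pct),
    )
    conn.commit()
    return cursor.lastrowid


def last_runs(conn, limit=10):
    """Return the most recent rows, newest first."""
    return conn.execute(
        "SELECT id, entries, pii_count, risk_score, reduction_pct "
        "FROM runs ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()
```

Keeping the row free of text fields derived from the input is what makes it safe to share the ledger file with auditors.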
Security Guarantees
| Guarantee | Details |
|---|---|
| Zero network calls | Airlock never opens a socket. All processing is in-process on your machine. |
| No PII on disk | The ledger stores counts and risk scores only — never names, emails, or the aliases themselves. |
| Alias irreversibility | In seeded mode, reversing an alias requires knowledge of your salt. |
| Deterministic | The same input + same salt always produces the same output. Fully auditable. |
| No third-party AI | The NER engine runs locally via compiled regex patterns. Your data never touches an external API. |
Installation
Python (recommended)
pip install airlock-rs
Requires Python 3.8+. Pre-built wheels for Linux, macOS, and Windows.
CLI — Cargo
cargo install airlock-rs
CLI — pre-built binary
Download from GitHub Releases. Single static binary, no runtime dependencies.
Building from Source
git clone https://github.com/OxideOps/airlock
cd airlock
# Run tests
cargo test --all-features
# Build release binary
cargo build --release
# Build Python wheel (requires maturin: pip install maturin)
maturin develop --features python
# Lint
cargo clippy --all-features -- -D warnings
Architecture
src/
├── lib.rs — Library entry point; Python module registration
├── main.rs — CLI (clap): scrub / compress / ledger commands
├── types.rs — EntityType, PiiSpan, SwapRecord, LedgerEntry
├── ner.rs — Ner trait + RegexNer (9 built-in patterns + custom rules)
├── scrub.rs — Pipeline: NER → alias → redact → compress → ledger
├── compress.rs — Token-Tax compression (schema extraction + row compaction)
├── ledger.rs — SQLite Risk Ledger (rusqlite, bundled)
└── config.rs — .airlock.toml loader
Performance
- Parallel NER scan and alias application via Rayon
- Static regexes compiled once per process via OnceLock
- Zero-copy span detection using byte offsets
- ~3.5 MB statically-linked binary with no runtime dependencies
License
Apache 2.0 — see LICENSE.