
PIIFILL: Professional Local-Logic PII Sanitization CLI


🛡️ PIIFILL: High-Performance PII Redaction CLI & Data Masking Tool

A PII filler and sensitive-information masker. Secure your data with fully offline redaction, including OCR for text inside images.


Built with precision by Bhavin Sachaniya



📖 What is PIIFILL?

PIIFILL is a high-performance command-line utility designed to automatically detect and redact Personally Identifiable Information (PII) from documents, datasets, and images. It serves as a comprehensive PII filler and data anonymization tool, ensuring that sensitive information is masked before files are shared or processed.

🛡️ Key Capabilities & PII Detection

| Entity Type | Examples Detected | Technology Used |
| --- | --- | --- |
| Personal Identifiers | Names, Phone Numbers, Email Addresses | Pattern Matching & NLP |
| Government IDs | SSN (USA), Aadhaar (India), PAN | Regional Regex |
| Financial Data | Credit/Debit Cards, IBAN, Swift Codes | Luhn Algorithm & Patterns |
| Location Info | Physical Addresses, ZIP Codes, IP Addresses | Geo-Patterns |
| Visual PII | Text in Images, Scanned PDFs, Screenshots | Integrated OCR |
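The table lists the Luhn algorithm for financial data. As an illustration of what that check does, here is a minimal sketch of a standard Luhn checksum in Python. The function name `luhn_valid` is hypothetical and this is not PIIFILL's internal code; it only shows why a checksum helps separate real card numbers from arbitrary digit runs.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum.

    Illustrative only: `luhn_valid` is a hypothetical helper, not a PIIFILL API.
    Spaces and dashes are ignored so "4111 1111 1111 1111" can be checked as-is.
    """
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False

    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0
```

A pattern match alone would flag any 16-digit number; running a checksum like this on each candidate is a common way to cut false positives.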

[!IMPORTANT] 100% Offline Processing: PIIFILL is built for maximum privacy. All detection, masking, and OCR processing happen locally on your machine. Your data is never uploaded to any cloud service.


🚀 Quick Start

Installation

Ensure you have Python 3.8+ installed. You can install PIIFILL directly via pip:

pip install piifill-cli==0.1.9

🛠️ Usage Guide

PIIFILL follows a simple two-phase workflow: scan (to identify PII) and mask (to protect it).

1. Identify Privacy Risks (scan)

Use the scan command to audit your files. This is a read-only operation that provides a detailed report of potential PII without modifying your source files.

# Scan a single document
piifill scan sensitive_data.pdf

# Perform a deep search in a folder (recursive)
piifill scan ./private_docs/ --recursive

2. Protect Your Files (mask)

Once verified, use mask to generate sanitized versions of your files. By default, it creates an out/ directory with the protected copies.

# Mask a single file
piifill mask user_records.csv

# Mask all files in a directory
piifill mask ./raw_logs/

📂 Working with Folders

Want to clean up an entire folder of data? PIIFILL makes it easy.

Example: Mask every file in a folder

piifill mask ./data_dump/
  • PIIFILL will scan every file in ./data_dump/.
  • It will create a new folder called ./data_dump/out/.
  • All your safe, cleaned-up files will be waiting for you inside the out folder!

Example: Save the safe files somewhere specific

piifill mask ./private_files/ -o ./safe_backup/
  • This takes everything from private_files and puts the safe versions in safe_backup.
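If you want to drive these commands from a Python script, the sketch below assembles the argument list for `piifill mask`. It uses only the `-o` and `--mode` options named in this README; the helper name `build_mask_command` is hypothetical, and passing the mode as a separate argument (`--mode redact`) is an assumption about the CLI's syntax.

```python
from typing import List, Optional


def build_mask_command(
    source: str,
    output: Optional[str] = None,
    mode: Optional[str] = None,
) -> List[str]:
    """Build an argument list for `piifill mask` (hypothetical helper).

    `source` is a file or folder; `output` maps to -o; `mode` maps to --mode.
    Assumption: --mode takes its value as the next argument.
    """
    cmd = ["piifill", "mask", source]
    if output is not None:
        cmd += ["-o", output]
    if mode is not None:
        cmd += ["--mode", mode]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Using a list rather than a shell string avoids quoting problems with paths that contain spaces.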

⚙️ Command Reference

| Command | Description | Key Options |
| --- | --- | --- |
| scan | Detects PII and generates a risk report. | --recursive, --format |
| mask | Redacts PII and creates safe file copies. | -o (output), --mode |
| config | Displays current PIIFILL configuration. | N/A |
| version | Displays version and environment info. | N/A |

🎭 Masking Modes

You can customize how PII is hidden using the --mode flag:

  • mask (Default): Replaces data with descriptive placeholders (e.g., [REDACTED]).
  • redact: Completely removes the sensitive data from the file.
  • tokenize: Replaces data with unique, trackable tokens (e.g., <EMAIL_123>).
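To make the difference between the three modes concrete, here is a small Python sketch that applies each one to email addresses in a string. It is an illustration under assumptions, not PIIFILL's implementation: the `apply_mode` function, the simplified email pattern, the sequential token numbering, and the choice to reuse one token per distinct value are all hypothetical.

```python
import re
from itertools import count

# Simplified email pattern for illustration; real detectors are stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

MODES = ("mask", "redact", "tokenize")


def apply_mode(text: str, mode: str = "mask") -> str:
    """Hide email addresses in `text` using one of the three modes."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    counter = count(1)
    tokens = {}  # assumption: the same value always maps to the same token

    def replace(match: "re.Match[str]") -> str:
        value = match.group(0)
        if mode == "mask":
            return "[REDACTED]"  # placeholder stays in the text
        if mode == "redact":
            return ""  # value is removed entirely
        if value not in tokens:
            tokens[value] = f"<EMAIL_{next(counter)}>"
        return tokens[value]

    return EMAIL_RE.sub(replace, text)
```

With the input `"Contact a@example.com or b@example.org, then a@example.com again."`, the `tokenize` mode in this sketch keeps the two distinct addresses distinguishable without revealing them, which is the property that makes tokens "trackable".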

📊 Security & Risk Analytics

PIIFILL doesn't just hide data—it helps you understand your privacy posture through integrated analytics:

  • Security Grade: A standardized rating (A to F) based on PII density.
  • Risk Score (0-100): A quantitative metric representing the severity of data exposure.
  • Frequency Analysis: A detailed breakdown of detected entities (e.g., "5 Credit Cards, 12 Emails found").
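This README does not document how the grade and score are calculated, so the following Python sketch only illustrates how the three metrics could relate to one another. The entity labels, the severity weights, and the grade bands are all hypothetical, and the sketch scores weighted counts without normalizing by file size, so it does not model PII density.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical severity weights per entity type (not PIIFILL's values).
SEVERITY = {"CREDIT_CARD": 10, "SSN": 10, "EMAIL": 3, "PHONE": 3}

# Hypothetical grade bands: (highest score that still earns the grade, grade).
GRADE_BANDS = [(10, "A"), (30, "B"), (50, "C"), (70, "D"), (100, "F")]


def frequency(findings: Iterable[str]) -> Dict[str, int]:
    """Count how often each entity type was detected."""
    return dict(Counter(findings))


def risk_score(counts: Dict[str, int]) -> int:
    """Weighted sum of detections, capped to the 0-100 range."""
    raw = sum(SEVERITY.get(kind, 1) * n for kind, n in counts.items())
    return min(100, raw)


def grade(score: int) -> str:
    """Map a 0-100 score to a letter grade using the bands above."""
    for ceiling, letter in GRADE_BANDS:
        if score <= ceiling:
            return letter
    return "F"
```

Under these made-up weights, a file with 5 credit cards and 12 emails scores 5 × 10 + 12 × 3 = 86 and falls in the lowest band.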

📂 Supported File Formats & OCR

PIIFILL supports a wide range of formats, including OCR (Optical Character Recognition) for image-based documents, which makes it suitable for mixed-media datasets.

| Category | Extensions | Features |
| --- | --- | --- |
| Structured Data | .csv, .json, .sql, .xlsx | Row-level masking & Tokenization |
| Documents | .txt, .pdf, .docx | Paragraph-aware redaction |
| Images (OCR) | .png, .jpg, .jpeg | Text coordinate detection & Masking |

[!TIP] Deep Image Detection: PIIFILL uses built-in OCR capabilities to detect and mask text hidden inside screenshots and scanned documents automatically.


❓ Frequently Asked Questions (FAQ)

1. How do I redact PII from documents offline?

You can use piifill mask <file_path> to redact PII locally. PIIFILL processes all data on your machine, so sensitive documents are never exposed to a cloud service.

2. Is PIIFILL a free PII filler?

Yes, PIIFILL is an open-source tool licensed under MIT. You can use it for both personal and commercial projects at no cost. See pricing.md for details.

3. Does PIIFILL support Aadhaar and SSN masking?

Yes, PIIFILL has built-in support for national identifiers, including US Social Security Numbers (SSN) and Indian Aadhaar numbers.
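For readers curious what regional pattern matching looks like, here is a minimal Python sketch using the publicly known shapes of both identifiers. The patterns and the `find_ids` helper are hypothetical and are not PIIFILL's actual rules. The SSN pattern rejects area numbers 000, 666, and 900-999, group 00, and serial 0000; the Aadhaar pattern accepts 12 digits that do not start with 0 or 1, optionally grouped in fours, and does not verify the Verhoeff check digit.

```python
import re
from typing import Dict, List

# US SSN in AAA-GG-SSSS form, excluding ranges that are never issued.
SSN_RE = re.compile(r"\b(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b")

# Aadhaar: 12 digits, first digit 2-9, optionally split 4-4-4 by space or dash.
# Assumption: format check only; the Verhoeff checksum is not validated here.
AADHAAR_RE = re.compile(r"\b[2-9]\d{3}[ -]?\d{4}[ -]?\d{4}\b")


def find_ids(text: str) -> Dict[str, List[str]]:
    """Return candidate SSN and Aadhaar matches found in `text`."""
    return {
        "SSN": SSN_RE.findall(text),
        "AADHAAR": AADHAAR_RE.findall(text),
    }
```

Format checks like these produce candidates rather than confirmed identifiers, so a production detector would typically add checksum or context validation on top.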

4. Can PIIFILL detect PII in screenshots?

Absolutely. PIIFILL includes OCR (Optical Character Recognition) to find and redact PII text inside images like .png and .jpg.


🤖 AI & Agent Readiness

This repository is optimized for AI search (GEO/AEO). AI agents can find structured metadata in the following files:


👤 Author

Bhavin Sachaniya


📜 License

This project is licensed under the MIT License. See the LICENSE file for details.

