CLI for creating PII-safe Excel test fixtures
sheetmask
Turn a real Excel file into a safe test fixture — fake names, fake numbers, real structure.
Install
pip install git+https://github.com/daniel-butler/sheetmask.git
uv add git+https://github.com/daniel-butler/sheetmask.git
Quickstart
- Run `analyze` on your file. It prints a prompt describing the columns and sample data. Copy it.
sheetmask analyze "Q4 Expense Report.xlsx"
- Paste the prompt into Claude or ChatGPT. Save the config it returns:
# q4_expense_config.py
from sheetmask import PercentageVarianceRule, PreserveRelationshipRule
config = {
    "version": "1.0.0",
    "sheets_to_keep": ["Expenses"],
    "entity_columns": {
        "Employee Name": "PERSON",
        "Department": "ORGANIZATION",
        "Manager": "PERSON",
    },
    "numeric_rules": {
        "Reimbursement": PercentageVarianceRule(variance_pct=0.2),
        "Net Amount": PreserveRelationshipRule(
            formula="context['Reimbursement'] - context['Deduction']",
            dependent_columns=["Reimbursement", "Deduction"],
        ),
    },
    "preserve_columns": ["Date", "Category"],
}
- Run `process`. The output lands beside the original.
sheetmask process "Q4 Expense Report.xlsx" --config q4_expense_config.py
# Output: Q4 Expense Report_SYNTHETIC.xlsx
Reference
Entity types
Each unique value maps to the same fake value throughout the file, so relationships between rows stay intact.
| Type | Generates |
|---|---|
| `PERSON` | Full name |
| `PERSON_FIRST_NAME` | First name only |
| `PERSON_LAST_NAME` | Last name only |
| `ORGANIZATION` | Company name |
| `EMAIL_ADDRESS` | Email address |
| `PHONE_NUMBER` | Phone number |
| `PROJECT_NAME` | Project name |
| `LOCATION` | City, State |
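
The consistent-mapping behavior described above can be illustrated with a small cache keyed on the real value. This is a minimal sketch, not sheetmask's implementation: the `ConsistentMapper` class and the pool of fake names are hypothetical, and the real tool may generate its fake values differently.

```python
import random


class ConsistentMapper:
    """Hypothetical helper: maps each distinct real value to one stable fake value."""

    def __init__(self, fake_pool, seed=None):
        # Shuffle a private copy of the pool so the assignment order depends on the seed.
        self._rng = random.Random(seed)
        self._pool = list(fake_pool)
        self._rng.shuffle(self._pool)
        self._mapping = {}

    def fake(self, real_value):
        # First sighting: take an unused fake value and remember the pairing.
        if real_value not in self._mapping:
            if not self._pool:
                raise ValueError("fake pool exhausted")
            self._mapping[real_value] = self._pool.pop()
        # Every later sighting returns the same fake value.
        return self._mapping[real_value]


mapper = ConsistentMapper(["Ana Ruiz", "Ben Okoye", "Cara Lind"], seed=7)
managers = ["Alice Smith", "Bob Jones", "Alice Smith"]
masked = [mapper.fake(name) for name in managers]
```

Rows 1 and 3 of `masked` hold the same fake name, so a grouping or join on the masked column gives the same shape as on the original.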
Numeric rules
PercentageVarianceRule replaces each value with a random number within a band of the original. Use it for independent figures.
"Headcount": PercentageVarianceRule(variance_pct=0.15)
# 100 becomes a random number between 85 and 115.
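
To make the band concrete, here is a minimal sketch of the arithmetic. It assumes the replacement is drawn uniformly from the band; the distribution sheetmask actually uses is not stated here, and `percentage_variance` is a hypothetical function, not part of the package.

```python
import random


def percentage_variance(value, variance_pct, rng):
    """Scale value by a random factor in [1 - variance_pct, 1 + variance_pct]."""
    # Assumption: uniform draw across the band.
    factor = 1.0 + rng.uniform(-variance_pct, variance_pct)
    return value * factor


rng = random.Random(42)
samples = [percentage_variance(100, 0.15, rng) for _ in range(1000)]
lowest, highest = min(samples), max(samples)
```

With `variance_pct=0.15` and an original value of 100, every sample falls between 85 and 115.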
PreserveRelationshipRule derives a value from other already-anonymized columns. Use it wherever one column is computed from others, so the arithmetic stays consistent.
"Gross Margin": PreserveRelationshipRule(
    formula="context['Revenue'] - context['Cost']",
    dependent_columns=["Revenue", "Cost"],
)
# Gross Margin will always equal anonymized Revenue minus anonymized Cost.
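
One way to picture how such a formula string could be applied to a row is sketched below. How sheetmask evaluates the formula internally is not documented here; this sketch assumes a plain `eval` with only `context` in scope, and `derive` is a hypothetical helper.

```python
def derive(formula, dependent_columns, row):
    """Compute a derived value from the already-anonymized columns of one row."""
    # Expose only the declared dependencies to the formula.
    context = {name: row[name] for name in dependent_columns}
    # Assumption: the formula is a Python expression over `context`.
    # Builtins are disabled so the expression can only do arithmetic on the row.
    return eval(formula, {"__builtins__": {}}, {"context": context})


anonymized_row = {"Revenue": 120.0, "Cost": 45.0, "Region": "West"}
gross_margin = derive(
    "context['Revenue'] - context['Cost']",
    ["Revenue", "Cost"],
    anonymized_row,
)
```

Because the derived column is computed after its inputs are anonymized, `gross_margin` is 75.0 here, matching the masked Revenue and Cost rather than the originals.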
All commands
| Command | Description |
|---|---|
| `sheetmask analyze <file>` | Analyze file and print LLM prompt |
| `sheetmask analyze <file> -o prompt.txt` | Save LLM prompt to a file |
| `sheetmask analyze-multi f1 f2 f3` | Analyze multiple files for shared schema patterns |
| `sheetmask process <file> --config config.py` | Anonymize file using config |
| `sheetmask process <file> out.xlsx --config config.py --seed 42` | Write to named output with fixed random seed |
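
A fixed seed matters for test fixtures because it makes the random draws repeatable. The sketch below shows the general idea with Python's `random` module. It is an illustration only, and `anonymize_amounts` is a hypothetical function rather than sheetmask's code.

```python
import random


def anonymize_amounts(amounts, variance_pct, seed):
    """Perturb each amount using a generator seeded with a fixed value."""
    rng = random.Random(seed)
    return [a * (1.0 + rng.uniform(-variance_pct, variance_pct)) for a in amounts]


amounts = [100.0, 250.0, 80.0]
first_run = anonymize_amounts(amounts, 0.2, seed=42)
second_run = anonymize_amounts(amounts, 0.2, seed=42)
other_seed = anonymize_amounts(amounts, 0.2, seed=43)
```

`first_run` and `second_run` are identical, so a fixture regenerated with the same seed does not churn in version control, while `other_seed` gives a different set of values.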