CertiSigma Census
Cryptographic file inventory and exfiltration detection — powered by CertiSigma.
Census scans directories, computes SHA-256 hashes, attests them via the CertiSigma API (three-layer cryptographic proof: ECDSA T0, qualified TSA T1, Bitcoin T2), and maintains a local manifest. When suspect files surface, Census compares their hashes against the registry to prove — with cryptographic certainty — whether they match inventoried assets.
Installation
pip install certisigma-census
# With watch mode (filesystem monitoring)
pip install "certisigma-census[watch]"
# With PDF report generation
pip install "certisigma-census[report]"
# Everything
pip install "certisigma-census[watch,report]"
Requires Python 3.10+. TOML config support on Python 3.10 uses tomli (auto-installed).
Quick Start
1. Inventory scan
export CERTISIGMA_API_KEY=cs_...
# Scan a directory and attest all file hashes
census scan /path/to/sensitive-files --source inventory-hr
# Dry run — hash only, no attestation
census scan /path/to/files --dry-run
# Scan only PDFs and Word docs, skip files over 100 MB
census scan /data --include "*.pdf" --include "*.docx" --max-size 100M
# Resume an interrupted scan
census scan /data --source quarterly --manifest inventory.db --resume
This produces a .census-manifest.db (SQLite) mapping each hash to its file path, size, and attestation metadata.
2. Breach comparison
# Compare suspect files against the CertiSigma registry
census compare /path/to/suspect-files --manifest /path/to/.census-manifest.db
# Save report as JSON or CSV
census compare /suspect --output report.json
census compare /suspect --output report.csv
Exit code: 0 if no matches, 1 if matches found.
3. Manifest status and export
# Show summary
census status /path/to/.census-manifest.db
# Export manifest as CSV for compliance reporting
census export manifest.db --format csv --output inventory.csv
# Export as JSON
census export manifest.db --format json --output inventory.json
4. Evidence verification
# Verify a hash against the CertiSigma registry
census verify a1b2c3d4e5f67890...
# Verify a file (hash it first, then check)
census verify /path/to/document.pdf --file
# Save OpenTimestamps proof
census verify a1b2c3... --save-ots proof.ots
No API key required — all verification endpoints are public.
5. Integrity check
# Check files against manifest baseline
census integrity manifest.db
# Strict mode: exit 1 on any discrepancy
census integrity manifest.db --strict
100% local operation — no API calls, no network needed.
6. Forensic reports
# HTML report (always available, zero dependencies)
census report manifest.db -o report.html
# PDF report (requires: pip install certisigma-census[report])
census report manifest.db -o report.pdf --evidence --integrity
# Evidence bundle: ZIP with report + OTS proofs + checksums
census report manifest.db -o bundle.zip --bundle --evidence
7. Manifest diff
# Compare two manifests
census diff baseline.db current.db
# HTML diff report
census diff baseline.db current.db -o diff.html
# Machine-readable (exit codes: 0=none, 1=added, 2=removed, 4=modified)
census diff baseline.db current.db --json
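The added/removed/modified categories can be illustrated with a small Python sketch. It assumes each manifest can be reduced to a `{path: sha256}` mapping and that files are matched by path; both are assumptions made for illustration, and `diff_manifests` is a hypothetical helper, not part of Census.

```python
def diff_manifests(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths by comparing two {path: sha256} mappings (assumed shape)."""
    shared = baseline.keys() & current.keys()
    return {
        # Present only in the newer manifest.
        "added": sorted(current.keys() - baseline.keys()),
        # Present only in the older manifest.
        "removed": sorted(baseline.keys() - current.keys()),
        # Present in both, but the recorded hash differs.
        "modified": sorted(p for p in shared if baseline[p] != current[p]),
    }
```

Matching by path means a renamed file would show up as one removal plus one addition; a tool that matches by hash instead would report it differently.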
8. Standalone hashing
# Hash a file
census hash document.pdf
# Hash a directory
census hash /path/to/files
# Verify against known hash
census hash document.pdf --verify a1b2c3d4e5...
9. Attestation tracking
# Check attestation status
census track att_12345
# Wait for Bitcoin anchoring
census track att_12345 --poll --timeout 7200
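Polling with an interval and a timeout can be pictured as the loop below. This is a generic sketch, not Census's implementation: `fetch_level` is a hypothetical callable standing in for whatever status lookup the tool performs, and the level strings are assumed to be "T0", "T1", and "T2". The default interval (60 s) and timeout (3600 s) mirror the values listed in the CLI reference for census track.

```python
import time
from typing import Callable


def poll_until(
    fetch_level: Callable[[], str],   # hypothetical status lookup
    target: str = "T2",
    interval: float = 60.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True once fetch_level() equals target, False if the deadline would pass first."""
    deadline = clock() + timeout
    while True:
        if fetch_level() == target:
            return True
        # Give up rather than sleep past the deadline.
        if clock() + interval > deadline:
            return False
        sleep(interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting.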
10. Self-diagnostic
# Run all health checks
census doctor
# Check including a specific manifest
census doctor --manifest inventory.db
# Machine-readable output for CI
census doctor --json
11. Manifest merging
# Merge manifests from different servers
census merge server1.db server2.db -o combined.db
# Merge with glob
census merge scans/*.db -o full-inventory.db --json
12. Audit log
# View all operations
census audit-log show
# Verify hash chain integrity
census audit-log verify
# Machine-readable
census audit-log show --last 10 --json
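The Features table describes the audit log as tamper-evident JSONL with a SHA-256 hash chain. The sketch below shows how such a chain can work in general. The record layout (`prev_hash`, `payload`, `hash`), the all-zero genesis value, and the canonical JSON encoding are assumptions made for illustration; the actual Census log format may differ.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the very first record


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(lines: list[str], payload: dict) -> None:
    """Append one JSONL record that commits to the record before it."""
    prev = json.loads(lines[-1])["hash"] if lines else GENESIS
    record = {"prev_hash": prev, "payload": payload, "hash": entry_hash(prev, payload)}
    lines.append(json.dumps(record, sort_keys=True))


def verify(lines: list[str]) -> bool:
    """Recompute every link in order; an edited or reordered record, or one
    removed from before the end, makes a later comparison fail."""
    prev = GENESIS
    for line in lines:
        record = json.loads(line)
        if record["prev_hash"] != prev:
            return False
        if record["hash"] != entry_hash(prev, record["payload"]):
            return False
        prev = record["hash"]
    return True
```

A plain chain like this does not detect truncation of the newest records; detecting that requires storing the latest hash somewhere outside the log.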
13. Named snapshots
# Create a compliance baseline
census snapshot create q1-baseline --manifest inventory.db
# List snapshots
census snapshot list
# Compare two snapshots
census snapshot diff q1-baseline q2-baseline
14. Forensic annotation
# Annotate an attestation with case metadata
census annotate att_123 --note "Evidence for case FR-2026-42" --tag "case-2026-001"
# Zero-knowledge mode: encrypt before sending
census annotate att_123 --note "Confidential" --encrypt --encryption-key <key>
# GDPR right-to-erasure
census annotate att_123 --delete
15. Configuration
# Create config template
census config init --project
# View effective config
census config show
# Enable shell completions
eval "$(census completion bash)"
16. Forensic share tokens
# Create a share token (chain of custody)
census share create <att_id> --expires 24h --recipient "Legal Dept" --max-uses 5
# List / inspect / revoke
census share list --json
census share info <token_id>
census share revoke <token_id>
17. Structured tagging
# Tag attestations for classification
census tag set <att_id> -t department=legal -t case=2026-001
# Encrypted tags (zero-knowledge)
census tag set <att_id> -t classification=confidential --encrypt
# Query by tags (AND logic, cursor pagination)
census tag query -f department=legal --limit 50 --json
18. Key rotation
# Rotate encryption key (NIST SP 800-57)
census key-rotate <att_id> --old-key <hex64> --new-key <hex64>
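The key placeholders above stand for 64-character hexadecimal values, which the CLI reference describes as AES-256 keys. One way to produce and sanity-check such a value with the Python standard library is sketched here; `new_hex_key` and `is_hex64` are hypothetical helper names, and Census may expect or provide a different key-management workflow.

```python
import re
import secrets

_HEX64 = re.compile(r"[0-9a-fA-F]{64}")


def new_hex_key() -> str:
    """Return 32 random bytes (256 bits) from the OS CSPRNG as 64 hex characters."""
    return secrets.token_hex(32)


def is_hex64(value: str) -> bool:
    """Check that a string has the 64-hex-character shape before using it as a key."""
    return _HEX64.fullmatch(value) is not None
```

Keys passed on the command line can end up in shell history, so storing them in a secrets manager and injecting them at run time is usually the safer habit.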
19. Derived lists (third-party breach detection)
# Create an opaque HMAC-SHA256 derived list from your manifest
census derived-list create --manifest ./manifest.db --label "Q1 2026"
# Third party matches their suspects (server never sees plaintext)
census derived-list match <list_id> --list-key <hex64> --hashes-file suspects.txt
# Audit trail
census derived-list access-log <list_id>
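The idea behind an HMAC-SHA256 derived list can be sketched as follows: the list holds keyed digests of the inventoried hashes, and only a party holding the list key can turn a suspect hash into a comparable value. This is an illustration of the general technique, not the CertiSigma protocol; in particular, feeding the raw 32-byte SHA-256 digest to HMAC (rather than its hex text) is an assumption, and all function names are hypothetical.

```python
import hashlib
import hmac


def derive(list_key_hex: str, sha256_hex: str) -> str:
    """HMAC-SHA256 of a file hash under the list key (message encoding is assumed)."""
    key = bytes.fromhex(list_key_hex)
    message = bytes.fromhex(sha256_hex)
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def build_derived_list(list_key_hex: str, manifest_hashes: list[str]) -> set[str]:
    """Derived values only; the original hashes cannot be read back without the key."""
    return {derive(list_key_hex, h) for h in manifest_hashes}


def match(list_key_hex: str, derived: set[str], suspects: list[str]) -> list[str]:
    """Return the suspect hashes whose derived value appears in the list."""
    return [h for h in suspects if derive(list_key_hex, h) in derived]
```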
20. Metadata read
census metadata get <att_id> --json
census metadata get <att_id> --decrypt --encryption-key <hex64>
21. Watch mode (continuous monitoring)
# Watch a directory for changes and attest new/modified files
census watch /path/to/files --source "production"
# Dry run — hash only, no attestation
census watch /data --dry-run
# Network mount — use polling
census watch /mnt/share --polling --poll-interval 10
Requires: pip install "certisigma-census[watch]"
How It Works
- Scan — Census walks the directory, computes SHA-256 for each file (streamed, constant memory), and builds a local manifest.
- Attest — Hashes are sent in batches (up to 100 per call) to the CertiSigma API. Each hash receives a three-layer cryptographic proof (T0 ECDSA signature, T1 qualified TSA timestamp, T2 Bitcoin anchor).
- Compare — Suspect files are hashed and verified against the registry via POST /verify/batch. Matches prove the file was previously inventoried, regardless of filename or directory structure changes.
The original file content never leaves the client. Only SHA-256 hashes are transmitted.
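The scan and attest steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Census's actual code: `sha256_file`, `batched`, and the 1 MiB chunk size are hypothetical; only the use of SHA-256, streamed hashing, and the 100-hash batch limit come from the description above.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Iterator

CHUNK_SIZE = 1024 * 1024  # assumed read size; any fixed size keeps memory constant
BATCH_LIMIT = 100         # "up to 100 per call", per the description above


def sha256_file(path: Path) -> str:
    """Hash a file in fixed-size chunks so memory use does not grow with file size."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def batched(hashes: Iterable[str], size: int = BATCH_LIMIT) -> Iterator[list[str]]:
    """Yield lists of at most `size` hashes, one list per would-be API call."""
    batch: list[str] = []
    for h in hashes:
        batch.append(h)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Only the hex digests returned by `sha256_file` would ever be handed to a network call, which is what keeps file content on the client.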
Features
| Feature | Description | Docs |
|---|---|---|
| File filters | `--include`, `--exclude` globs; `--min-size`, `--max-size` | scanning.md |
| Resume scans | `--resume` skips unchanged files, preserves attestation state | scanning.md |
| CSV/JSON export | Compare reports and manifest export in both formats | comparison.md |
| Retry with backoff | Automatic retry on 429/5xx with exponential backoff | retry-and-resilience.md |
| Structured logging | `--log-format json` for SIEM/ELK integration | logging.md |
| Progress bars | Visual feedback for scan, attest, and compare operations | scanning.md |
| SQLite manifest | WAL mode, indexed lookups, auto-migration from JSON | manifest.md |
| Watch mode | Continuous filesystem monitoring with batch attestation | watching.md |
| Evidence verification | Full T0/T1/T2 chain, OTS proof export | evidence.md |
| Integrity check | Tamper detection against manifest baseline | integrity.md |
| Forensic reports | HTML, PDF, evidence bundles (ZIP) | reporting.md |
| Manifest diff | Compare snapshots, AIDE-style exit codes, HTML reports | diff.md |
| Standalone hashing | SHA-256 without manifests or API calls | hash.md |
| Attestation tracking | Monitor T0/T1/T2 progression with `--poll` | tracking.md |
| Config files | TOML config with user/project precedence | config.md |
| Shell completions | bash, zsh, fish via `census completion` | — |
| Self-diagnostic | API health, config, inotify, manifest integrity | doctor.md |
| Manifest merging | Combine manifests from distributed scans | merge.md |
| JSON output | `--json` on scan, compare, status, doctor, merge | — |
| Audit log | Tamper-evident JSONL with SHA-256 hash chain | audit-log.md |
| Named snapshots | Compliance baselines with diff comparison | snapshots.md |
| Forensic annotation | Metadata, tags, case IDs on attestations | annotate.md |
| Zero-knowledge encryption | AES-256-GCM client-side metadata encryption | annotate.md |
| Forensic sharing | Time-limited, use-limited share tokens (chain of custody) | sharing.md |
| Structured tagging | Key-value classification with encrypted tags and query | tagging.md |
| Key rotation | NIST SP 800-57 AES-256 key rotation for metadata + tags | key-rotation.md |
| Derived lists | HMAC-SHA256 opaque third-party breach detection | derived-lists.md |
| Metadata read | Read attestation metadata with optional decryption | — |
Full documentation: docs/features/
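The "Retry with backoff" row in the table above mentions automatic retry on 429/5xx responses with exponential backoff. A minimal sketch of that pattern follows. The base delay, cap, and attempt limit are assumed values, `send` is a hypothetical callable returning an HTTP status code, and the real client may add jitter or honour Retry-After headers.

```python
import time
from typing import Callable


def should_retry(status: int) -> bool:
    """Retry on rate limiting (429) and on server errors (5xx)."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay doubles with each attempt (base, 2*base, 4*base, ...) up to a cap."""
    return min(cap, base * (2 ** attempt))


def call_with_retry(
    send: Callable[[], int],          # hypothetical request function
    max_attempts: int = 5,            # assumed limit
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-retryable status or attempts run out."""
    status = send()
    for attempt in range(max_attempts - 1):
        if not should_retry(status):
            break
        sleep(backoff_delay(attempt))
        status = send()
    return status
```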
CLI Reference
Global options
| Option | Description |
|---|---|
| `-v` / `--verbose` | Enable debug logging |
| `--log-format text\|json` | Log output format (default: text) |
| `--version` | Show version |
census scan
| Option | Description |
|---|---|
| `--source LABEL` | Source label for attestations |
| `--manifest PATH` | Manifest output path (default: `<dir>/.census-manifest.db`) |
| `--api-key KEY` | API key (or set `CERTISIGMA_API_KEY`) |
| `--base-url URL` | Override API base URL |
| `--dry-run` | Hash only, no attestation |
| `--resume` | Resume interrupted scan |
| `--include GLOB` | Include files matching pattern (repeatable) |
| `--exclude GLOB` | Exclude files matching pattern (repeatable) |
| `--min-size SIZE` | Skip files smaller than SIZE (e.g. 1K, 10M) |
| `--max-size SIZE` | Skip files larger than SIZE (default: 5G) |
| `--json` | Machine-readable JSON summary |
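SIZE values such as 1K, 10M, and 5G can be converted to byte counts with a small parser. The sketch below assumes binary multiples (1K = 1024 bytes) and accepts only the K, M, and G suffixes; the documentation does not state which convention Census uses, so treat both choices, and the `parse_size` name, as assumptions.

```python
import re

# Binary multiples are an assumption; decimal (1000-based) units are equally plausible.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
_SIZE_RE = re.compile(r"(\d+)([KMG]?)", re.IGNORECASE)


def parse_size(text: str) -> int:
    """Convert strings like '512', '1K', '10M', or '5G' to a number of bytes."""
    match = _SIZE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix.upper()]
```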
census compare
| Option | Description |
|---|---|
| `--manifest PATH` | Local manifest for cross-referencing |
| `--output PATH` | Save report (.json or .csv by extension) |
| `--include/--exclude/--min-size/--max-size` | Same filters as scan |
| `--json` | Machine-readable JSON output (stdout only; `--output` ignored; see stderr note) |
census export
| Option | Description |
|---|---|
| `--format csv\|json` | Output format (default: csv) |
| `--output PATH` | Output file (default: stdout) |
census verify
| Option | Description |
|---|---|
| `--file` | Treat argument as a file path (hash it first) |
| `--save-ots PATH` | Save OTS proof to this path |
| `--json` | Machine-readable JSON output |
| `--api-key KEY` | API key (optional for verify) |
| `--base-url URL` | Override API base URL |
census integrity
| Option | Description |
|---|---|
| `--json` | Machine-readable JSON output |
| `--output PATH` | Save results (.csv or .json by extension) |
| `--strict` | Exit with code 1 on any discrepancy |
census report
| Option | Description |
|---|---|
| `-o/--output PATH` | Output file (.html, .pdf, or .zip) (required) |
| `--evidence` | Fetch T0/T1/T2 evidence chain for attested files |
| `--integrity` | Run integrity check and include results |
| `--bundle` | Generate evidence bundle (ZIP) |
| `--api-key KEY` | API key (needed only with `--evidence`) |
census status
| Option | Description |
|---|---|
| `--json` | Machine-readable JSON output |
census doctor
| Option | Description |
|---|---|
| `--manifest PATH` | Check health of a specific manifest file |
| `--json` | Machine-readable JSON output |
| `--api-key KEY` | API key |
| `--base-url URL` | Override API base URL |
census merge
| Option | Description |
|---|---|
| `-o/--output PATH` | Output manifest path (required) |
| `--json` | Machine-readable JSON summary |
census diff
| Option | Description |
|---|---|
| `--json` | Machine-readable JSON output |
| `-o/--output PATH` | Save report (.html, .csv, or .json by extension) |
| `--summary` | Show only counts, no individual file details |
Exit codes: 0=none, 1=added, 2=removed, 4=modified (bitmask, OR'd together).
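Because the exit code is a bitmask, a CI job can decode it with bitwise AND. The sketch below uses the bit values stated above; the function names are hypothetical, and it does not account for any other exit codes the CLI might return on errors.

```python
# Bit values from the exit-code line above.
DIFF_FLAGS = {1: "added", 2: "removed", 4: "modified"}


def encode_diff_exit(added: bool, removed: bool, modified: bool) -> int:
    """OR the individual bits together into one exit code."""
    return (1 if added else 0) | (2 if removed else 0) | (4 if modified else 0)


def decode_diff_exit(code: int) -> list[str]:
    """Return the change types whose bit is set in the exit code."""
    return [name for bit, name in DIFF_FLAGS.items() if code & bit]
```

For example, an exit code of 5 decodes to added and modified, and 0 decodes to an empty list (no differences). In a script, the code would typically come from `subprocess.run([...]).returncode`.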
census hash
| Option | Description |
|---|---|
| `--verify HASH` | Compare computed hash against expected SHA-256 |
| `--json` | Output as JSON array |
census track
| Option | Description |
|---|---|
| `--poll` | Continuously check until T2 level reached |
| `--poll-interval SECS` | Seconds between checks (default: 60) |
| `--timeout SECS` | Max time to poll (default: 3600) |
| `--json` | Machine-readable JSON output |
| `--api-key KEY` | API key |
| `--base-url URL` | Override API base URL |
census config
| Action | Description |
|---|---|
| `show` | Display effective merged config |
| `init` | Create a template config file |
| `paths` | Show config file locations |
| `--project` | Act on project `.census.toml` |
census audit-log
| Action | Description |
|---|---|
| `show` | Display audit log entries |
| `verify` | Check hash chain integrity |
| `clear` | Delete the audit log file |
| `--log-path PATH` | Override audit log file path |
| `--last N` | Show only last N entries (with `show`) |
| `--json` | Machine-readable JSON output |
census snapshot
| Action | Description |
|---|---|
| `create <name>` | Save a named snapshot of a manifest |
| `list` | List all snapshots |
| `diff <name1> <name2>` | Compare two snapshots |
| `delete <name>` | Remove a snapshot |
| `--manifest PATH` | Manifest to snapshot (required for `create`) |
| `--snapshot-dir PATH` | Override snapshot directory |
| `--json` | Machine-readable JSON output |
census annotate
| Option | Description |
|---|---|
| `--note TEXT` | Free-text note |
| `--tag TEXT` | Tag label (e.g. case number) |
| `--case-id TEXT` | Forensic case identifier |
| `--source TEXT` | Update source label |
| `--delete` | Soft-delete metadata (GDPR) |
| `--encrypt` | Encrypt client-side (AES-256-GCM) |
| `--encryption-key HEX` | 64-char hex AES-256 key |
| `--decrypt` | Decrypt and display stored metadata |
| `--json` | Machine-readable JSON output |
| `--api-key KEY` | API key |
census share
| Action / Option | Description |
|---|---|
| `create <att_id>...` | Create share token for attestation(s) |
| `list` | List all share tokens |
| `info <token_id>` | Inspect a specific token |
| `revoke <token_id>` | Revoke a token |
| `--expires DURATION` | Token lifetime: 30m, 24h, 7d (default: 24h) |
| `--recipient TEXT` | Recipient label |
| `--max-uses N` | Max usage count |
| `--json` | Machine-readable JSON output |
census tag
| Action / Option | Description |
|---|---|
| `set <att_id>` | Set tags (requires `-t key=value`) |
| `get <att_id>` | List tags on an attestation |
| `delete <att_id> <key>` | Delete a specific tag |
| `query` | Query attestations by tag filter |
| `-t, --tag key=value` | Tag pair (repeatable) |
| `-f, --filter key=value` | Query filter (repeatable, AND logic) |
| `--encrypt` | Encrypt tag values (AES-256-GCM) |
| `--decrypt` | Decrypt on `get` |
| `--limit N` | Max query results (default: 100) |
| `--cursor TOKEN` | Pagination cursor |
| `--json` | Machine-readable JSON output |
census key-rotate
| Option | Description |
|---|---|
| `<attestation_id>` | Target attestation |
| `--old-key HEX` | Current 64-char hex AES-256 key |
| `--new-key HEX` | New 64-char hex AES-256 key |
| `--json` | Machine-readable JSON output |
census derived-list
| Action / Option | Description |
|---|---|
| `create` | Create HMAC-SHA256 derived list |
| `list` | List all derived lists |
| `info <list_id>` | Get list details |
| `match <list_id>` | Match suspect hashes against list |
| `access-log <list_id>` | View access audit trail |
| `revoke <list_id>` | Revoke a list |
| `--manifest PATH` | Manifest to read hashes from |
| `--tag-filter JSON` | JSON tag filter for server-side selection |
| `--label TEXT` | Human-readable label |
| `--expires HOURS` | Expiry in hours (max 2160) |
| `--list-key HEX` | HMAC key (64 hex chars) for `match` |
| `--hashes-file PATH` | File with one hash per line for `match` |
| `--json` | Machine-readable JSON output |
census metadata
| Action / Option | Description |
|---|---|
| `get <att_id>` | Read attestation metadata |
| `--decrypt` | Decrypt encrypted extra_data |
| `--encryption-key HEX` | 64-char hex AES-256 key |
| `--json` | Machine-readable JSON output |
census completion
Takes a shell name: bash, zsh, or fish.
eval "$(census completion bash)" # bash
eval "$(census completion zsh)" # zsh
census completion fish | source # fish
census watch
| Option | Description |
|---|---|
| `--debounce SECS` | Quiet period before processing (default: 2.0s) |
| `--batch-interval SECS` | Max time between attestation batches (default: 30s) |
| `--scan-on-start` / `--no-scan-on-start` | Baseline scan before watching (default: on) |
| `--on-delete ignore\|mark\|remove` | Action on file deletion (default: ignore) |
| `--polling` | Use PollingObserver for NFS/CIFS mounts |
| `--poll-interval SECS` | Polling interval (default: 5s) |
| `--source/--manifest/--api-key/--dry-run` | Same as census scan |
| `--include/--exclude/--min-size/--max-size` | Same filters as scan |
Requires: pip install "certisigma-census[watch]"
Dependencies
- certisigma: Official CertiSigma Python SDK
- click: CLI framework

Optional:
- watchdog: Filesystem monitoring (only for census watch)
- fpdf2: PDF report generation (only for census report with .pdf output)
License
MIT — Ten Sigma Sagl