CertiSigma Census

Cryptographic file inventory and exfiltration detection — powered by CertiSigma.

Census scans directories, computes SHA-256 hashes, attests them via the CertiSigma API (three-layer cryptographic proof: ECDSA T0, qualified TSA T1, Bitcoin T2), and maintains a local manifest. When suspect files surface, Census compares their hashes against the registry to prove — with cryptographic certainty — whether they match inventoried assets.

Installation

pip install certisigma-census

# With watch mode (filesystem monitoring)
pip install certisigma-census[watch]

# With PDF report generation
pip install certisigma-census[report]

# Everything
pip install "certisigma-census[watch,report]"

Requires Python 3.10+. TOML config support on Python 3.10 uses tomli (auto-installed).

Quick Start

1. Inventory scan

export CERTISIGMA_API_KEY=cs_...

# Scan a directory and attest all file hashes
census scan /path/to/sensitive-files --source inventory-hr

# Dry run — hash only, no attestation
census scan /path/to/files --dry-run

# Scan only PDFs and Word docs, skip files over 100 MB
census scan /data --include "*.pdf" --include "*.docx" --max-size 100M

# Resume an interrupted scan
census scan /data --source quarterly --manifest inventory.db --resume

By default, a scan writes .census-manifest.db (SQLite) inside the scanned directory; pass --manifest to choose another path. The manifest maps each hash to its file path, size, and attestation metadata.
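The manifest schema is not documented in this README. As a rough illustration of how a SQLite file can map hashes to paths, sizes, and attestation metadata, the sketch below uses a hypothetical `files` table. The table name, column names, and helper functions are assumptions made for the example, not Census's actual layout.

```python
import sqlite3

# Hypothetical schema: Census's real table and column names may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    path           TEXT PRIMARY KEY,
    sha256         TEXT NOT NULL,
    size           INTEGER NOT NULL,
    attestation_id TEXT
);
CREATE INDEX IF NOT EXISTS idx_files_sha256 ON files (sha256);
"""


def open_manifest(db_path):
    """Open (or create) a manifest-like SQLite database."""
    conn = sqlite3.connect(db_path)
    # WAL only applies to on-disk databases; ":memory:" keeps its own mode.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.executescript(SCHEMA)
    return conn


def record_file(conn, sha256, path, size, attestation_id=None):
    """Insert or update the entry for one file path."""
    conn.execute(
        "INSERT OR REPLACE INTO files (path, sha256, size, attestation_id) "
        "VALUES (?, ?, ?, ?)",
        (path, sha256, size, attestation_id),
    )
    conn.commit()


def paths_for_hash(conn, sha256):
    """Return every recorded path whose content hash equals sha256."""
    rows = conn.execute(
        "SELECT path FROM files WHERE sha256 = ? ORDER BY path", (sha256,)
    )
    return [row[0] for row in rows]
```

Indexing the hash column is what makes a lookup by digest cheap when suspect files arrive under different names.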

2. Breach comparison

# Compare suspect files against the CertiSigma registry
census compare /path/to/suspect-files --manifest /path/to/.census-manifest.db

# Save report as JSON or CSV
census compare /suspect --output report.json
census compare /suspect --output report.csv

Exit code: 0 if no matches, 1 if matches found.

3. Manifest status and export

# Show summary
census status /path/to/.census-manifest.db

# Export manifest as CSV for compliance reporting
census export manifest.db --format csv --output inventory.csv

# Export as JSON
census export manifest.db --format json --output inventory.json

4. Evidence verification

# Verify a hash against the CertiSigma registry
census verify a1b2c3d4e5f67890...

# Verify a file (hash it first, then check)
census verify /path/to/document.pdf --file

# Save OpenTimestamps proof
census verify a1b2c3... --save-ots proof.ots

No API key required — all verification endpoints are public.

5. Integrity check

# Check files against manifest baseline
census integrity manifest.db

# Strict mode: exit 1 on any discrepancy
census integrity manifest.db --strict

100% local operation — no API calls, no network needed.
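The idea behind a local integrity check can be sketched in a few lines of Python: re-hash each file and compare the digest with the recorded baseline. The `baseline` dictionary and the result categories `ok`, `modified`, and `missing` are assumptions for illustration; the README does not specify how Census names or reports discrepancies.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_integrity(baseline):
    """Compare files on disk with a baseline of {path: expected SHA-256 hex}.

    Returns a dict with sorted path lists under 'ok', 'modified', 'missing'.
    """
    result = {"ok": [], "modified": [], "missing": []}
    for path, expected in sorted(baseline.items()):
        if not Path(path).is_file():
            result["missing"].append(path)
        elif sha256_file(path) == expected.lower():
            result["ok"].append(path)
        else:
            result["modified"].append(path)
    return result
```

Nothing here touches the network, which matches the local-only behaviour described above.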

6. Forensic reports

# HTML report (always available, zero dependencies)
census report manifest.db -o report.html

# PDF report (requires: pip install certisigma-census[report])
census report manifest.db -o report.pdf --evidence --integrity

# Evidence bundle: ZIP with report + OTS proofs + checksums
census report manifest.db -o bundle.zip --bundle --evidence

7. Manifest diff

# Compare two manifests
census diff baseline.db current.db

# HTML diff report
census diff baseline.db current.db -o diff.html

# Machine-readable (exit code is a bitmask: 0=no changes, 1=added, 2=removed, 4=modified)
census diff baseline.db current.db --json

8. Standalone hashing

# Hash a file
census hash document.pdf

# Hash a directory
census hash /path/to/files

# Verify against known hash
census hash document.pdf --verify a1b2c3d4e5...

9. Attestation tracking

# Check attestation status
census track att_12345

# Wait for Bitcoin anchoring
census track att_12345 --poll --timeout 7200

10. Self-diagnostic

# Run all health checks
census doctor

# Check including a specific manifest
census doctor --manifest inventory.db

# Machine-readable output for CI
census doctor --json

11. Manifest merging

# Merge manifests from different servers
census merge server1.db server2.db -o combined.db

# Merge with glob
census merge scans/*.db -o full-inventory.db --json

12. Audit log

# View all operations
census audit-log show

# Verify hash chain integrity
census audit-log verify

# Machine-readable
census audit-log show --last 10 --json
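To show why a hash-chained log is tamper-evident, here is a minimal Python sketch of the general technique. The record fields (`prev_hash`, `payload`, `hash`), the all-zero genesis value, and the canonical JSON encoding are assumptions chosen for the example; the actual Census log format is not described in this README.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash of the very first entry


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def append_entry(lines, payload):
    """Append one JSONL record that links back to the last record."""
    prev_hash = json.loads(lines[-1])["hash"] if lines else GENESIS
    record = {
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": entry_hash(prev_hash, payload),
    }
    lines.append(json.dumps(record, sort_keys=True))


def verify_chain(lines):
    """Return the index of the first broken entry, or None if the chain holds."""
    prev_hash = GENESIS
    for index, line in enumerate(lines):
        record = json.loads(line)
        if record["prev_hash"] != prev_hash:
            return index
        if record["hash"] != entry_hash(prev_hash, record["payload"]):
            return index
        prev_hash = record["hash"]
    return None
```

Because each hash covers the previous one, editing or dropping an entry invalidates the chain from that entry onward.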

13. Named snapshots

# Create a compliance baseline
census snapshot create q1-baseline --manifest inventory.db

# List snapshots
census snapshot list

# Compare two snapshots
census snapshot diff q1-baseline q2-baseline

14. Forensic annotation

# Annotate an attestation with case metadata
census annotate att_123 --note "Evidence for case FR-2026-42" --tag "case-2026-001"

# Zero-knowledge mode: encrypt before sending
census annotate att_123 --note "Confidential" --encrypt --encryption-key <key>

# GDPR right-to-erasure
census annotate att_123 --delete

15. Configuration

# Create config template
census config init --project

# View effective config
census config show

# Enable shell completions
eval "$(census completion bash)"

16. Watch mode (continuous monitoring)

# Watch a directory for changes and attest new/modified files
census watch /path/to/files --source "production"

# Dry run — hash only, no attestation
census watch /data --dry-run

# Network mount — use polling
census watch /mnt/share --polling --poll-interval 10

Requires: pip install certisigma-census[watch]

How It Works

  1. Scan — Census walks the directory, computes SHA-256 for each file (streamed, constant memory), and builds a local manifest.
  2. Attest — Hashes are sent in batches (up to 100 per call) to the CertiSigma API. Each hash receives a three-layer cryptographic proof (T0 ECDSA signature, T1 qualified TSA timestamp, T2 Bitcoin anchor).
  3. Compare — Suspect files are hashed and verified against the registry via POST /verify/batch. Matches prove the file was previously inventoried, regardless of filename or directory structure changes.

The original file content never leaves the client. Only SHA-256 hashes are transmitted.
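The three steps above can be sketched in Python. This is an illustration of the technique, not Census's code: the function names are invented, the comparison is done against an in-memory set rather than through the CertiSigma API, and only the batch size of 100 comes from the description above.

```python
import hashlib
from itertools import islice

BATCH_SIZE = 100  # "up to 100 per call", per the Attest step


def sha256_stream(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so memory use stays constant."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def batched(hashes, size=BATCH_SIZE):
    """Yield successive lists of at most `size` hashes."""
    iterator = iter(hashes)
    while batch := list(islice(iterator, size)):
        yield batch


def find_matches(suspect_hashes, inventoried_hashes):
    """Return suspect paths whose digest is inventoried, whatever their name.

    suspect_hashes maps path -> SHA-256 hex digest.
    """
    known = set(inventoried_hashes)
    return sorted(path for path, h in suspect_hashes.items() if h in known)
```

Matching is done on digests alone, which is why renaming or moving a file does not hide it.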

Features

Feature                    Description                                                  Docs
File filters               --include, --exclude globs; --min-size, --max-size           scanning.md
Resume scans               --resume skips unchanged files, preserves attestation state  scanning.md
CSV/JSON export            Compare reports and manifest export in both formats          comparison.md
Retry with backoff         Automatic retry on 429/5xx with exponential backoff          retry-and-resilience.md
Structured logging         --log-format json for SIEM/ELK integration                   logging.md
Progress bars              Visual feedback for scan, attest, and compare operations     scanning.md
SQLite manifest            WAL mode, indexed lookups, auto-migration from JSON          manifest.md
Watch mode                 Continuous filesystem monitoring with batch attestation      watching.md
Evidence verification      Full T0/T1/T2 chain, OTS proof export                        evidence.md
Integrity check            Tamper detection against manifest baseline                   integrity.md
Forensic reports           HTML, PDF, evidence bundles (ZIP)                            reporting.md
Manifest diff              Compare snapshots, AIDE-style exit codes, HTML reports       diff.md
Standalone hashing         SHA-256 without manifests or API calls                       hash.md
Attestation tracking       Monitor T0/T1/T2 progression with --poll                     tracking.md
Config files               TOML config with user/project precedence                     config.md
Shell completions          bash, zsh, fish via census completion
Self-diagnostic            API health, config, inotify, manifest integrity              doctor.md
Manifest merging           Combine manifests from distributed scans                     merge.md
JSON output                --json on scan, compare, status, doctor, merge
Audit log                  Tamper-evident JSONL with SHA-256 hash chain                 audit-log.md
Named snapshots            Compliance baselines with diff comparison                    snapshots.md
Forensic annotation        Metadata, tags, case IDs on attestations                     annotate.md
Zero-knowledge encryption  AES-256-GCM client-side metadata encryption                  annotate.md

Full documentation: docs/features/
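The "Retry with backoff" feature refers to a common pattern: retry on HTTP 429 and 5xx responses, doubling the wait each time. The sketch below shows that pattern in general terms. The `TransientError` class, the delay values, and the attempt limit are assumptions for illustration; Census's actual retry parameters are documented in retry-and-resilience.md, not here.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(status):
    """429 (rate limited) and 5xx (server error) are worth retrying."""
    return status == 429 or 500 <= status <= 599


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep, jitter=random.random):
    """Call func(), retrying retryable failures with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError as error:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(error.status):
                raise
            # Delay doubles each attempt, capped at max_delay, plus jitter
            # so that many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + jitter() * base_delay)
```

The `sleep` and `jitter` parameters are injectable only so the behaviour can be checked without waiting.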

CLI Reference

Global options

Option                  Description
-v / --verbose          Enable debug logging
--log-format text|json  Log output format (default: text)
--version               Show version

census scan

Option           Description
--source LABEL   Source label for attestations
--manifest PATH  Manifest output path (default: <dir>/.census-manifest.db)
--api-key KEY    API key (or set CERTISIGMA_API_KEY)
--base-url URL   Override API base URL
--dry-run        Hash only, no attestation
--resume         Resume interrupted scan
--include GLOB   Include files matching pattern (repeatable)
--exclude GLOB   Exclude files matching pattern (repeatable)
--min-size SIZE  Skip files smaller than SIZE (e.g. 1K, 10M)
--max-size SIZE  Skip files larger than SIZE (default: 5G)
--json           Machine-readable JSON summary
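How the glob and size filters combine can be sketched as follows. This is an assumption-laden illustration: the README does not say whether a suffix such as 1K means 1000 or 1024 bytes (1024 is assumed here), nor whether globs are matched against the file name or the full path (the file name is assumed). The function names are invented for the example.

```python
import re
from fnmatch import fnmatchcase

# Assumption: K/M/G are powers of 1024. Census may use powers of 1000.
UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '1K', '10M' or '5G' to a byte count."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def is_selected(name, size, include=(), exclude=(), min_size=0, max_size=None):
    """Decide whether a file passes the include/exclude and size filters."""
    if include and not any(fnmatchcase(name, p) for p in include):
        return False  # include patterns given, none matched
    if any(fnmatchcase(name, p) for p in exclude):
        return False
    if size < min_size:
        return False  # "smaller than SIZE" is skipped
    return max_size is None or size <= max_size  # "larger than SIZE" is skipped
```

With these assumptions, `--include "*.pdf" --include "*.docx" --max-size 100M` corresponds to `is_selected(name, size, include=["*.pdf", "*.docx"], max_size=parse_size("100M"))`.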

census compare

Option                                     Description
--manifest PATH                            Local manifest for cross-referencing
--output PATH                              Save report (.json or .csv by extension)
--include/--exclude/--min-size/--max-size  Same filters as scan
--json                                     Machine-readable JSON output (stdout only; --output is ignored, see stderr note)

census export

Option             Description
--format csv|json  Output format (default: csv)
--output PATH      Output file (default: stdout)

census verify

Option           Description
--file           Treat argument as a file path (hash it first)
--save-ots PATH  Save OTS proof to this path
--json           Machine-readable JSON output
--api-key KEY    API key (optional for verify)
--base-url URL   Override API base URL

census integrity

Option         Description
--json         Machine-readable JSON output
--output PATH  Save results (.csv or .json by extension)
--strict       Exit with code 1 on any discrepancy

census report

Option            Description
-o/--output PATH  Output file (.html, .pdf, or .zip); required
--evidence        Fetch T0/T1/T2 evidence chain for attested files
--integrity       Run integrity check and include results
--bundle          Generate evidence bundle (ZIP)
--api-key KEY     API key (needed only with --evidence)

census status

Option  Description
--json  Machine-readable JSON output

census doctor

Option           Description
--manifest PATH  Check health of a specific manifest file
--json           Machine-readable JSON output
--api-key KEY    API key
--base-url URL   Override API base URL

census merge

Option            Description
-o/--output PATH  Output manifest path; required
--json            Machine-readable JSON summary

census diff

Option            Description
--json            Machine-readable JSON output
-o/--output PATH  Save report (.html, .csv, or .json by extension)
--summary         Show only counts, no individual file details

Exit codes: 0=none, 1=added, 2=removed, 4=modified (bitmask, OR'd together).
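Because the exit code is a bitmask, a calling script has to test individual bits rather than compare against a single value. A minimal Python sketch of decoding it follows; the `DiffResult` enum and `describe_exit_code` helper are illustrative names, not part of Census.

```python
from enum import IntFlag


class DiffResult(IntFlag):
    """Bit values as listed above; the class itself is illustrative."""
    NONE = 0
    ADDED = 1
    REMOVED = 2
    MODIFIED = 4


def describe_exit_code(code):
    """Return the change kinds encoded in a `census diff` exit code."""
    flags = DiffResult(code)
    kinds = (DiffResult.ADDED, DiffResult.REMOVED, DiffResult.MODIFIED)
    return [kind.name.lower() for kind in kinds if kind in flags]
```

For example, an exit code of 5 is 1 OR 4, meaning files were both added and modified, so `describe_exit_code(5)` returns `["added", "modified"]`.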

census hash

Option         Description
--verify HASH  Compare computed hash against expected SHA-256
--json         Output as JSON array

census track

Option                Description
--poll                Continuously check until T2 level reached
--poll-interval SECS  Seconds between checks (default: 60)
--timeout SECS        Max time to poll (default: 3600)
--json                Machine-readable JSON output
--api-key KEY         API key
--base-url URL        Override API base URL
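The interaction between --poll-interval and --timeout can be pictured as a simple loop: check, sleep, check again, and give up once the deadline passes. The sketch below assumes a hypothetical `check` callable that returns the current level as a string such as "T0", "T1", or "T2"; how Census actually represents levels or queries the API is not specified here. Only the default values 60 and 3600 come from the table above.

```python
import time


def poll_until(check, target="T2", interval=60.0, timeout=3600.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call check() until it returns `target` or `timeout` seconds elapse.

    Returns the last level seen, so the caller can tell success from timeout.
    """
    deadline = clock() + timeout
    level = check()
    while level != target and clock() < deadline:
        # Never sleep past the deadline.
        sleep(min(interval, max(0.0, deadline - clock())))
        level = check()
    return level
```

A monotonic clock is used so that system clock adjustments during a long wait do not shorten or extend the timeout.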

census config

Action     Description
show       Display effective merged config
init       Create a template config file
paths      Show config file locations
--project  Act on project .census.toml

census audit-log

Action           Description
show             Display audit log entries
verify           Check hash chain integrity
clear            Delete the audit log file
--log-path PATH  Override audit log file path
--last N         Show only last N entries (with show)
--json           Machine-readable JSON output

census snapshot

Action                 Description
create <name>          Save a named snapshot of a manifest
list                   List all snapshots
diff <name1> <name2>   Compare two snapshots
delete <name>          Remove a snapshot
--manifest PATH        Manifest to snapshot (required for create)
--snapshot-dir PATH    Override snapshot directory
--json                 Machine-readable JSON output

census annotate

Option                Description
--note TEXT           Free-text note
--tag TEXT            Tag label (e.g. case number)
--case-id TEXT        Forensic case identifier
--source TEXT         Update source label
--delete              Soft-delete metadata (GDPR)
--encrypt             Encrypt client-side (AES-256-GCM)
--encryption-key HEX  64-char hex AES-256 key
--decrypt             Decrypt and display stored metadata
--json                Machine-readable JSON output
--api-key KEY         API key
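An AES-256 key is 32 bytes, which is 64 characters when written in hexadecimal. One way to produce and sanity-check a value in the shape --encryption-key expects is sketched below using only the Python standard library. The helper names are invented, and the README does not say whether Census offers its own key generator.

```python
import re
import secrets

HEX_KEY = re.compile(r"[0-9a-fA-F]{64}")


def generate_key_hex():
    """Return 32 random bytes (256 bits) as a 64-character hex string."""
    return secrets.token_hex(32)


def parse_key_hex(text):
    """Validate a 64-char hex key and return its 32 raw bytes."""
    if HEX_KEY.fullmatch(text) is None:
        raise ValueError("expected exactly 64 hexadecimal characters")
    return bytes.fromhex(text)
```

Since the encryption is client-side, losing this key presumably means the stored metadata cannot be decrypted, so it should be kept in a secrets manager rather than in shell history.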

census completion

Takes a shell name: bash, zsh, or fish.

eval "$(census completion bash)"   # bash
eval "$(census completion zsh)"    # zsh
census completion fish | source    # fish

census watch

Option                                     Description
--debounce SECS                            Quiet period before processing (default: 2.0s)
--batch-interval SECS                      Max time between attestation batches (default: 30s)
--scan-on-start / --no-scan-on-start       Baseline scan before watching (default: on)
--on-delete ignore|mark|remove             Action on file deletion (default: ignore)
--polling                                  Use PollingObserver for NFS/CIFS mounts
--poll-interval SECS                       Polling interval (default: 5s)
--source/--manifest/--api-key/--dry-run    Same as census scan
--include/--exclude/--min-size/--max-size  Same filters as scan

Requires: pip install certisigma-census[watch]
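The --debounce option describes a quiet period: a file that is still being written generates many events, and processing should wait until they stop. A minimal sketch of that idea follows. The `Debouncer` class is hypothetical and is not how Census or watchdog necessarily implement it; only the 2.0 second default is taken from the table above.

```python
import time


class Debouncer:
    """Collect changed paths and release them once they have gone quiet."""

    def __init__(self, quiet_period=2.0, clock=time.monotonic):
        self.quiet_period = quiet_period
        self.clock = clock
        self._last_event = {}  # path -> time of its most recent event

    def notify(self, path):
        """Record a filesystem event; repeated events restart the wait."""
        self._last_event[path] = self.clock()

    def ready(self):
        """Return (and forget) paths with no events for quiet_period seconds."""
        now = self.clock()
        due = sorted(
            path for path, seen in self._last_event.items()
            if now - seen >= self.quiet_period
        )
        for path in due:
            del self._last_event[path]
        return due
```

A watcher loop would call `notify` from its event handler and periodically hash whatever `ready` returns, so a large file copied in pieces is hashed once rather than on every write.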

Dependencies

Optional:

  • watchdog — Filesystem monitoring (only for census watch)
  • fpdf2 — PDF report generation (only for census report with .pdf output)

License

MIT — Ten Sigma Sagl

Download files

Version 0.7.0. Publisher: publish.yml on massimocavallin/certisigma-census.

Source distribution: certisigma_census-0.7.0.tar.gz (116.3 kB)

  • Uploaded using Trusted Publishing: yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

Algorithm    Hash digest
SHA256       e4aeaeffc497fa98d1e037563189b54afc3ae390dd485de4c29a0e6d2b2a907e
MD5          5ce3c4aebfeb52bc2a24432e0136b6fc
BLAKE2b-256  146e590294e42bdaf72fde89bba542b016964e06c894d1ec9e43f6dcaac87072

Built distribution: certisigma_census-0.7.0-py3-none-any.whl (56.1 kB, Python 3)

Algorithm    Hash digest
SHA256       0a1612685e9f487010cbcaa2799727c7e7ddaa0eba6e688acb819e446e972443
MD5          3b1d29d853773dd2a7031bdd874415fa
BLAKE2b-256  11168238b8f56dec8d1b8b07d41af53acbc1cbe1acc922071558d9dfc910964c
