High-speed liability scanner for attested vs unattested data using .f33 sidecars.
fors33-scanner
High-speed file integrity and baseline scanner. Walks one or more roots, measures data gravity (bytes), and classifies large files as attested or unattested based on sibling sidecar presence (.f33, .sig, .asc, .sha256, .sha512, .md5, .pem). Emits checksum baselines (Hash Filename format), CSV, or JSON for use with fors33-verifier.
Trust model: The scanner is a discovery and liability mapping tool that classifies files by sidecar presence only (an O(1) check per file). It does not validate Ed25519 signatures or cryptographic proof of baselines. For full cryptographic verification, use fors33-verifier.
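The presence-only classification can be illustrated with a short Python sketch. This is not the scanner's actual implementation: the names `find_sidecar` and `classify` are hypothetical, and so is the assumption that a sidecar is named by appending the extension to the full file name (for example data.bin.f33 beside data.bin).

```python
from pathlib import Path
from typing import Optional

# Sidecar extensions listed in the project description.
SIDECAR_EXTS = (".f33", ".sig", ".asc", ".sha256", ".sha512", ".md5", ".pem")


def find_sidecar(path: Path) -> Optional[Path]:
    """Return the first sibling sidecar found for `path`, or None.

    Assumption: a sidecar is named `<file name><ext>`, e.g. `data.bin.f33`.
    """
    for ext in SIDECAR_EXTS:
        candidate = path.with_name(path.name + ext)
        if candidate.is_file():
            return candidate
    return None


def classify(path: Path, threshold_bytes: int) -> Optional[str]:
    """Classify a file as 'attested' or 'unattested'.

    Returns None for files below the size threshold (assumed here to be an
    inclusive lower bound). No signature or hash is checked: presence only.
    """
    if path.stat().st_size < threshold_bytes:
        return None
    return "attested" if find_sidecar(path) else "unattested"
```

Because only the existence of a sibling file is tested, a file with an empty or invalid sidecar would still count as attested in this sketch, which is why cryptographic verification belongs to a separate step.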
For machine parsing, see LLM_CONTEXT.md.
Install
pip install fors33-scanner
Usage
Scan the current directory (default root) with a 1 MB threshold:
fors33-scanner --threshold-mb 1.0
Scan multiple roots:
fors33-scanner --root /var/log --root /data/telemetry --threshold-mb 10
Emit JSON instead of human output (for CI, pipelines):
fors33-scanner --root /data --json
Fail CI/CD when exposure breaches policy threshold:
fors33-scanner --root /data --max-exposure 5.0 --json
Throttle hashing workers for shared runners:
fors33-scanner --root /data --workers 2
Stream SIEM-ready JSONL events (records + summary):
fors33-scanner --root /data --emit-jsonl -
Depth-limit traversal (0=root only, 1=root + direct children):
fors33-scanner --root /data --max-depth 1
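One way to picture the depth limit is a pruned `os.walk`. The sketch below is an illustration only: the function name `walk_limited` is hypothetical, and it assumes that depth 0 yields files directly in the root while depth 1 also yields files in its direct child directories.

```python
import os
from typing import Iterator


def walk_limited(root: str, max_depth: int) -> Iterator[str]:
    """Yield file paths under `root`, descending at most `max_depth` levels.

    Assumed semantics: depth 0 covers files directly in `root`; depth 1 adds
    files in its direct child directories, and so on.
    """
    root = os.path.abspath(root)
    base_depth = root.rstrip(os.sep).count(os.sep)
    for dirpath, dirnames, filenames in os.walk(root):
        depth = dirpath.rstrip(os.sep).count(os.sep) - base_depth
        if depth >= max_depth:
            dirnames[:] = []  # prune in place so os.walk does not descend further
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Pruning `dirnames` in place keeps the walk from ever entering deeper directories, so the cost is bounded by the levels actually visited.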
Generate checksum baseline (sha256, sha512, or blake3 per --algo):
fors33-scanner --root /data --emit-checksums fors33_baseline.sha256
fors33-scanner --root /data --algo sha512 --emit-checksums fors33_baseline.sha512
Emit CSV or JSON baseline (compatible with fors33-verifier):
fors33-scanner --root /data --emit-csv fors33_baseline.csv
fors33-scanner --root /data --emit-json fors33_baseline.json
Add compliance exposure text to human output (default is strictly mathematical):
fors33-scanner --root /data --compliance-report
Exit codes
- 0: successful scan / threshold not breached
- 1: exposure threshold breach (--max-exposure)
- 2: invocation/parameter misuse
- 130: user interrupted scan (Ctrl+C)
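A pipeline step might wrap the scanner and translate these exit codes into a readable status. The following is a minimal sketch, assuming the fors33-scanner executable is on PATH; the helper names `describe_exit` and `run_gate` are hypothetical, and only flags shown in the Usage section are used.

```python
import subprocess
import sys

# Exit-code meanings as documented in this README.
EXIT_MEANINGS = {
    0: "scan succeeded; exposure threshold not breached",
    1: "exposure threshold breached (--max-exposure)",
    2: "invocation/parameter misuse",
    130: "scan interrupted by user (Ctrl+C)",
}


def describe_exit(code: int) -> str:
    """Map a scanner exit code to its documented meaning."""
    return EXIT_MEANINGS.get(code, f"undocumented exit code {code}")


def run_gate(root: str, max_exposure: float) -> int:
    """Run the scanner with a policy threshold and report the outcome on stderr."""
    cmd = ["fors33-scanner", "--root", root,
           "--max-exposure", str(max_exposure), "--json"]
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(describe_exit(result.returncode), file=sys.stderr)
    return result.returncode
```

A CI job could end with `sys.exit(run_gate("/data", 5.0))` so that a breach (exit code 1) fails the build while the JSON report stays available in `result.stdout` for archiving.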
Output
Default human output (mathematical only):
[FILE COUNT] : 14,205
[TOTAL BYTES] : 2.1 TB
[ATTESTED] : 48 files, 4.1 GB
[UNATTESTED] : 264 files, 2.1 TB
[ELAPSED] : 4.20s
Safety and scope
- Read-only: does not modify files or sidecars.
- Scan-only: O(1) discovery; baseline generation uses streaming chunked hashing.
- Excludes common dirs (.git, node_modules, venv, etc). Respects .f33ignore and --ignore-pattern / --exclude-dir.
- Legal notice prints to stderr on startup so data/JSON streams on stdout remain parse-safe.
- See DISCLAIMER.md for enterprise legal/regulatory boundaries.
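Streaming chunked hashing, as mentioned in the list above, can be sketched with the standard library. This is an illustration under stated assumptions rather than the scanner's own code: `hash_file` and `baseline_line` are hypothetical names, the 1 MiB chunk size is arbitrary, and the two-space separator follows the sha256sum convention, which may differ from the scanner's exact layout. BLAKE3 is not part of `hashlib` and would need the optional blake3 package.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, algo: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so memory use stays bounded."""
    digest = hashlib.new(algo)  # accepts "sha256" or "sha512"
    with path.open("rb") as fh:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def baseline_line(path: Path, algo: str = "sha256") -> str:
    """Format one baseline entry as '<hex digest>  <path>'.

    Assumption: two-space separator, as in sha256sum output.
    """
    return f"{hash_file(path, algo)}  {path}"
```

Reading in chunks means a multi-gigabyte file never has to fit in memory, and the file is opened read-only, consistent with the read-only guarantee.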
JSONL contract
- --emit-jsonl PATH emits one flat JSON object per line.
- Multi-root scans include both root_index and root_path in each scan_record.
- timestamp represents hash completion time.
- Final line is scan_summary with aggregate stats and scan parameters.
- If --emit-jsonl - and --json are both requested, JSONL takes precedence on stdout.
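A consumer of this stream can rely on the documented ordering alone. The sketch below is a hedged example: `split_jsonl` is a hypothetical helper, and it deliberately avoids assuming the name of any field that distinguishes records from the summary, since the contract above does not spell one out.

```python
import json
from typing import Any, Dict, Iterable, List, Tuple


def split_jsonl(lines: Iterable[str]) -> Tuple[List[Dict[str, Any]], Dict[str, Any]]:
    """Parse JSONL text lines into (records, summary).

    Relies only on the documented ordering: the final line is the summary.
    Blank lines are skipped.
    """
    events = [json.loads(line) for line in lines if line.strip()]
    if not events:
        raise ValueError("empty JSONL stream")
    return events[:-1], events[-1]
```

When the scanner writes to stdout with `--emit-jsonl -`, a script could call `split_jsonl(sys.stdin)` on the piped output; because the legal notice goes to stderr, every stdout line should parse as JSON.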
Release model
- Docker publish is manual via workflow_dispatch with explicit version and push_latest inputs.
- Use v0.4.0 style version tags and latest only when manually approved.
Requirements
Python 3.9+. Optional blake3 for BLAKE3 hashing. Linux, macOS, Windows.
License
MIT License. See LICENSE.
Download files
File details
Details for the file fors33_scanner-0.4.0.tar.gz.
File metadata
- Download URL: fors33_scanner-0.4.0.tar.gz
- Upload date:
- Size: 15.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 66740959ff8e4761e800cc426c5364cdaa1e3044c88084e142fc01c9223bd9f2 |
| MD5 | 18d48517db1333f009b201f23126afab |
| BLAKE2b-256 | e555f100de1d4a0174c4cf94a7ff92998abc1f5a45f74d6651ab4aeff0d55e02 |
File details
Details for the file fors33_scanner-0.4.0-py3-none-any.whl.
File metadata
- Download URL: fors33_scanner-0.4.0-py3-none-any.whl
- Upload date:
- Size: 15.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e27ca6c81a4e69bc621d53efc635b9a4020fcbf744ff9bf99750d5923b8930c5 |
| MD5 | b21bca45bf3f56ac055da1005238be94 |
| BLAKE2b-256 | ca8670fc658bf092ab6e5301b11663008e2961914314e18648146f5954b2ab46 |