High-speed liability scanner for attested vs unattested data using .f33 sidecars.

fors33-scanner

High-speed file integrity and baseline scanner. Walks one or more roots, measures data gravity (bytes), and classifies large files as attested or unattested based on sibling sidecar presence (.f33, .sig, .asc, .sha256, .sha512, .md5, .pem). Emits checksum baselines (Hash Filename format), CSV, or JSON for use with fors33-verifier.
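
The presence-based classification described above can be sketched in a few lines of Python. This is an illustration, not the scanner's actual code: the helper names `has_sidecar` and `classify`, the "below-threshold" label, the inclusive threshold comparison, and the rule that a sidecar is named by appending the extension to the full file name (data.bin -> data.bin.f33) are all assumptions.

```python
from pathlib import Path

# Sidecar extensions listed in the description above.
SIDECAR_EXTS = (".f33", ".sig", ".asc", ".sha256", ".sha512", ".md5", ".pem")


def has_sidecar(path: Path) -> bool:
    """Return True if any sibling sidecar exists next to `path`.

    Assumption: the sidecar name is the full file name plus the extension,
    e.g. data.bin -> data.bin.f33. The real naming rule may differ.
    """
    return any(path.with_name(path.name + ext).exists() for ext in SIDECAR_EXTS)


def classify(path: Path, threshold_bytes: int) -> str:
    """Classify one file by size and sidecar presence only.

    No signature or hash is validated here, which mirrors the trust model:
    presence of a sidecar is the only signal.
    """
    if path.stat().st_size < threshold_bytes:
        return "below-threshold"  # hypothetical label for files under the threshold
    return "attested" if has_sidecar(path) else "unattested"
```

Because the check is a handful of `exists()` calls per file, its cost does not depend on file size.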

Trust model: The scanner is a discovery and liability-mapping tool. Classification is an O(1) check per file, based on sidecar presence only. It does not validate Ed25519 signatures or cryptographic proof of baselines. For full cryptographic verification, use fors33-verifier.

For machine parsing, see LLM_CONTEXT.md.

Install

pip install fors33-scanner

Usage

Scan the current directory (default root) with a 1 MB threshold:

fors33-scanner --threshold-mb 1.0

Scan multiple roots:

fors33-scanner --root /var/log --root /data/telemetry --threshold-mb 10

Emit JSON instead of human-readable output (for CI and pipelines):

fors33-scanner --root /data --json

Fail CI/CD when exposure breaches the policy threshold:

fors33-scanner --root /data --max-exposure 5.0 --json

Throttle hashing workers for shared runners:

fors33-scanner --root /data --workers 2

Stream SIEM-ready JSONL events (records + summary):

fors33-scanner --root /data --emit-jsonl -

Depth-limit traversal (0=root only, 1=root + direct children):

fors33-scanner --root /data --max-depth 1
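
One way to read the depth semantics is as pruning during a top-down walk. The sketch below is an assumption about behavior, not the scanner's implementation: `walk_limited` is a hypothetical helper, and it interprets depth 1 as "files in the root plus files in its direct child directories".

```python
import os
from pathlib import Path
from typing import Iterator, Optional


def walk_limited(root: Path, max_depth: Optional[int] = None) -> Iterator[Path]:
    """Yield files under `root`, descending at most `max_depth` levels.

    max_depth=0 yields only files directly in root; max_depth=1 also yields
    files in root's direct child directories; None means unlimited.
    """
    root = Path(root)
    for dirpath, dirnames, filenames in os.walk(root):
        depth = len(Path(dirpath).relative_to(root).parts)
        if max_depth is not None and depth >= max_depth:
            # Clearing dirnames in place tells os.walk not to descend further.
            dirnames[:] = []
        for name in filenames:
            yield Path(dirpath) / name
```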

Strict audit (fail on permission or file-lock errors instead of skipping):

fors33-scanner --root /data --strict-audit

Record a TSA endpoint for tooling that reads FORS33_TSA_URL:

fors33-scanner --tsa-url https://tsa.example.com/rfc3161

FORS33_WORKERS overrides --workers after the CLI is parsed. When no explicit value is given, the default worker count is 4 under FORS33_EXTENSION_MODE, otherwise min(32, cpu+4). Explicit positive values are capped at 64.
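
The precedence rules above can be written out as a small resolver. This is a sketch under stated assumptions, not the scanner's code: `resolve_workers` is a hypothetical function, FORS33_WORKERS and FORS33_EXTENSION_MODE are assumed to be environment variables, any non-empty FORS33_EXTENSION_MODE value is assumed to enable extension mode, "cpu" is assumed to mean `os.cpu_count()`, and an unparsable FORS33_WORKERS value is assumed to be ignored.

```python
import os
from typing import Mapping, Optional

MAX_WORKERS = 64            # cap on explicit positive values
EXTENSION_MODE_WORKERS = 4  # default under FORS33_EXTENSION_MODE


def resolve_workers(cli_workers: Optional[int], env: Mapping[str, str]) -> int:
    """Resolve the hashing worker count from the CLI value and environment."""
    explicit = cli_workers
    raw = env.get("FORS33_WORKERS", "").strip()
    if raw:
        try:
            explicit = int(raw)  # environment overrides the CLI value
        except ValueError:
            pass  # assumption: ignore values that are not integers
    if explicit is not None and explicit > 0:
        return min(explicit, MAX_WORKERS)
    if env.get("FORS33_EXTENSION_MODE"):
        return EXTENSION_MODE_WORKERS
    return min(32, (os.cpu_count() or 1) + 4)
```

Passing the environment as a mapping (for example `os.environ`) keeps the function easy to exercise with fixed inputs.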

Generate a checksum baseline (sha256, sha512, or blake3 per --algo):

fors33-scanner --root /data --emit-checksums fors33_baseline.sha256
fors33-scanner --root /data --algo sha512 --emit-checksums fors33_baseline.sha512
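
For readers who want to cross-check a baseline entry by hand, streaming chunked hashing looks roughly like the following. This is a sketch, not the scanner's implementation: `hash_file` and `baseline_line` are hypothetical helpers, the two-space separator and root-relative POSIX paths are assumptions borrowed from the sha256sum convention, and only the hashlib algorithms (sha256, sha512) are covered because BLAKE3 needs the optional blake3 package.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path, algo: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so memory use stays constant."""
    h = hashlib.new(algo)
    with path.open("rb") as f:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def baseline_line(path: Path, root: Path, algo: str = "sha256") -> str:
    """Format one 'Hash Filename' entry (assumed: two spaces, relative path)."""
    rel = path.relative_to(root).as_posix()
    return f"{hash_file(path, algo)}  {rel}"
```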

Emit a CSV or JSON baseline (compatible with fors33-verifier):

fors33-scanner --root /data --emit-csv fors33_baseline.csv
fors33-scanner --root /data --emit-json fors33_baseline.json

Add compliance exposure text to human output (default is strictly mathematical):

fors33-scanner --root /data --compliance-report

Exit codes

  • 0: successful scan / threshold not breached
  • 1: exposure threshold breach (--max-exposure)
  • 2: invocation/parameter misuse, or --strict-audit I/O access failure
  • 130: user interrupted scan (Ctrl+C)
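
In a pipeline, the exit codes above can be mapped to a pass/fail decision. The wrapper below is a minimal sketch: `describe_exit`, `run_scan`, and `should_fail_build` are hypothetical helpers, and the policy of failing the build on any non-zero code is an assumption. Only the exit codes and the flags shown in the usage section come from this document.

```python
import subprocess
from typing import List

# Exit codes as documented above.
EXIT_MEANINGS = {
    0: "successful scan / threshold not breached",
    1: "exposure threshold breach (--max-exposure)",
    2: "invocation/parameter misuse, or --strict-audit I/O access failure",
    130: "user interrupted scan (Ctrl+C)",
}


def describe_exit(code: int) -> str:
    """Translate a scanner exit code into a human-readable message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def should_fail_build(code: int) -> bool:
    """Assumed policy: anything other than 0 fails the pipeline step."""
    return code != 0


def run_scan(args: List[str]) -> int:
    """Run fors33-scanner with the given arguments and return its exit code."""
    return subprocess.run(["fors33-scanner", *args]).returncode
```

A CI step might call `run_scan(["--root", "/data", "--max-exposure", "5.0", "--json"])` and log `describe_exit(code)` before deciding whether to stop.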

Output

Default human output (mathematical only):

[FILE COUNT]    : 14,205
[TOTAL BYTES]   : 2.1 TB
[ATTESTED]      : 48 files, 4.1 GB
[UNATTESTED]    : 264 files, 2.1 TB
[ELAPSED]       : 4.20s

Safety and scope

  • Read-only: does not modify files or sidecars.
  • Scan-only: discovery is an O(1) sidecar-presence check per file; baseline generation uses streaming chunked hashing.
  • Excludes common directories (.git, node_modules, venv, etc.). Respects .f33ignore and --ignore-pattern / --exclude-dir.
  • Legal notice prints to stderr on startup so data/JSON streams on stdout remain parse-safe.
  • See DISCLAIMER.md for enterprise legal/regulatory boundaries.

JSONL contract

  • --emit-jsonl PATH emits one flat JSON object per line.
  • Multi-root scans include both root_index and root_path in each scan_record.
  • timestamp represents hash completion time.
  • Final line is scan_summary with aggregate stats and scan parameters.
  • If --emit-jsonl - and --json are both requested, JSONL takes precedence on stdout.
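
A consumer of this stream can rely on the positional rule that the final line is the summary. The sketch below assumes only what the contract states (one JSON object per line, summary last, root_path present on multi-root records); `split_jsonl` and `count_by_root` are hypothetical helpers, and any other field names in sample data would be illustrative.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def split_jsonl(lines: Iterable[str]) -> Tuple[List[dict], dict]:
    """Parse a JSONL stream into (records, summary).

    The last non-empty line is treated as scan_summary, per the contract.
    """
    objs = [json.loads(line) for line in lines if line.strip()]
    if not objs:
        raise ValueError("empty JSONL stream")
    return objs[:-1], objs[-1]


def count_by_root(records: Iterable[dict]) -> Dict[Optional[str], int]:
    """Count records per root_path (None when the field is absent)."""
    counts: Dict[Optional[str], int] = defaultdict(int)
    for rec in records:
        counts[rec.get("root_path")] += 1
    return dict(counts)
```

For example, output captured from `fors33-scanner --root /var/log --root /data/telemetry --emit-jsonl -` could be passed to `split_jsonl(sys.stdin)`, since the legal notice goes to stderr and does not pollute stdout.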

Release model

  • Docker publish is manual via workflow_dispatch with explicit version and push_latest inputs.
  • Use v0.5.0-style version tags; push latest only when manually approved.

Requirements

Python 3.9+. The optional blake3 package enables BLAKE3 hashing. Runs on Linux, macOS, and Windows.

License

MIT License. See LICENSE.


