AI content authenticity detection: C2PA metadata, passive watermarks, and image forensics with heatmap visualization
authentica
AI content authenticity detection for Python.
Detect C2PA content credentials, invisible watermarks, and image forensics anomalies, all in one pip install.
Inspired by ExifTool's philosophy: never trust the extension, always read the bytes.
What it does
| Capability | What is detected | How |
|---|---|---|
| C2PA / Content Credentials | Manifest presence, COSE signature validity, all assertions (creator, AI gen, training rights, actions) | JUMBF box parsing + CBOR decoding of JPEG APP11 / PNG iTXt chunks |
| Passive watermark detection | Invisible frequency-domain watermarks (DWT, DCT, FFT-peak analysis) | Blind detection: no original image needed |
| Image forensics | ELA inconsistency, camera noise anomalies, GAN/diffusion grid artifacts | Error Level Analysis + noise residual + FFT cross-peak detection |
| Heatmap output | Visual per-pixel influence maps | numpy + matplotlib |
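As a rough illustration of the first step in the C2PA row above, the sketch below walks the marker segments of a JPEG byte string and collects the raw APP11 (0xFFEB) payloads. It is a minimal sketch, not authentica's implementation: `iter_jpeg_segments` and `app11_payloads` are hypothetical names, and the per-segment header that precedes the JUMBF bytes inside each APP11 payload is not interpreted here.

```python
import struct

# Markers that carry no length field (TEM and RST0..RST7).
STANDALONE = {0x01, *range(0xD0, 0xD8)}


def iter_jpeg_segments(data: bytes):
    """Yield (marker, payload) for each header segment before start-of-scan."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:          # fill byte, the real marker follows
            pos += 1
            continue
        if marker in (0xDA, 0xD9):  # SOS or EOI: no more header segments
            return
        if marker in STANDALONE:
            pos += 2
            continue
        (length,) = struct.unpack(">H", data[pos + 2 : pos + 4])
        if length < 2:
            raise ValueError(f"invalid segment length at offset {pos}")
        # The length field counts itself (2 bytes) plus the payload.
        yield marker, data[pos + 4 : pos + 2 + length]
        pos += 2 + length


def app11_payloads(data: bytes) -> list:
    """Return the raw payload of every APP11 segment, in file order."""
    return [payload for marker, payload in iter_jpeg_segments(data) if marker == 0xEB]


# Hand-built toy file: SOI, one APP0 segment, one APP11 segment, SOS, EOI.
sample = (
    b"\xff\xd8"
    + b"\xff\xe0\x00\x04ab"
    + b"\xff\xeb\x00\x07hello"
    + b"\xff\xda\x00\x02"
    + b"\xff\xd9"
)
print(app11_payloads(sample))  # [b'hello']
```

A real reader would then strip the per-segment headers and concatenate the payloads in sequence order before parsing the JUMBF boxes.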
Install
pip install authentica
For full heatmap support:
pip install "authentica[docs]"
Quick start
Python API
from authentica import scan
result = scan("photo.jpg")
print(result.summary())
# [photo.jpg] type=image/jpeg trust=72/100 c2pa=โ watermark=โ
# C2PA manifest
if result.has_c2pa:
    for assertion in result.c2pa.assertions:
        print(assertion.label, "-", assertion.description)
# Watermark heatmap
if result.has_watermark:
    result.watermark.save_heatmap("watermark_heatmap.png")
# Forensics heatmap
result.forensics.save_ela_heatmap("ela.png")
# Full JSON output
import json
print(json.dumps(result.to_dict(), indent=2))
Individual analyzers
from authentica.c2pa import C2PAReader
from authentica.watermark import WatermarkDetector
from authentica.forensics import ForensicsAnalyzer
# C2PA only
c2pa = C2PAReader().read("photo.jpg")
print(c2pa.claim_generator) # e.g. "Adobe Photoshop 25.0"
print(c2pa.signature_valid) # True / False
# Watermark only
wm = WatermarkDetector().detect("photo.jpg")
print(f"Watermark: {wm.detected} confidence: {wm.confidence:.0%}")
wm.save_heatmap("wm.png")
# Forensics only
forensics = ForensicsAnalyzer().analyze("photo.jpg")
print(f"Anomaly score: {forensics.anomaly_score:.0%}")
forensics.save_ela_heatmap("ela.png")
forensics.save_noise_heatmap("noise.png")
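For intuition about what a noise-residual heatmap represents, here is a dependency-free sketch. It assumes a very simple approach: subtract a 3x3 box-blurred copy from a grayscale image, then measure the residual variance per block. The function names and the block size are illustrative assumptions, and authentica's own analyzer may work quite differently.

```python
from statistics import pvariance


def noise_residual(gray):
    """Return pixel minus local 3x3 mean for a grayscale image (list of rows)."""
    h, w = len(gray), len(gray[0])
    residual = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbourhood is clipped at the image border.
            neighbours = [
                gray[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
            ]
            residual[y][x] = gray[y][x] - sum(neighbours) / len(neighbours)
    return residual


def block_variances(residual, block=4):
    """Map (row, col) of each full block to the variance of its residual."""
    h, w = len(residual), len(residual[0])
    out = {}
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            values = [
                residual[y][x]
                for y in range(by, by + block)
                for x in range(bx, bx + block)
            ]
            out[(by, bx)] = pvariance(values)
    return out


# Toy 8x8 image: flat grey, except a noisy checkerboard in the top-left block.
image = [[100.0] * 8 for _ in range(8)]
for y in range(4):
    for x in range(4):
        image[y][x] += 10.0 if (x + y) % 2 == 0 else -10.0

variances = block_variances(noise_residual(image))
noisiest = max(variances, key=variances.get)
print(noisiest)  # (0, 0)
```

Blocks whose residual variance differs sharply from the rest of the image are the kind of region a noise heatmap would highlight.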
CLI
# Full scan
authentica scan photo.jpg
# Full scan with JSON output
authentica scan photo.jpg --json
# Scan and save watermark heatmap
authentica scan photo.jpg --heatmap wm.png
# C2PA only
authentica c2pa photo.jpg
# Watermark only, save heatmap
authentica watermark photo.jpg --save-heatmap wm.png
# Forensics only, save both heatmaps
authentica forensics photo.jpg --save-ela ela.png --save-noise noise.png
Architecture
authentica/
├── core.py              # scan(): unifies all analyzers
├── c2pa/
│   └── reader.py        # JUMBF/CBOR C2PA manifest parser
├── watermark/
│   └── detector.py      # DCT + DWT + FFT watermark detection
├── forensics/
│   └── analyzer.py      # ELA + noise residual + frequency forensics
├── utils/
│   └── file_type.py     # Magic-byte file type detection
└── cli/
    └── main.py          # Rich CLI (Click + Rich)
Data Flow
flowchart TD
A["User calls scan(path, options)"] --> B["Detect file type using magic bytes"]
B --> C{"File is image?"}
C -->|Yes| D["Run Watermark Detector"]
C -->|No| E["Skip Watermark"]
D --> F["Run Forensics Analyzer"]
E --> F
B --> G{"Run C2PA enabled?"}
G -->|Yes| H["Run C2PA Reader"]
G -->|No| I["Skip C2PA"]
H --> J["Aggregate Results"]
I --> J
F --> J
J --> K["Compute Trust Score"]
K --> L["Return ScanResult"]
%% Subprocesses
subgraph "Watermark Detection"
D1["DCT anomaly scan"]
D2["DWT energy analysis"]
D3["FFT spectrum peaks"]
D --> D1
D1 --> D2
D2 --> D3
end
subgraph "Forensics Analysis"
F1["Error Level Analysis"]
F2["Noise residual analysis"]
F3["Frequency domain anomaly"]
F --> F1
F1 --> F2
F2 --> F3
end
subgraph "C2PA Processing"
H1["Extract JUMBF boxes"]
H2["Parse CBOR payloads"]
H3["Decode claims/assertions"]
H4["Map to ExifTool tags"]
H --> H1
H1 --> H2
H2 --> H3
H3 --> H4
end
subgraph "Result Aggregation"
J1["Collect analyzer outputs"]
J2["Handle errors gracefully"]
J3["Create ScanResult object"]
J --> J1
J1 --> J2
J2 --> J3
end
The main flow follows this pattern:
- File type detection using magic bytes (never trusts extensions)
- Conditional analyzer execution based on file type and user options
- Result aggregation into a unified ScanResult with trust scoring
- Error handling that degrades gracefully when individual analyzers fail
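The pattern above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in core.py: `sniff_type`, `SketchResult`, and `run_pipeline` are hypothetical names, the signature table covers only three formats, and trust scoring is left out because its formula is not described here. Following the flowchart, only the watermark step is skipped for non-image input.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# A few well-known signatures; a real detector would cover many more formats.
MAGIC = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
]


def sniff_type(head: bytes) -> str:
    """Guess a MIME type from leading bytes, ignoring any file name."""
    for magic, mime in MAGIC:
        if head.startswith(magic):
            return mime
    return "application/octet-stream"


@dataclass
class SketchResult:
    mime: str
    outputs: Dict[str, object] = field(default_factory=dict)
    errors: Dict[str, str] = field(default_factory=dict)


def run_pipeline(data: bytes, analyzers: Dict[str, Callable[[bytes], object]]) -> SketchResult:
    """Run each analyzer, skipping watermark checks for non-image input."""
    result = SketchResult(mime=sniff_type(data[:16]))
    for name, analyze in analyzers.items():
        if name == "watermark" and not result.mime.startswith("image/"):
            continue
        try:
            result.outputs[name] = analyze(data)
        except Exception as exc:  # one failing analyzer must not abort the scan
            result.errors[name] = str(exc)
    return result


def broken_c2pa(_data: bytes):
    raise ValueError("corrupt manifest")


res = run_pipeline(
    b"%PDF-1.7 example bytes",
    {"c2pa": broken_c2pa, "watermark": lambda d: "wm", "forensics": len},
)
print(res.mime, sorted(res.outputs), res.errors)
```

Recording errors per analyzer, instead of raising, is what lets a partial result still be returned when one step fails.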
How it compares to ExifTool
| ExifTool (Perl) | Authentica (Python) |
|---|---|
| Reads metadata from any file via magic bytes | Same: detects file type from bytes, never the extension |
| Walks binary container formats (EXIF, XMP, IPTC) | Walks JUMBF boxes, decodes CBOR C2PA structures |
| Outputs structured key-value metadata | Outputs structured Python dataclasses + JSON |
| CLI: exiftool photo.jpg | CLI: authentica scan photo.jpg |
| 50k lines of Perl | Pure Python, typed, tested |
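To make "walks JUMBF boxes" concrete: JUMBF reuses the ISO base media box layout, in which each box starts with a 4-byte big-endian length and a 4-byte type. The sketch below iterates over boxes in a buffer under that assumption. `iter_boxes` is a hypothetical helper and the sample payloads are placeholders, not a valid C2PA manifest store.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_boxes(buf: bytes, start: int = 0, end: Optional[int] = None) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, payload_start, payload_end) for each box in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack(">I4s", buf[pos : pos + 8])
        header = 8
        if size == 1:    # a 64-bit extended size follows the type field
            if pos + 16 > end:
                raise ValueError(f"truncated extended size at offset {pos}")
            (size,) = struct.unpack(">Q", buf[pos + 8 : pos + 16])
            header = 16
        elif size == 0:  # box runs to the end of the enclosing buffer
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError(f"bad box size at offset {pos}")
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size


# Toy superbox: 'jumb' containing a 'jumd' box and a 'cbor' box.
jumd = struct.pack(">I4s", 12, b"jumd") + b"desc"   # placeholder description
cbor = struct.pack(">I4s", 9, b"cbor") + b"\xa0"    # CBOR empty map
jumb = struct.pack(">I4s", 8 + len(jumd) + len(cbor), b"jumb") + jumd + cbor

top = list(iter_boxes(jumb))
children = [box_type for box_type, _, _ in iter_boxes(jumb, top[0][1], top[0][2])]
print(top, children)  # [('jumb', 8, 29)] ['jumd', 'cbor']
```

Returning offsets instead of copied payloads keeps nested walks cheap: a superbox is descended into by calling `iter_boxes` again on its payload range.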
C2PA manifest parsing: how it works
C2PA stores a manifest store as a JUMBF (ISO 19566-5) container:
JPEG file
└── APP11 markers (0xFFEB), concatenated in sequence order
    └── JUMBF superbox (TBox='jumb')
        ├── JUMBF description box (TBox='jumd', UUID=c2pa UUID)
        └── CBOR payload: the manifest store
            ├── active_manifest reference
            └── manifests map
                └── claim (CBOR)
                    ├── claim_generator
                    ├── assertions[]: JUMBF URI references
                    └── COSE_Sign1 signature
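The CBOR layer at the bottom of this tree can be illustrated with a deliberately small decoder. The sketch below handles only definite-length integers, byte and text strings, arrays, maps, tags (dropped), and the simple values false, true, and null. `decode_cbor` is a hypothetical helper, and the sample bytes are a toy payload that borrows field names from the tree, not a real claim.

```python
def decode_cbor(buf: bytes, pos: int = 0):
    """Decode one CBOR item starting at pos; return (value, next_pos)."""
    initial = buf[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:                      # argument stored in the initial byte
        arg = info
    elif info <= 27:                   # argument in the next 1, 2, 4 or 8 bytes
        width = 1 << (info - 24)
        arg = int.from_bytes(buf[pos : pos + width], "big")
        pos += width
    else:
        raise ValueError("indefinite lengths and reserved values are not handled")

    if major == 0:                     # unsigned integer
        return arg, pos
    if major == 1:                     # negative integer
        return -1 - arg, pos
    if major in (2, 3):                # byte string / UTF-8 text string
        raw = buf[pos : pos + arg]
        if len(raw) != arg:
            raise ValueError("truncated string")
        return (raw if major == 2 else raw.decode("utf-8")), pos + arg
    if major == 4:                     # array of arg items
        items = []
        for _ in range(arg):
            item, pos = decode_cbor(buf, pos)
            items.append(item)
        return items, pos
    if major == 5:                     # map of arg key/value pairs
        mapping = {}
        for _ in range(arg):
            key, pos = decode_cbor(buf, pos)
            value, pos = decode_cbor(buf, pos)
            mapping[key] = value
        return mapping, pos
    if major == 6:                     # tag: decode the tagged item, drop the tag
        return decode_cbor(buf, pos)
    if info in (20, 21, 22):           # major 7: false, true, null
        return {20: False, 21: True, 22: None}[info], pos
    raise ValueError("floats and other simple values are not handled")


# Toy payload: {"claim_generator": "Example/1.0", "assertions": [1, 2]}
sample = (
    b"\xa2"
    + b"\x6f" + b"claim_generator"
    + b"\x6b" + b"Example/1.0"
    + b"\x6a" + b"assertions"
    + b"\x82\x01\x02"
)
value, consumed = decode_cbor(sample)
print(value)
```

For real manifests a maintained CBOR library is the safer choice, since claims also use tags, floats, and nested byte strings that this sketch skips or rejects.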
Supported formats
| Format | C2PA | Watermark | Forensics |
|---|---|---|---|
| JPEG | โ | โ | โ |
| PNG | โ | โ | โ |
| WebP | โ | โ | โ |
| TIFF | โ | โ | โ |
| โ (basic) | โ | โ | |
| MP4/MOV | planned | โ | โ |
Development
# Clone and install in dev mode
git clone https://github.com/yourusername/authentica
cd authentica
pip install -e ".[dev]"
# Run tests
pytest
# Lint + format
ruff check src/
ruff format src/
Contributing
Contributions welcome! See CONTRIBUTING.md.
Ideas for future modules:
- authentica.llm: LLM-generated text detection (perplexity, KGW watermark)
- authentica.video: frame-level C2PA + watermark analysis
- authentica.synthid: Google SynthID watermark detection
License
MIT. See LICENSE.
Acknowledgements
- C2PA Specification: Coalition for Content Provenance and Authenticity
- ExifTool: Phil Harvey's metadata reader (inspiration for the magic-byte approach)
- invisible-watermark: DCT/DWT watermark methods
- Content Authenticity Initiative
v0.2.0: ExifTool feature parity update
Inspired by studying the ExifTool source (Perl), this release adds full metadata reading and cross-platform support.
New commands
# Read all metadata (like: exiftool FILE)
authentica meta photo.jpg
authentica meta photo.jpg --json # machine-readable JSON
authentica meta photo.jpg --csv # CSV row
authentica meta photo.jpg --gps-dms # GPS as deg° min' sec"
# Compare metadata between two files (like: exiftool -diff FILE1 FILE2)
authentica diff original.jpg edited.jpg
authentica diff original.jpg edited.jpg --json
# Batch scan a directory (like: exiftool -r -json DIR)
authentica scan-dir /photos --json
authentica scan-dir /photos --csv --out metadata.csv
authentica scan-dir /photos --ext jpg --ext png --progress
# Extract embedded thumbnail (like: exiftool -ThumbnailImage -b FILE)
authentica thumbnail photo.jpg --out thumb.jpg
# Platform info (like: exiftool -ver -v)
authentica version --verbose
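The directory walk behind a command like scan-dir with --ext filters can be approximated with pathlib alone. This is a sketch under assumptions: `iter_files` is a hypothetical helper, not the BatchScanner implementation, and it only selects candidate paths by suffix; file contents would still need to be identified from their bytes afterwards.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Iterator


def iter_files(root, extensions: Iterable[str], recurse: bool = True) -> Iterator[Path]:
    """Yield files under root whose suffix matches, case-insensitively."""
    wanted = {
        ext.lower() if ext.startswith(".") else "." + ext.lower()
        for ext in extensions
    }
    pattern = "**/*" if recurse else "*"
    for path in sorted(Path(root).glob(pattern)):
        if path.is_file() and path.suffix.lower() in wanted:
            yield path


# Demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()
    for name in ("a.jpg", "b.PNG", "c.txt", "sub/d.jpg"):
        (base / name).write_bytes(b"")
    found_all = {p.relative_to(base).as_posix() for p in iter_files(base, ["jpg", "png"])}
    found_top = {p.relative_to(base).as_posix() for p in iter_files(base, [".jpg"], recurse=False)}

print(sorted(found_all), sorted(found_top))
```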
New Python API
from authentica import MetadataReader, diff_metadata, BatchScanner
# Full metadata read: EXIF, IPTC, XMP, GPS, ICC, ID3, QuickTime
meta = MetadataReader().read("photo.jpg")
print(meta.exif["Make"], meta.exif["Model"])
print(meta.gps.coord_format("dms")) # 41° 53' 32.00" N, 12° 29' 24.00" E
print(meta.composite["Aperture"]) # f/2.8
print(meta.composite["ShutterSpeed"]) # 1/500
print(meta.md5, meta.sha256) # file integrity hashes
# Metadata diff
diff = diff_metadata("original.jpg", "edited.jpg")
print(diff.summary())
for entry in diff.changed:
    print(f"{entry.tag}: {entry.value_a!r} -> {entry.value_b!r}")
# Batch scan
scanner = BatchScanner(extensions={".jpg", ".png"}, recurse=True, progress=True)
for path in scanner.walk("/photos"):
    meta = MetadataReader().read(path)
    print(path.name, meta.gps.coord_format() if meta.gps else "no GPS")
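Conceptually, a metadata diff reduces to comparing two tag dictionaries. The sketch below shows that idea with plain dicts. `TagDiff` and `diff_tags` are hypothetical names chosen for illustration and are not the objects returned by diff_metadata; the tag values are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class TagDiff:
    added: Dict[str, object] = field(default_factory=dict)      # only in b
    removed: Dict[str, object] = field(default_factory=dict)    # only in a
    changed: Dict[str, Tuple[object, object]] = field(default_factory=dict)


def diff_tags(a: dict, b: dict) -> TagDiff:
    """Classify every tag as added, removed, or changed between a and b."""
    result = TagDiff()
    for tag in sorted(a.keys() | b.keys()):
        if tag not in a:
            result.added[tag] = b[tag]
        elif tag not in b:
            result.removed[tag] = a[tag]
        elif a[tag] != b[tag]:
            result.changed[tag] = (a[tag], b[tag])
    return result


original = {"Make": "ExampleCam", "Software": "Firmware 1.0", "GPSLatitude": 41.89}
edited = {"Make": "ExampleCam", "Software": "Editor 2.0", "Artist": "A. Person"}

delta = diff_tags(original, edited)
for tag, (before, after) in delta.changed.items():
    print(f"{tag}: {before!r} -> {after!r}")
```

A removed GPS tag or a changed Software tag, as in this toy pair, is the kind of difference that hints a file was re-exported by an editor.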
What we took from ExifTool (and improved)
| ExifTool feature | Authentica equivalent |
|---|---|
| exiftool FILE: read all metadata | authentica meta FILE |
| EXIF: Make, Model, GPS, Aperture, ISO... | MetadataReader: full EXIF + composite |
| GPS decimal + deg/min/sec formatting (-c) | gps.coord_format("decimal"/"dms") |
| IPTC: Creator, Keywords, Caption | Parsed from APP13 IRB |
| XMP: Dublin Core, photoshop:, xmpRights: | Parsed from APP1/iTXt |
| ID3 tags (MP3) | meta.id3["Title"], ["Artist"]... |
| QuickTime/MP4 atoms | meta.quicktime["Title"]... |
| ICC color profile | meta.icc["ProfileDescription"] |
| MD5Sum, SHA256Sum composite tags | meta.md5, meta.sha256 |
| Composite: Aperture, ShutterSpeed, LV | meta.composite["Aperture"] etc. |
| exiftool -diff FILE1 FILE2 | diff_metadata(a, b) |
| exiftool -r -ext jpg DIR | BatchScanner(extensions={".jpg"}, recurse=True) |
| exiftool -csv DIR | results_to_csv(results) |
| exiftool -ThumbnailImage -b FILE | extract_thumbnail(file) |
| Cross-platform (Perl+DLLs on Windows) | Pure Python: works on Linux/macOS/Windows |
| -ver -v platform info | authentica version --verbose |
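The decimal to deg/min/sec conversion in the GPS row is simple arithmetic: one degree is 60 minutes and one minute is 60 seconds. Below is a small worked sketch. `to_dms` and `format_coord` are hypothetical helpers, not the code behind coord_format, and the output layout is modelled on the example shown in the Python API section.

```python
def to_dms(value: float, positive: str, negative: str) -> str:
    """Format a decimal coordinate as degrees, minutes, seconds + hemisphere."""
    hemisphere = positive if value >= 0 else negative
    # Work in integer hundredths of a second so rounding cannot yield 60.00".
    centiseconds = round(abs(value) * 360_000)
    degrees, rest = divmod(centiseconds, 360_000)
    minutes, cs = divmod(rest, 6_000)
    return f"{degrees}° {minutes}' {cs / 100:.2f}\" {hemisphere}"


def format_coord(lat: float, lon: float) -> str:
    """Format a latitude/longitude pair in DMS notation."""
    return f"{to_dms(lat, 'N', 'S')}, {to_dms(lon, 'E', 'W')}"


print(format_coord(41.892222, 12.49))  # 41° 53' 32.00" N, 12° 29' 24.00" E
```

Converting to an integer count first avoids the classic edge case where 59.999 seconds rounds up to 60.00 without carrying into the minutes.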