Detect AI-generated content, trace origins, verify authenticity
AI Provenance Tracker - Backend
FastAPI backend for detecting AI-generated content.
Quick Start
# Create virtual environment
python -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install -e ".[dev]"
# Run the server
uvicorn app.main:app --reload
API Endpoints
- POST /api/v1/detect/text - Detect AI-generated text
- POST /api/v1/detect/image - Detect AI-generated images
- POST /api/v1/detect/audio - Detect AI-generated audio (WAV)
- POST /api/v1/detect/video - Detect AI-generated video (MVP)
- POST /api/v1/batch/text - Batch text detection
- POST /api/v1/intel/x/collect - Collect X data into trust-and-safety input schema
- POST /api/v1/intel/x/collect/estimate - Estimate X request cost without external calls
- POST /api/v1/intel/x/report - Generate trust-and-safety report from normalized input
- POST /api/v1/intel/x/drilldown - Build cluster/claim drill-down + alerts dataset
- GET /api/v1/intel/x/scheduler/status - Check recurring job status
- POST /api/v1/intel/x/scheduler/run - Trigger one immediate scheduled run
- GET /api/v1/analyze/dashboard - Dashboard-ready analytics metrics
- GET /api/v1/analyze/evaluation - Calibration precision/recall trend for dashboard
- GET /api/v1/analyze/audit-events - Audit log events (HTTP + detection)
- GET /health - Health check
X Intelligence Collection
Set X_BEARER_TOKEN in .env, then either call the API:
curl -X POST "http://localhost:8000/api/v1/intel/x/collect" \
-H "Content-Type: application/json" \
-d '{"target_handle":"@example","window_days":30,"max_posts":300,"query":"anthropic OR claudecode"}'
or use the CLI utility:
python scripts/collect_x_input.py --handle @example --window-days 30 --max-posts 300 --query "anthropic OR claudecode" --output ./x_intel_input.json --show-request-estimate
Low-cost run (tight request budget):
X_MAX_PAGES=1 X_MAX_REQUESTS_PER_RUN=4 python scripts/collect_x_input.py --handle @example --window-days 7 --max-posts 60 --output ./x_intel_input.json --show-request-estimate
Cost precheck endpoint (no external X calls):
curl -X POST "http://localhost:8000/api/v1/intel/x/collect/estimate" \
-H "Content-Type: application/json" \
-d '{"window_days":7,"max_posts":60,"max_pages":1}'
Batch text detection:
curl -X POST "http://localhost:8000/api/v1/batch/text" \
-H "Content-Type: application/json" \
-d '{"items":[{"item_id":"a","text":"Sample text one..."},{"item_id":"b","text":"Sample text two..."}]}'
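The same batch request can be prepared from Python. This is a minimal sketch using only the standard library; it mirrors the JSON body of the curl call above. The helper names `build_batch_payload`, `build_batch_request`, and `send` are illustrative and not part of the package, and the response schema is not documented here, so the reply is returned as parsed JSON without interpretation.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: local dev server from Quick Start


def build_batch_payload(texts):
    """Map {item_id: text} to the body shape used by the curl example."""
    return {
        "items": [{"item_id": item_id, "text": text} for item_id, text in texts.items()]
    }


def build_batch_request(payload, base_url=BASE_URL):
    """Prepare (but do not send) a POST to /api/v1/batch/text."""
    return urllib.request.Request(
        f"{base_url}/api/v1/batch/text",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires the server to be running):
# payload = build_batch_payload({"a": "Sample text one...", "b": "Sample text two..."})
# print(send(build_batch_request(payload)))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a running server.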
Dashboard metrics:
curl "http://localhost:8000/api/v1/analyze/dashboard?days=30"
Audit events:
curl "http://localhost:8000/api/v1/analyze/audit-events?limit=50"
Dashboard drill-down from normalized input:
curl -X POST "http://localhost:8000/api/v1/intel/x/drilldown" \
-H "Content-Type: application/json" \
--data-binary @./x_intel_input.json
Trust Report, Benchmark, Evidence Pack
Generate trust report:
python scripts/generate_x_trust_report.py --input ./x_intel_input.json --output ./x_trust_report.json
Benchmark (optional labels file):
python scripts/benchmark_x_intel.py --report ./x_trust_report.json --labels ./evidence/labels_template.json --output ./x_trust_benchmark.json
Build talent-visa evidence pack:
python scripts/build_talent_visa_evidence_pack.py --reports-glob "./x_trust_report*.json" --benchmarks-glob "./x_trust_benchmark*.json" --output-dir ./evidence
Run full pipeline:
python scripts/run_talent_visa_pipeline.py --handle @example --window-days 90 --max-posts 600 --query "anthropic OR claudecode OR claudeai OR usagelimits"
Run pipeline from pre-collected input JSON (offline mode):
python scripts/run_talent_visa_pipeline.py --input-json ./x_intel_input.json --output-dir ./evidence/runs/manual_input --run-id run_snapshot
Compare two run directories:
python scripts/compare_talent_visa_runs.py --base-run-dir ./evidence/runs/run_a --candidate-run-dir ./evidence/runs/run_b --output-json ./evidence/runs/comparisons/run_a_vs_run_b.json --output-md ./evidence/runs/comparisons/run_a_vs_run_b.md
Evaluate confidence-threshold calibration on labeled data:
python scripts/evaluate_detection_calibration.py --input ./labels_text.jsonl --content-type text --output ./calibration_text.json --register
python scripts/evaluate_detection_calibration.py --input ./labels_audio.jsonl --content-type audio --output ./calibration_audio.json --register
python scripts/evaluate_detection_calibration.py --input ./labels_video.jsonl --content-type video --output ./calibration_video.json --register
Audio/video JSONL templates: ./evidence/samples/audio_labeled_template.jsonl, ./evidence/samples/video_labeled_template.jsonl
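To make "confidence-threshold calibration" concrete, here is a minimal sketch of computing precision and recall across several thresholds. It is not the code in scripts/evaluate_detection_calibration.py. The JSONL field names `confidence` and `is_ai` are assumptions chosen for illustration, so check the templates above for the actual schema.

```python
import json


def load_samples(lines):
    """Parse JSONL lines into (confidence, is_ai) pairs. Field names are assumed."""
    samples = []
    for line in lines:
        line = line.strip()
        if line:
            record = json.loads(line)
            samples.append((float(record["confidence"]), bool(record["is_ai"])))
    return samples


def precision_recall(samples, threshold):
    """Treat confidence >= threshold as a positive (AI-generated) prediction."""
    tp = sum(1 for conf, label in samples if conf >= threshold and label)
    fp = sum(1 for conf, label in samples if conf >= threshold and not label)
    fn = sum(1 for conf, label in samples if conf < threshold and label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def sweep(samples, thresholds=(0.3, 0.5, 0.7, 0.9)):
    """Return {threshold: (precision, recall)} for each candidate threshold."""
    return {t: precision_recall(samples, t) for t in thresholds}
```

Raising the threshold usually trades recall for precision, which is why a sweep over several values is more informative than a single operating point.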
Weekly pipeline cycle with automatic run comparison:
python scripts/run_weekly_talent_visa_cycle.py --handle @example --window-days 7 --max-posts 60 --output-dir ./evidence/runs/weekly --comparisons-dir ./evidence/runs/comparisons --summary-output ./evidence/runs/weekly/latest_summary.json
Production smoke test for all detect endpoints:
python scripts/smoke_detect_prod.py --base-url https://your-api-domain --output ./evidence/smoke/prod_detect_smoke.json
Run background worker process (scheduler + webhook retry queue):
python -m app.worker.main
Trigger a scheduler run manually:
curl -X POST "http://localhost:8000/api/v1/intel/x/scheduler/run?handle=@example"
Check scheduler status:
curl "http://localhost:8000/api/v1/intel/x/scheduler/status"
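For scripted use, the two scheduler calls can be prepared in Python. The sketch below uses only the standard library; the function names are illustrative and not part of the package. It URL-encodes the handle so that "@" is sent as "%40", which is the safe form for a query string.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: local dev server from Quick Start


def scheduler_run_request(handle, base_url=BASE_URL):
    """Prepare a POST that triggers one immediate run for `handle`."""
    query = urllib.parse.urlencode({"handle": handle})  # "@example" -> "handle=%40example"
    return urllib.request.Request(
        f"{base_url}/api/v1/intel/x/scheduler/run?{query}", method="POST"
    )


def scheduler_status_request(base_url=BASE_URL):
    """Prepare a GET for the recurring job status."""
    return urllib.request.Request(
        f"{base_url}/api/v1/intel/x/scheduler/status", method="GET"
    )


# Usage (requires the server to be running):
# with urllib.request.urlopen(scheduler_run_request("@example")) as response:
#     print(response.read().decode("utf-8"))
```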
Persistence and Migrations
Runtime analysis history is persisted in analysis_records (SQLite by default).
Audit events are persisted in audit_events.
Apply database migrations with Alembic:
alembic upgrade head
Security and Spend Controls
Configure optional API key enforcement and endpoint spend controls in .env:
- REQUIRE_API_KEY
- API_KEYS
- DAILY_SPEND_CAP_POINTS
- RATE_LIMIT_MEDIA_REQUESTS
- RATE_LIMIT_BATCH_REQUESTS
- RATE_LIMIT_INTEL_REQUESTS
- X_COST_GUARD_ENABLED
- X_MAX_REQUESTS_PER_RUN
- CONSENSUS_ENABLED
- COPYLEAKS_API_KEY
- REALITY_DEFENDER_API_KEY
- SCHEDULER_ENABLED
- SCHEDULER_HANDLES
- SCHEDULER_MONTHLY_REQUEST_CAP
- SCHEDULER_KILL_SWITCH_ON_CAP
- SCHEDULER_USAGE_FILE
- RUN_SCHEDULER_IN_API
- WORKER_ENABLE_SCHEDULER
- WORKER_DRAIN_WEBHOOK_QUEUE
- WORKER_TICK_SECONDS
- WEBHOOK_URLS
- WEBHOOK_RETRY_ATTEMPTS
- WEBHOOK_RETRY_BACKOFF_SECONDS
- WEBHOOK_QUEUE_FILE
- WEBHOOK_DEAD_LETTER_FILE
- AUDIT_EVENTS_ENABLED
- AUDIT_LOG_HTTP_REQUESTS
- AUDIT_ACTOR_HEADER
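As an illustration of how a few of these variables might be read, here is a hedged sketch of a settings loader. Only the variable names come from the list above. The value types (boolean flag, comma-separated key list, integer caps), the placeholder defaults, and the `GuardSettings` and `load_guard_settings` names are assumptions, and the package's own settings code may differ.

```python
import os
from dataclasses import dataclass


def _as_bool(value, default=False):
    """Interpret common truthy strings; the accepted spellings are an assumption."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class GuardSettings:
    require_api_key: bool
    api_keys: tuple
    daily_spend_cap_points: int
    x_max_requests_per_run: int


def load_guard_settings(env=None):
    """Read a subset of the variables from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Assumption: API_KEYS is a comma-separated list.
    keys = tuple(k.strip() for k in env.get("API_KEYS", "").split(",") if k.strip())
    return GuardSettings(
        require_api_key=_as_bool(env.get("REQUIRE_API_KEY")),
        api_keys=keys,
        # "0" is a placeholder default for this sketch, not the package's default.
        daily_spend_cap_points=int(env.get("DAILY_SPEND_CAP_POINTS", "0")),
        x_max_requests_per_run=int(env.get("X_MAX_REQUESTS_PER_RUN", "0")),
    )
```

Passing a plain dict as `env` makes the loader easy to exercise in tests without touching the real process environment.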
Documentation
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
File details
Details for the file ai_provenance_tracker-1.0.1.tar.gz.
File metadata
- Download URL: ai_provenance_tracker-1.0.1.tar.gz
- Upload date:
- Size: 263.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 759c84fd96fd41a85a1478aebece0f40a4297d3e3cc90f74fb4960c0f77a0df8 |
| MD5 | 852eee42bf6eb688fb72571e1fe2d8ff |
| BLAKE2b-256 | 87fb40758e08d03647a095fdf6f8330595626b8ef0d27da21b1ce3cacf0dd79e |
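A downloaded archive can be checked against the published SHA256 digest with Python's standard `hashlib` module. The sketch below is one way to do it; the function names are illustrative, and the local file path is whatever location you downloaded the archive to. The same approach applies to the wheel with its own digest.

```python
import hashlib

# SHA256 digest listed above for ai_provenance_tracker-1.0.1.tar.gz
EXPECTED_SDIST_SHA256 = "759c84fd96fd41a85a1478aebece0f40a4297d3e3cc90f74fb4960c0f77a0df8"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected_hex.lower()


# Usage (path is wherever the archive was saved):
# print(matches("./ai_provenance_tracker-1.0.1.tar.gz", EXPECTED_SDIST_SHA256))
```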
Provenance
The following attestation bundles were made for ai_provenance_tracker-1.0.1.tar.gz:
Publisher: publish-pypi.yml on ogulcanaydogan/AI-Provenance-Tracker
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ai_provenance_tracker-1.0.1.tar.gz
- Subject digest: 759c84fd96fd41a85a1478aebece0f40a4297d3e3cc90f74fb4960c0f77a0df8
- Sigstore transparency entry: 1968925122
- Sigstore integration time:
- Permalink: ogulcanaydogan/AI-Provenance-Tracker@c13aefd1f1cc0913225f866c71a38c4e977ff92b
- Branch / Tag: refs/tags/v1.0.1
- Owner: https://github.com/ogulcanaydogan
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@c13aefd1f1cc0913225f866c71a38c4e977ff92b
- Trigger Event: release
File details
Details for the file ai_provenance_tracker-1.0.1-py3-none-any.whl.
File metadata
- Download URL: ai_provenance_tracker-1.0.1-py3-none-any.whl
- Upload date:
- Size: 102.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b83a4fd09a19794d6e9dd03af566772f8d160720a4caee8cf16bd19eb304b9ff |
| MD5 | 6e4c4f72f59c2f470f05f7232b5d2d8a |
| BLAKE2b-256 | c39409ed9449317c37267534787c2626ce90aa4a04929f4c0196a24ff5475b8b |
Provenance
The following attestation bundles were made for ai_provenance_tracker-1.0.1-py3-none-any.whl:
Publisher: publish-pypi.yml on ogulcanaydogan/AI-Provenance-Tracker
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ai_provenance_tracker-1.0.1-py3-none-any.whl
- Subject digest: b83a4fd09a19794d6e9dd03af566772f8d160720a4caee8cf16bd19eb304b9ff
- Sigstore transparency entry: 1968925218
- Sigstore integration time:
- Permalink: ogulcanaydogan/AI-Provenance-Tracker@c13aefd1f1cc0913225f866c71a38c4e977ff92b
- Branch / Tag: refs/tags/v1.0.1
- Owner: https://github.com/ogulcanaydogan
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@c13aefd1f1cc0913225f866c71a38c4e977ff92b
- Trigger Event: release