Detect AI-generated content, trace origins, verify authenticity

AI Provenance Tracker - Backend

FastAPI backend for detecting AI-generated content.

Quick Start

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -e ".[dev]"

# Run the server
uvicorn app.main:app --reload

API Endpoints

  • POST /api/v1/detect/text - Detect AI-generated text
  • POST /api/v1/detect/image - Detect AI-generated images
  • POST /api/v1/detect/audio - Detect AI-generated audio (WAV)
  • POST /api/v1/detect/video - Detect AI-generated video (MVP)
  • POST /api/v1/batch/text - Batch text detection
  • POST /api/v1/intel/x/collect - Collect X data into trust-and-safety input schema
  • POST /api/v1/intel/x/collect/estimate - Estimate X request cost without external calls
  • POST /api/v1/intel/x/report - Generate trust-and-safety report from normalized input
  • POST /api/v1/intel/x/drilldown - Build cluster/claim drill-down + alerts dataset
  • GET /api/v1/intel/x/scheduler/status - Check recurring job status
  • POST /api/v1/intel/x/scheduler/run - Trigger one immediate scheduled run
  • GET /api/v1/analyze/dashboard - Dashboard-ready analytics metrics
  • GET /api/v1/analyze/evaluation - Calibration precision/recall trend for dashboard
  • GET /api/v1/analyze/audit-events - Audit log events (HTTP + detection)
  • GET /health - Health check

X Intelligence Collection

Set X_BEARER_TOKEN in .env, then either call the API:

curl -X POST "http://localhost:8000/api/v1/intel/x/collect" \
  -H "Content-Type: application/json" \
  -d '{"target_handle":"@example","window_days":30,"max_posts":300,"query":"anthropic OR claudecode"}'

or use the CLI utility:

python scripts/collect_x_input.py --handle @example --window-days 30 --max-posts 300 --query "anthropic OR claudecode" --output ./x_intel_input.json --show-request-estimate
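The same collect request can be issued from Python. The following is a minimal sketch using only the standard library; it assumes the local development server from Quick Start, and the helper names build_collect_request and send are hypothetical, not part of this package. The payload fields are the ones shown in the curl example above.

```python
import json
import urllib.request

# Assumption: the API is running locally, as started in Quick Start.
BASE_URL = "http://localhost:8000"


def build_collect_request(target_handle, window_days, max_posts, query, base_url=BASE_URL):
    """Build (but do not send) a POST request for /api/v1/intel/x/collect."""
    payload = {
        "target_handle": target_handle,
        "window_days": window_days,
        "max_posts": max_posts,
        "query": query,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/intel/x/collect",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling send(build_collect_request("@example", 30, 300, "anthropic OR claudecode")) performs the request; the response schema is not described here, so inspect the returned dictionary or the Swagger UI before relying on specific keys.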

Low-cost run (tight request budget):

X_MAX_PAGES=1 X_MAX_REQUESTS_PER_RUN=4 python scripts/collect_x_input.py --handle @example --window-days 7 --max-posts 60 --output ./x_intel_input.json --show-request-estimate

Cost precheck endpoint (no external X calls):

curl -X POST "http://localhost:8000/api/v1/intel/x/collect/estimate" \
  -H "Content-Type: application/json" \
  -d '{"window_days":7,"max_posts":60,"max_pages":1}'

Batch text detection:

curl -X POST "http://localhost:8000/api/v1/batch/text" \
  -H "Content-Type: application/json" \
  -d '{"items":[{"item_id":"a","text":"Sample text one..."},{"item_id":"b","text":"Sample text two..."}]}'
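When the texts come from a list rather than a hand-written JSON string, the batch payload can be assembled programmatically. This sketch mirrors the items / item_id / text shape from the curl example above; the function name and the ID scheme are hypothetical, and any server-side limit on batch size is not documented here.

```python
import json


def build_batch_payload(texts, id_prefix="item"):
    """Wrap plain strings in the {"items": [...]} shape used by /api/v1/batch/text."""
    items = [
        {"item_id": f"{id_prefix}-{index}", "text": text}
        for index, text in enumerate(texts, start=1)
    ]
    return {"items": items}


# Serialize for use as the request body (for example with curl --data-binary).
payload_json = json.dumps(build_batch_payload(["Sample text one...", "Sample text two..."]))
```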

Dashboard metrics:

curl "http://localhost:8000/api/v1/analyze/dashboard?days=30"

Audit events:

curl "http://localhost:8000/api/v1/analyze/audit-events?limit=50"

Dashboard drill-down from normalized input:

curl -X POST "http://localhost:8000/api/v1/intel/x/drilldown" \
  -H "Content-Type: application/json" \
  --data-binary @./x_intel_input.json

Trust Report, Benchmark, Evidence Pack

Generate trust report:

python scripts/generate_x_trust_report.py --input ./x_intel_input.json --output ./x_trust_report.json

Run the benchmark (the labels file is optional):

python scripts/benchmark_x_intel.py --report ./x_trust_report.json --labels ./evidence/labels_template.json --output ./x_trust_benchmark.json

Build talent-visa evidence pack:

python scripts/build_talent_visa_evidence_pack.py --reports-glob "./x_trust_report*.json" --benchmarks-glob "./x_trust_benchmark*.json" --output-dir ./evidence

Run full pipeline:

python scripts/run_talent_visa_pipeline.py --handle @example --window-days 90 --max-posts 600 --query "anthropic OR claudecode OR claudeai OR usagelimits"

Run pipeline from pre-collected input JSON (offline mode):

python scripts/run_talent_visa_pipeline.py --input-json ./x_intel_input.json --output-dir ./evidence/runs/manual_input --run-id run_snapshot

Compare two run directories:

python scripts/compare_talent_visa_runs.py --base-run-dir ./evidence/runs/run_a --candidate-run-dir ./evidence/runs/run_b --output-json ./evidence/runs/comparisons/run_a_vs_run_b.json --output-md ./evidence/runs/comparisons/run_a_vs_run_b.md

Evaluate confidence-threshold calibration on labeled data:

python scripts/evaluate_detection_calibration.py --input ./labels_text.jsonl --content-type text --output ./calibration_text.json --register
python scripts/evaluate_detection_calibration.py --input ./labels_audio.jsonl --content-type audio --output ./calibration_audio.json --register
python scripts/evaluate_detection_calibration.py --input ./labels_video.jsonl --content-type video --output ./calibration_video.json --register

Audio/video JSONL templates: ./evidence/samples/audio_labeled_template.jsonl, ./evidence/samples/video_labeled_template.jsonl
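To illustrate what a confidence-threshold calibration measures, the sketch below computes precision and recall at a given threshold. It is a generic illustration, not the logic of evaluate_detection_calibration.py; the record fields confidence and is_ai are hypothetical, so check the templates above for the actual JSONL schema.

```python
import json


def load_jsonl(path):
    """Read one JSON object per non-empty line."""
    with open(path, "r", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def precision_recall(records, threshold):
    """Treat confidence >= threshold as a positive (AI-generated) prediction."""
    tp = fp = fn = 0
    for record in records:
        predicted = record["confidence"] >= threshold  # hypothetical field name
        actual = bool(record["is_ai"])  # hypothetical field name
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Sweeping the threshold over a range such as 0.1 to 0.9 and comparing the resulting pairs shows the usual trade-off: a higher threshold tends to raise precision and lower recall.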

Weekly pipeline cycle with automatic run comparison:

python scripts/run_weekly_talent_visa_cycle.py --handle @example --window-days 7 --max-posts 60 --output-dir ./evidence/runs/weekly --comparisons-dir ./evidence/runs/comparisons --summary-output ./evidence/runs/weekly/latest_summary.json

Production smoke test for all detect endpoints:

python scripts/smoke_detect_prod.py --base-url https://your-api-domain --output ./evidence/smoke/prod_detect_smoke.json

Run background worker process (scheduler + webhook retry queue):

python -m app.worker.main

Trigger a scheduler run manually:

curl -X POST "http://localhost:8000/api/v1/intel/x/scheduler/run?handle=@example"

Check scheduler status:

curl "http://localhost:8000/api/v1/intel/x/scheduler/status"
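When building the scheduler URLs in code, it is safer to percent-encode the handle, since "@" is a reserved character in URLs. A small sketch with the standard library follows; the helper names are hypothetical and the base URL assumes the local server from Quick Start.

```python
from urllib.parse import urlencode

# Assumption: the API is running locally, as started in Quick Start.
BASE_URL = "http://localhost:8000"


def scheduler_run_url(handle, base_url=BASE_URL):
    """URL for POST /api/v1/intel/x/scheduler/run with an encoded handle."""
    query = urlencode({"handle": handle})  # "@example" becomes "%40example"
    return f"{base_url}/api/v1/intel/x/scheduler/run?{query}"


def scheduler_status_url(base_url=BASE_URL):
    """URL for GET /api/v1/intel/x/scheduler/status."""
    return f"{base_url}/api/v1/intel/x/scheduler/status"
```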

Persistence and Migrations

Runtime analysis history is persisted in analysis_records (SQLite by default). Audit events are persisted in audit_events. Apply the schema migrations with Alembic:

alembic upgrade head

Security and Spend Controls

Configure optional API key enforcement and endpoint spend controls in .env:

  • REQUIRE_API_KEY
  • API_KEYS
  • DAILY_SPEND_CAP_POINTS
  • RATE_LIMIT_MEDIA_REQUESTS
  • RATE_LIMIT_BATCH_REQUESTS
  • RATE_LIMIT_INTEL_REQUESTS
  • X_COST_GUARD_ENABLED
  • X_MAX_REQUESTS_PER_RUN
  • CONSENSUS_ENABLED
  • COPYLEAKS_API_KEY
  • REALITY_DEFENDER_API_KEY
  • SCHEDULER_ENABLED
  • SCHEDULER_HANDLES
  • SCHEDULER_MONTHLY_REQUEST_CAP
  • SCHEDULER_KILL_SWITCH_ON_CAP
  • SCHEDULER_USAGE_FILE
  • RUN_SCHEDULER_IN_API
  • WORKER_ENABLE_SCHEDULER
  • WORKER_DRAIN_WEBHOOK_QUEUE
  • WORKER_TICK_SECONDS
  • WEBHOOK_URLS
  • WEBHOOK_RETRY_ATTEMPTS
  • WEBHOOK_RETRY_BACKOFF_SECONDS
  • WEBHOOK_QUEUE_FILE
  • WEBHOOK_DEAD_LETTER_FILE
  • AUDIT_EVENTS_ENABLED
  • AUDIT_LOG_HTTP_REQUESTS
  • AUDIT_ACTOR_HEADER
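For scripts that need to read the same settings, the sketch below parses a few of the variables above from the environment. It is an illustration only: how the application itself parses these values is not documented here, the comma-separated list format is an assumption, and the defaults are hypothetical.

```python
import os


def env_bool(name, default=False, environ=os.environ):
    """Interpret common truthy spellings; unset variables fall back to default."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def env_int(name, default, environ=os.environ):
    """Parse an integer setting; unset or empty values fall back to default."""
    raw = environ.get(name)
    return int(raw) if raw not in (None, "") else default


def env_list(name, environ=os.environ):
    """Split a comma-separated value (assumed format) into trimmed entries."""
    raw = environ.get(name, "")
    return [part.strip() for part in raw.split(",") if part.strip()]


# Hypothetical defaults, shown only to demonstrate the helpers.
require_api_key = env_bool("REQUIRE_API_KEY", default=False)
max_requests_per_run = env_int("X_MAX_REQUESTS_PER_RUN", default=4)
api_keys = env_list("API_KEYS")
```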

Documentation

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

Source Distribution

ai_provenance_tracker-1.0.1.tar.gz (263.9 kB)

Uploaded Source

Built Distribution

ai_provenance_tracker-1.0.1-py3-none-any.whl (102.8 kB)

Uploaded Python 3

File details

Details for the file ai_provenance_tracker-1.0.1.tar.gz.

File metadata

  • Download URL: ai_provenance_tracker-1.0.1.tar.gz
  • Upload date:
  • Size: 263.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ai_provenance_tracker-1.0.1.tar.gz
Algorithm    Hash digest
SHA256       759c84fd96fd41a85a1478aebece0f40a4297d3e3cc90f74fb4960c0f77a0df8
MD5          852eee42bf6eb688fb72571e1fe2d8ff
BLAKE2b-256  87fb40758e08d03647a095fdf6f8330595626b8ef0d27da21b1ce3cacf0dd79e
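A downloaded archive can be checked against the SHA256 digest listed above with Python's hashlib. This is a minimal sketch; the helper names are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed above for ai_provenance_tracker-1.0.1.tar.gz.
EXPECTED_SHA256 = "759c84fd96fd41a85a1478aebece0f40a4297d3e3cc90f74fb4960c0f77a0df8"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_expected("./ai_provenance_tracker-1.0.1.tar.gz") should return True for an intact download of that file.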

Provenance

The following attestation bundles were made for ai_provenance_tracker-1.0.1.tar.gz:

Publisher: publish-pypi.yml on ogulcanaydogan/AI-Provenance-Tracker

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ai_provenance_tracker-1.0.1-py3-none-any.whl.

File hashes

Hashes for ai_provenance_tracker-1.0.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       b83a4fd09a19794d6e9dd03af566772f8d160720a4caee8cf16bd19eb304b9ff
MD5          6e4c4f72f59c2f470f05f7232b5d2d8a
BLAKE2b-256  c39409ed9449317c37267534787c2626ce90aa4a04929f4c0196a24ff5475b8b

Provenance

The following attestation bundles were made for ai_provenance_tracker-1.0.1-py3-none-any.whl:

Publisher: publish-pypi.yml on ogulcanaydogan/AI-Provenance-Tracker
