Causal-Multimodal Engine for Creative Performance Attribution
The engine that sees your creatives (Gemini Embedding 2), extracts their DNA automatically, and proves what causes performance.
Not vibes. Not correlations. Proof.
Everyone knows which creatives performed. Nobody knows why. OmniProof does.
Upload any creative — video, image, PDF. OmniProof embeds it (Gemini Embedding 2), extracts structured metadata automatically, estimates causal effects with Double Machine Learning, and checks brand compliance via RAG. The output isn't a dashboard of correlations. It's proof: "blue backgrounds cause a +12% CTR uplift for 18-24, controlling for platform, budget, and seasonality."
Quick Start
pip install omni-proof
Or install from source
git clone https://github.com/navidgh66/omni_proof.git
cd omni_proof
pip install -e ".[dev]"
Requires Python 3.11+. The causal analysis layer runs entirely on local data and needs no API keys. The full pipeline requires a Gemini API key and a Pinecone account.
Try it now — zero setup, zero API keys:
python examples/demo.py # 9-stage pipeline demo in ~4 seconds
Or use it as a library:
import pandas as pd
from omni_proof import DMLEstimator
data = pd.read_csv("examples/data/campaign_performance.csv")
estimator = DMLEstimator(cv=5, n_estimators=50)
# Does fast pacing *cause* higher CTR?
ate = estimator.estimate_ate(data, "fast_pacing", "ctr",
    ["platform", "audience_segment", "daily_budget_usd"])
print(f"ATE: {ate.ate:+.4f} (p={ate.p_value:.4f})")
# ATE: +0.0189 (p=0.0001)
# For which segments?
cate = estimator.estimate_cate(data, "fast_pacing", "ctr",
    ["platform", "daily_budget_usd"], segment_col="audience_segment")
for seg, eff in cate.segments.items():
    print(f" {seg}: {eff.effect:+.4f} (p={eff.p_value:.4f})")
# Gen-Z 18-24: +0.032 (p=0.001) — strong
# Millennials: +0.015 (p=0.023) — moderate
# Gen-X 35-44: +0.003 (p=0.412) — not significant
Full pipeline example (requires API keys)
import asyncio
from pathlib import Path

from omni_proof import (
    BrandExtractor, ComplianceChain, GeminiClient, Settings,
)
from omni_proof.storage.memory_store import InMemoryVectorStore
from omni_proof.rag.brand_retriever import BrandRetriever

settings = Settings()  # reads OMNI_PROOF_* env vars
client = GeminiClient(api_key=settings.gemini_api_key)
store = InMemoryVectorStore()

async def main():
    # Extract brand identity from assets
    extractor = BrandExtractor(client, client, store)
    profile = await extractor.extract("AcmeCorp",
        [Path("brand_guide.pdf"), Path("logo.png")])
    # Check a new creative for brand compliance
    retriever = BrandRetriever(client, store)
    chain = ComplianceChain(client, retriever)
    report = await chain.check_compliance("ad_001", Path("new_ad.jpg"))
    print(f"Compliant: {report.passed} (score: {report.score})")

asyncio.run(main())
Hands-On Playground
The examples/playground.ipynb notebook walks through every OmniProof capability interactively — from embedding creatives into Pinecone, to causal analysis, to DICE-DML on A/B video pairs.
| Walkthrough | What it shows | API keys? |
|---|---|---|
| 0. Offline Demo | All 9 pipeline stages in ~4s | No |
| 1. Embed into Pinecone | Gemini Embedding 2 → Pinecone storage | Yes |
| 2. Extract Metadata | Structured creative metadata via Flash Lite | Yes |
| 3. Brand Identity | Multimodal brand profile extraction | Yes |
| 4. Brand Compliance | RAG: embed guidelines → retrieve → evaluate | Yes |
| 5. Causal Analysis | DAG → DML → ATE/CATE → refutation | No |
| 6. Embeddings + Causal | Merge Pinecone embeddings with campaign data | Yes |
| 7. DICE-DML | Counterfactual pairs → treatment fingerprint → visual ATE | Yes |
| 8. Design Brief | Causal insights → creative prompt generation | No |
| 9. API Server | FastAPI endpoints for all capabilities | Yes |
pip install -e ".[dev]"
jupyter notebook examples/playground.ipynb
What's in examples/
examples/
  creatives/
    runner_sunrise_*.png                  # 10 image creatives (one per concept)
    trail_epic_*.png
    hiit_studio_*.png
    ...
    runner_sunrise_fast_pacing_A.mp4      # A/B video variants
    runner_sunrise_slow_pacing_B.mp4      # (fast_pacing treatment)
    basketball_court_fast_pacing_A.mp4
    basketball_court_slow_pacing_B.mp4
  data/
    campaign_performance.csv              # 1,000 rows with planted causal effects
    brand_profile.json                    # Velocity Sportswear brand profile
    brand_guidelines.json                 # 12 brand rules for RAG
    compliance_samples.json               # 5 compliance reports (PASS/WARN/FAIL)
    creative_metadata_samples.json        # 14 records (10 images + 4 videos)
Highlights
- Causal Engine -- DML + refutation tests isolate true treatment effects from confounders. Not correlations.
- Multimodal Embeddings -- Gemini Embedding 2 maps video, images, audio, PDFs, and text into a shared 3072-dim space.
- Brand Intelligence -- Extract structured brand guidelines from any asset, then auto-check new creatives for compliance.
- DICE-DML -- Disentangle visual confounders from treatment signals using counterfactual embedding pairs.
- Creative Generation -- Causal insights feed directly into optimized creative prompts.
- REST API -- 12 endpoints covering brand extraction, compliance, causal analysis, and generation.
- Modular -- Use the full pipeline or any layer independently as a library.
How It Works
Upload creatives (video, image, PDF, audio)
             |
             v
+---------------------------+
|  Gemini Embedding 2       |  3072-dim multimodal embeddings
|  Gemini 3.1 Flash Lite    |  Structured feature extraction
+------------+--------------+
             |
      +------+------+
      v             v
  Pinecone       SQL DB
  (vectors)      (metadata + outcomes)
      |             |
      +------+------+
             v
+---------------------------+
|  Causal Engine            |  DAG -> Identify -> DML -> Refute
|  (DoWhy + EconML)         |  DICE-DML for visual embeddings
+------------+--------------+
             |
      +------+-----------+
      v                  v
Brand Compliance    Creative Generation
(RAG retrieval)     (causal-informed prompts)
Architecture
OmniProof is organized into five layers. Each can be used independently:
| Layer | Module | Key Classes |
|---|---|---|
| Ingestion | omni_proof.ingestion | GeminiClient, AssetPreprocessor, IngestPipeline |
| Storage | omni_proof.storage | PineconeVectorStore, InMemoryVectorStore, RelationalStore |
| Causal | omni_proof.causal | CausalDAGBuilder, DMLEstimator, CausalRefuter, VisualDMLEstimator |
| Orchestration | omni_proof.orchestration | ComplianceChain, InsightSynthesizer, BrandExtractor |
| API | omni_proof.api | FastAPI app, routes, GenerativePromptBuilder |
Key Abstractions
| Interface | Purpose | Implementations |
|---|---|---|
| EmbeddingProvider | Generate embeddings from any content | GeminiClient |
| VectorStore | Store and search vectors | PineconeVectorStore, InMemoryVectorStore |
| Estimator | Estimate causal effects | DMLEstimator |
Configuration
Set environment variables with the OMNI_PROOF_ prefix, or pass them programmatically:
OMNI_PROOF_GEMINI_API_KEY=AIza...
OMNI_PROOF_PINECONE_API_KEY=pcsk_...
OMNI_PROOF_PINECONE_INDEX_HOST=https://my-index-abc123.svc.pinecone.io
OMNI_PROOF_DATABASE_URL=sqlite+aiosqlite:///./omni_proof.db # default
| Variable | Required For | Where to Get |
|---|---|---|
| OMNI_PROOF_GEMINI_API_KEY | Embeddings + extraction | Google AI Studio |
| OMNI_PROOF_PINECONE_API_KEY | Vector storage | Pinecone Console |
| OMNI_PROOF_PINECONE_INDEX_HOST | Vector storage | Pinecone Console |
| OMNI_PROOF_DATABASE_URL | Relational storage | PostgreSQL or SQLite URI |
Causal Methodology
OmniProof implements a four-stage causal pipeline:
- Model -- Build a DAG mapping treatments, outcomes, and confounders
- Identify -- Apply the backdoor criterion to find valid adjustment sets
- Estimate -- Double Machine Learning (Neyman orthogonalization) via EconML
- Refute -- Placebo tests, subset validation, and random confounder checks
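The Estimate and Refute stages can be sketched on synthetic data with the standard library alone. This is an illustration of the partialling-out idea behind Double Machine Learning, not OmniProof's implementation: it assumes a single confounder and swaps in simple linear nuisance models where the package relies on EconML. All function names and the data-generating process are made up for the example.

```python
import random


def ols_fit(x, y):
    """One-feature least squares: returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def dml_effect(x, t, y, n_folds=5):
    """Cross-fitted partialling-out estimate of the effect of t on y."""
    n = len(x)
    res_t, res_y = [0.0] * n, [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        x_train = [x[i] for i in train]
        # Nuisance models are fit on the training folds only.
        a_t, b_t = ols_fit(x_train, [t[i] for i in train])
        a_y, b_y = ols_fit(x_train, [y[i] for i in train])
        # Residualize treatment and outcome on the held-out fold.
        for i in held_out:
            res_t[i] = t[i] - (a_t + b_t * x[i])
            res_y[i] = y[i] - (a_y + b_y * x[i])
    # Final stage: regress outcome residuals on treatment residuals.
    return sum(a * b for a, b in zip(res_t, res_y)) / sum(a * a for a in res_t)


# Synthetic data with a planted effect of 2.0 and one confounder x.
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
t = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [2.0 * ti + 1.5 * xi + rng.gauss(0, 1) for ti, xi in zip(t, x)]

naive = ols_fit(t, y)[1]          # ignores the confounder, so it is biased
theta = dml_effect(x, t, y)       # adjusts for the confounder

# Placebo refutation: a shuffled treatment should show roughly no effect.
placebo_t = t[:]
rng.shuffle(placebo_t)
placebo = dml_effect(x, placebo_t, y)

print(f"naive={naive:.3f} dml={theta:.3f} placebo={placebo:.3f}")
```

The naive slope absorbs part of the confounder's influence, while the cross-fitted estimate should land near the planted value and the placebo estimate near zero.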
For visual embeddings where treatment and confounders are entangled, DICE-DML generates counterfactual pairs, isolates treatment fingerprints via vector subtraction, and applies orthogonal projection before estimation.
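The subtraction and projection steps can be illustrated with toy vectors. The sketch below is one plausible reading of the description above, not the package's VisualDMLEstimator: it assumes the fingerprint is the normalized mean difference between paired A/B embeddings and that projection removes that single direction. Function names and the 3-dim vectors are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def treatment_fingerprint(pairs):
    """Unit vector along the mean (treated - control) embedding difference."""
    dims = len(pairs[0][0])
    mean_diff = [
        sum(treated[d] - control[d] for treated, control in pairs) / len(pairs)
        for d in range(dims)
    ]
    return normalize(mean_diff)


def project_out(v, direction):
    """Remove the component of v along a unit-length direction."""
    scale = dot(v, direction)
    return [x - scale * d for x, d in zip(v, direction)]


# Toy counterfactual pairs: same concept, with and without the treatment.
pairs = [
    ([0.9, 0.2, 0.4], [0.1, 0.2, 0.4]),
    ([0.8, 0.7, 0.1], [0.1, 0.6, 0.1]),
]
fingerprint = treatment_fingerprint(pairs)

# The projected embedding keeps confounder information but carries
# no component along the treatment direction.
confounder_part = project_out(pairs[0][0], fingerprint)
print(fingerprint)
print(confounder_part)
```

The projected vectors could then serve as controls in the estimation stage, with the treatment signal no longer leaking in through the embedding.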
Gemini Embedding 2
All modalities map to the same 3072-dimensional semantic space via Gemini Embedding 2:
| Modality | Limit |
|---|---|
| Text | 8,192 tokens |
| Images | 6 per request |
| Video | 80s (with audio) / 120s (without) |
| Audio | 80s |
| PDF | 1 document, 6 pages |
| Output | 3,072 dims (Matryoshka: truncate to 1536 / 768 / 128) |
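The Matryoshka truncation in the last row can be sketched as keeping the leading dimensions of a vector. The helper below is a hypothetical illustration, not part of omni_proof; re-normalizing after truncation is an assumption, made so that cosine and dot-product scores stay comparable across sizes.

```python
import math


def truncate_embedding(vec: list[float], dims: int) -> list[float]:
    """Keep the first `dims` values and rescale the result to unit length."""
    if dims <= 0 or dims > len(vec):
        raise ValueError(f"dims must be between 1 and {len(vec)}")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head


# Stand-in values; a real vector would come from the embedding model.
full = [math.sin(i + 1) for i in range(3072)]
small = truncate_embedding(full, 768)
print(len(small))
```

Shorter vectors reduce storage and search cost in the vector store, typically at some cost in retrieval quality.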
API Reference
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| POST | /api/v1/brand/extract | Extract brand profile from uploaded assets |
| POST | /api/v1/brand/update/{id} | Update brand profile with new assets |
| GET | /api/v1/brand/profile/{id} | Retrieve a brand profile |
| POST | /api/v1/compliance/check | Check creative for brand compliance |
| GET | /api/v1/compliance/reports | Historical compliance reports |
| POST | /api/v1/causal/analyze | Trigger causal analysis |
| GET | /api/v1/causal/effects | List estimated causal effects |
| GET | /api/v1/causal/effects/{treatment} | CATE breakdown by segment |
| GET | /api/v1/insights/briefs | Design briefs from causal data |
| GET | /api/v1/insights/segments | Effects by audience segment |
| POST | /api/v1/generative/prompt | Generate optimized creative prompt |
Start the server:
uvicorn omni_proof.api.app:create_app --factory --reload
curl localhost:8000/health # {"status": "ok"}
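The same health check can be scripted from Python. This is a small sketch using only the standard library; the helper names are hypothetical, and it assumes the server listens on localhost:8000 and returns the `{"status": "ok"}` body shown in the curl example.

```python
import json
import urllib.request


def health_url(base_url: str = "http://localhost:8000") -> str:
    """Build the health endpoint URL from a base URL."""
    return base_url.rstrip("/") + "/health"


def is_healthy(body: bytes) -> bool:
    """True when the response body is JSON with status == "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_server(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return False instead of raising when the server is unreachable."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except OSError:
        return False


if __name__ == "__main__":
    print("server up:", check_server())
```

A check like this is handy in deployment scripts that should wait for the API before sending requests.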
Documentation
For in-depth guides, see the docs/ directory:
| Document | Description |
|---|---|
| Getting Started | Installation, environment setup, and first run |
| Architecture | System design, layer overview, and data flow |
| API Reference | Complete class and method reference |
| Configuration | Environment variables, settings, and constants |
| Causal Methodology | DML, CATE, DICE-DML, and refutation explained |
| REST API | HTTP endpoints, request/response schemas |
| Examples & Tutorials | Walkthrough of demo, playground, and common workflows |
Tech Stack
| Component | Technology |
|---|---|
| Embeddings | Gemini Embedding 2 |
| Structured extraction | Gemini 3.1 Flash Lite |
| Vector DB | Pinecone Serverless |
| Relational DB | PostgreSQL / SQLite |
| Causal inference | DoWhy + EconML |
| API | FastAPI |
| ML models | LightGBM |
| Schemas | Pydantic v2 + SQLAlchemy 2.0 |
Testing
pytest tests/unit/ -v # 150 unit tests
pytest tests/integration/ -v # 53 integration tests
pytest tests/ -v # All 203 tests
ruff check src/ tests/ # Lint
Contributing
See CONTRIBUTING.md for development setup, testing, and PR guidelines.
License
MIT -- OmniProof Contributors