Causal-Multimodal Engine for Creative Performance Attribution
Reason this release was yanked:
Pre-release, not ready for use
From correlation to causation.
Upload creatives, discover why they perform.
Installation · Quick Start · How It Works · Features · API · Contributing
OmniProof is an open-source Python engine that answers why creative assets perform differently. It replaces gut-feel marketing analytics with rigorous causal inference -- moving from "ads with blue backgrounds got more clicks" to "blue backgrounds cause a +12% CTR uplift for the 18-24 segment, controlling for platform, budget, and seasonality."
It combines Gemini Embedding 2 for native multimodal understanding, Double Machine Learning for causal estimation, and RAG-based brand compliance into a single, modular pipeline.
Highlights
- Causal Engine -- DML + refutation tests isolate treatment effects from confounders, rather than reporting raw correlations.
- Multimodal Embeddings -- Gemini Embedding 2 maps video, images, audio, PDFs, and text into a shared 3072-dim space.
- Brand Intelligence -- Extract structured brand guidelines from any asset, then auto-check new creatives for compliance.
- DICE-DML -- Disentangle visual confounders from treatment signals using counterfactual embedding pairs.
- Creative Generation -- Causal insights feed directly into optimized creative prompts.
- REST API -- 12 endpoints covering brand extraction, compliance, causal analysis, and generation.
- Modular -- Use the full pipeline or any layer independently as a library.
Installation
pip install omni-proof
Or install from source:
git clone https://github.com/navidgh66/omni_proof.git
cd omni_proof
pip install -e ".[dev]"
Requires Python 3.11+. For the full pipeline you'll need a Gemini API key and a Pinecone account. The causal analysis layer works with local data only -- no API keys needed.
Quick Start
import asyncio
from pathlib import Path

import pandas as pd

from omni_proof import BrandExtractor, ComplianceChain, DMLEstimator, GeminiClient, Settings
from omni_proof.rag.brand_retriever import BrandRetriever
from omni_proof.storage.memory_store import InMemoryVectorStore

settings = Settings(
    gemini_api_key="AIza...",
    pinecone_api_key="pcsk_...",
    pinecone_index_host="https://my-index.svc.pinecone.io",
)
client = GeminiClient(api_key=settings.gemini_api_key)
store = InMemoryVectorStore()

# Components for brand extraction and compliance checking
extractor = BrandExtractor(embedding_provider=client, gemini_client=client, vector_store=store)
retriever = BrandRetriever(gemini_client=client, vector_store=store)
chain = ComplianceChain(gemini_client=client, brand_retriever=retriever)


async def main() -> None:
    # 1. Extract brand identity from assets
    profile = await extractor.extract("AcmeCorp", [Path("brand_guide.pdf"), Path("logo.png")])

    # 2. Check a new creative for brand compliance
    report = await chain.check_compliance("ad_001", Path("new_ad.jpg"))
    print(f"Compliant: {report.passed} (score: {report.score})")


# extract() and check_compliance() are awaited, so they need an event loop
asyncio.run(main())

# 3. Estimate causal effect of a creative feature (no API keys needed)
data = pd.read_csv("campaign_data.csv")
estimator = DMLEstimator(cv=5, n_estimators=50)
ate = estimator.estimate_ate(data, "fast_pacing", "ctr", ["platform", "audience_segment", "budget"])
print(f"ATE: {ate.ate:+.3f} (p={ate.p_value:.4f})")
Or start the API server:
uvicorn omni_proof.api.app:create_app --factory --reload
curl localhost:8000/health # {"status": "ok"}
How It Works
Upload creatives (video, image, PDF, audio)
             |
             v
+---------------------------+
|  Gemini Embedding 2       |  3072-dim multimodal embeddings
|  Gemini 3.1 Flash Lite    |  Structured feature extraction
+------------+--------------+
             |
      +------+------+
      v             v
  Pinecone       SQL DB
  (vectors)   (metadata + outcomes)
      |             |
      +------+------+
             v
+---------------------------+
|  Causal Engine            |  DAG -> Identify -> DML -> Refute
|  (DoWhy + EconML)         |  DICE-DML for visual embeddings
+------------+--------------+
             |
      +------+-----------+
      v                  v
Brand Compliance   Creative Generation
(RAG retrieval)    (causal-informed prompts)
Configuration
Set environment variables with the OMNI_PROOF_ prefix, or pass them programmatically:
OMNI_PROOF_GEMINI_API_KEY=AIza...
OMNI_PROOF_PINECONE_API_KEY=pcsk_...
OMNI_PROOF_PINECONE_INDEX_HOST=https://my-index-abc123.svc.pinecone.io
OMNI_PROOF_DATABASE_URL=sqlite+aiosqlite:///./omni_proof.db # default
| Variable | Required For | Where to Get |
|---|---|---|
| OMNI_PROOF_GEMINI_API_KEY | Embeddings + extraction | Google AI Studio |
| OMNI_PROOF_PINECONE_API_KEY | Vector storage | Pinecone Console |
| OMNI_PROOF_PINECONE_INDEX_HOST | Vector storage | Pinecone Console |
| OMNI_PROOF_DATABASE_URL | Relational storage | PostgreSQL or SQLite URI |
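As a rough illustration of how prefixed environment variables can map onto settings fields, here is a minimal stdlib-only sketch. `SketchSettings` and `load_settings` are hypothetical names invented for this example and are not OmniProof's `Settings` implementation; only the variable names and the SQLite default are taken from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "OMNI_PROOF_"


@dataclass(frozen=True)
class SketchSettings:
    """Hypothetical stand-in for the library's Settings class."""

    gemini_api_key: Optional[str] = None
    pinecone_api_key: Optional[str] = None
    pinecone_index_host: Optional[str] = None
    database_url: str = "sqlite+aiosqlite:///./omni_proof.db"


def load_settings(environ: Mapping[str, str] = os.environ) -> SketchSettings:
    """Collect OMNI_PROOF_* variables and map them to lowercase field names."""
    known = SketchSettings.__dataclass_fields__
    values = {}
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue  # ignore unrelated variables
        field = name[len(PREFIX):].lower()
        if field in known:
            values[field] = value
    return SketchSettings(**values)


example = load_settings({"OMNI_PROOF_GEMINI_API_KEY": "AIza...", "HOME": "/root"})
print(example.gemini_api_key, example.database_url)
```

Variables that are not set fall back to the dataclass defaults, which is why the causal layer can run with no keys configured.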
Architecture
OmniProof is organized into five layers. Each can be used independently:
| Layer | Module | Key Classes |
|---|---|---|
| Ingestion | omni_proof.ingestion | GeminiClient, AssetPreprocessor, IngestPipeline |
| Storage | omni_proof.storage | PineconeVectorStore, InMemoryVectorStore, RelationalStore |
| Causal | omni_proof.causal | CausalDAGBuilder, DMLEstimator, CausalRefuter, VisualDMLEstimator |
| Orchestration | omni_proof.orchestration | ComplianceChain, InsightSynthesizer, BrandExtractor |
| API | omni_proof.api | FastAPI app, routes, GenerativePromptBuilder |
Key Abstractions
| Interface | Purpose | Implementations |
|---|---|---|
| EmbeddingProvider | Generate embeddings from any content | GeminiClient |
| VectorStore | Store and search vectors | PineconeVectorStore, InMemoryVectorStore |
| Estimator | Estimate causal effects | DMLEstimator |
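To make the "store and search vectors" abstraction concrete, the sketch below defines a small protocol and an in-memory implementation that ranks items by cosine similarity. The names `VectorStoreSketch`, `DictVectorStore`, `upsert`, and `query` are assumptions made for illustration; the actual `VectorStore` interface in OmniProof may expose different methods.

```python
import math
from typing import Protocol, Sequence


class VectorStoreSketch(Protocol):
    """Hypothetical interface; the real VectorStore methods may differ."""

    def upsert(self, item_id: str, vector: Sequence[float]) -> None: ...

    def query(self, vector: Sequence[float], top_k: int = 1) -> list[tuple[str, float]]: ...


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, defined as 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0


class DictVectorStore:
    """In-memory store that ranks items by cosine similarity."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, item_id: str, vector: Sequence[float]) -> None:
        self._vectors[item_id] = list(vector)

    def query(self, vector: Sequence[float], top_k: int = 1) -> list[tuple[str, float]]:
        scored = [(item_id, _cosine(vector, stored)) for item_id, stored in self._vectors.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


sketch_store: VectorStoreSketch = DictVectorStore()
sketch_store.upsert("logo", [1.0, 0.0, 0.0])
sketch_store.upsert("jingle", [0.0, 1.0, 0.0])
best_id, best_score = sketch_store.query([0.9, 0.1, 0.0], top_k=1)[0]
print(best_id, round(best_score, 3))
```

Coding against a protocol like this is what lets an in-memory store stand in for a hosted one during local runs and tests.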
Advanced Usage
Causal pipeline with DAG + refutation
from omni_proof import DMLEstimator
from omni_proof.causal.dag_builder import CausalDAGBuilder
from omni_proof.causal.identifier import CausalIdentifier
from omni_proof.causal.refuter import CausalRefuter

# `data` is the campaign DataFrame loaded in Quick Start
confounders = ["platform", "audience_segment", "budget"]

dag = CausalDAGBuilder()
model = dag.build_dag(data, treatment="fast_pacing", outcome="ctr", confounders=confounders)
estimand = CausalIdentifier().identify_effect(model)
ate = DMLEstimator().estimate_ate(data, "fast_pacing", "ctr", confounders)

refuter = CausalRefuter()
placebo = refuter.placebo_test(data, "fast_pacing", "ctr", confounders)
print(f"Effect: {ate.ate:+.3f}, Placebo passed: {placebo.passed}")
Brand extraction with conflict detection
# Run inside an async function; `extractor` is the BrandExtractor from Quick Start
profile = await extractor.extract("AcmeCorp", [
    Path("brand_guide.pdf"), Path("approved_ad.jpg"),
    Path("brand_video.mp4"), Path("jingle.mp3"),
])
print(f"Colors: {profile.visual_style.dominant_colors}")
print(f"Voice: {profile.voice.formality}, {profile.voice.emotional_register}")
print(f"Rules: {len(profile.rules)}")

# Update with new assets -- detects conflicts
updated, conflicts = await extractor.update(profile, [Path("new_campaign.jpg")])
for c in conflicts:
    print(f"  CONFLICT [{c.severity}] {c.dimension}: {c.existing_value} -> {c.new_value}")
Creative generation from causal insights
from omni_proof.api.generative_loop import GenerativePromptBuilder

builder = GenerativePromptBuilder()
prompt = builder.build_prompt(
    cate_insights=[{"treatment": "fast_pacing", "effect": 0.12}],
    brand_rules=[{"description": "Use blue (#004E89) as primary color"}],
    target_segment="18-24",
    objective="conversion",
    constraints=["16:9 aspect ratio", "max 15 seconds"],
)
CATE by segment + insight synthesis
from omni_proof import InsightSynthesizer

# `estimator` and `data` are the DMLEstimator and DataFrame from Quick Start
cate = estimator.estimate_cate(
    data, "fast_pacing", "ctr",
    confounder_cols=["platform", "budget"],
    segment_col="audience_segment",
)
for segment, effect in cate.segments.items():
    print(f"  {segment}: {effect.effect:+.3f} (CI: {effect.ci_lower:.3f} to {effect.ci_upper:.3f})")

brief = InsightSynthesizer(p_value_threshold=0.05).synthesize(cate)
print(f"{brief.finding} -> {brief.recommendation}")
API Reference
| Method | Endpoint | Description |
|---|---|---|
| GET | /health | Health check |
| POST | /api/v1/brand/extract | Extract brand profile from uploaded assets |
| POST | /api/v1/brand/update/{id} | Update brand profile with new assets |
| GET | /api/v1/brand/profile/{id} | Retrieve a brand profile |
| POST | /api/v1/compliance/check | Check creative for brand compliance |
| GET | /api/v1/compliance/reports | Historical compliance reports |
| POST | /api/v1/causal/analyze | Trigger causal analysis |
| GET | /api/v1/causal/effects | List estimated causal effects |
| GET | /api/v1/causal/effects/{treatment} | CATE breakdown by segment |
| GET | /api/v1/insights/briefs | Design briefs from causal data |
| GET | /api/v1/insights/segments | Effects by audience segment |
| POST | /api/v1/generative/prompt | Generate optimized creative prompt |
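For calling the server from Python rather than curl, a small stdlib helper might look like the sketch below. It only covers `/health`, whose `{"status": "ok"}` response is shown in Quick Start; request and response bodies for the other endpoints are not documented here, so they are left out. The helper names `build_health_request` and `check_health` are hypothetical, as is the default base URL.

```python
import json
from urllib.request import Request, urlopen


def build_health_request(base_url: str) -> Request:
    """Build a GET request for the health endpoint of a running server."""
    return Request(base_url.rstrip("/") + "/health", method="GET")


def check_health(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Return True if the server answers with {"status": "ok"}."""
    with urlopen(build_health_request(base_url), timeout=timeout) as response:
        return json.load(response).get("status") == "ok"


# Building the request does not touch the network; check_health() does.
request = build_health_request("http://localhost:8000/")
print(request.get_method(), request.full_url)
```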
Causal Methodology
OmniProof implements a four-stage causal pipeline:
- Model -- Build a DAG mapping treatments, outcomes, and confounders
- Identify -- Apply the backdoor criterion to find valid adjustment sets
- Estimate -- Double Machine Learning (Neyman orthogonalization) via EconML
- Refute -- Placebo tests, subset validation, and random confounder checks
For visual embeddings where treatment and confounders are entangled, DICE-DML generates counterfactual pairs, isolates treatment fingerprints via vector subtraction, and applies orthogonal projection before estimation.
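One way to read the subtraction and projection steps is sketched below on toy 4-dimensional vectors. This is an interpretation of the sentence above and not the library's `VisualDMLEstimator`: the helper names, the averaging of pair differences, and the choice to keep the projected remainder as the confounder representation are all assumptions.

```python
import math


def subtract(a, b):
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def treatment_fingerprint(pairs):
    """Average (treated - control) difference over pairs, scaled to unit length."""
    diffs = [subtract(treated, control) for treated, control in pairs]
    dim = len(diffs[0])
    mean = [sum(d[i] for d in diffs) / len(diffs) for i in range(dim)]
    norm = math.sqrt(dot(mean, mean))
    return [m / norm for m in mean]


def project_out(embedding, direction):
    """Remove the component of `embedding` along the unit vector `direction`."""
    scale = dot(embedding, direction)
    return [e - scale * d for e, d in zip(embedding, direction)]


# Toy counterfactual pairs: each (treated, control) pair differs only in the treatment
pairs = [
    ([1.0, 0.2, 0.5, 0.0], [0.0, 0.2, 0.5, 0.0]),
    ([0.9, 0.7, 0.1, 0.3], [0.1, 0.7, 0.1, 0.3]),
]
fingerprint = treatment_fingerprint(pairs)

creative = [0.6, 0.4, 0.3, 0.2]
confounder_part = project_out(creative, fingerprint)
print(fingerprint, confounder_part)
```

After projection the remainder is orthogonal to the treatment direction, so it can serve as a control representation that no longer encodes the treatment itself.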
Gemini Embedding 2
All modalities map to the same 3072-dimensional semantic space via Gemini Embedding 2:
| Modality | Limit |
|---|---|
| Text | 8,192 tokens |
| Images | 6 per request |
| Video | 80s (with audio) / 120s (without) |
| Audio | 80s |
| PDF | 1 document, 6 pages |
| Output | 3,072 dims (Matryoshka: truncate to 1536 / 768 / 128) |
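Matryoshka-style truncation means keeping only the leading components of an embedding. A minimal sketch follows, using a random vector as a stand-in for a real embedding; `truncate_embedding` is a hypothetical helper, and rescaling to unit length afterwards is a common practice when the result is used with cosine or dot-product similarity, not something this README specifies.

```python
import math
import random


def truncate_embedding(vector, dims):
    """Keep the first `dims` components and rescale to unit length."""
    if dims > len(vector):
        raise ValueError("cannot truncate to more dimensions than the vector has")
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head


rng = random.Random(0)
full = [rng.gauss(0, 1) for _ in range(3072)]  # stand-in for a real embedding
small = truncate_embedding(full, 768)
small_norm = math.sqrt(sum(x * x for x in small))
print(len(small), round(small_norm, 6))
```

Smaller vectors reduce storage and search cost in the vector store, at the price of some retrieval precision.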
Tech Stack
| Component | Technology |
|---|---|
| Embeddings | Gemini Embedding 2 |
| Structured extraction | Gemini 3.1 Flash Lite |
| Vector DB | Pinecone Serverless |
| Relational DB | PostgreSQL / SQLite |
| Causal inference | DoWhy + EconML |
| API | FastAPI |
| ML models | LightGBM |
| Schemas | Pydantic v2 + SQLAlchemy 2.0 |
Testing
pytest tests/unit/ -v # 140 unit tests
pytest tests/integration/ -v # 17 integration tests
pytest tests/ -v # All 157 tests
ruff check src/ tests/ # Lint
Contributing
See CONTRIBUTING.md for development setup, testing, and PR guidelines.
License
MIT -- OmniProof Contributors