Causal-Multimodal Engine for Creative Performance Attribution

Reason this release was yanked:

Pre-release, not ready for use

OmniProof

From correlation to causation.
Upload creatives, discover why they perform.

Installation · Quick Start · How It Works · Features · API · Contributing


OmniProof is an open-source Python engine that answers why creative assets perform differently. It replaces gut-feel marketing analytics with rigorous causal inference -- moving from "ads with blue backgrounds got more clicks" to "blue backgrounds cause a +12% CTR uplift for the 18-24 segment, controlling for platform, budget, and seasonality."

It combines Gemini Embedding 2 for native multimodal understanding, Double Machine Learning for causal estimation, and RAG-based brand compliance into a single, modular pipeline.

Highlights

  • Causal Engine -- DML + refutation tests isolate true treatment effects from confounders. Not correlations.
  • Multimodal Embeddings -- Gemini Embedding 2 maps video, images, audio, PDFs, and text into a shared 3072-dim space.
  • Brand Intelligence -- Extract structured brand guidelines from any asset, then auto-check new creatives for compliance.
  • DICE-DML -- Disentangle visual confounders from treatment signals using counterfactual embedding pairs.
  • Creative Generation -- Causal insights feed directly into optimized creative prompts.
  • REST API -- 12 endpoints covering brand extraction, compliance, causal analysis, and generation.
  • Modular -- Use the full pipeline or any layer independently as a library.

Installation

pip install omni-proof

Or install from source:

git clone https://github.com/navidgh66/omni_proof.git
cd omni_proof
pip install -e ".[dev]"

Requires Python 3.11+. For the full pipeline you'll need a Gemini API key and a Pinecone account. The causal analysis layer works with local data only -- no API keys needed.

Quick Start

import pandas as pd
from pathlib import Path
from omni_proof import BrandExtractor, ComplianceChain, DMLEstimator, GeminiClient, Settings
from omni_proof.storage.memory_store import InMemoryVectorStore
from omni_proof.rag.brand_retriever import BrandRetriever

settings = Settings(gemini_api_key="AIza...", pinecone_api_key="pcsk_...",
                    pinecone_index_host="https://my-index.svc.pinecone.io")
client = GeminiClient(api_key=settings.gemini_api_key)
store = InMemoryVectorStore()

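# Steps 1 and 2 use `await`: run them inside an async function or an
# asyncio-aware REPL (`python -m asyncio`). Step 3 is synchronous.

# 1. Extract brand identity from assets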
extractor = BrandExtractor(embedding_provider=client, gemini_client=client, vector_store=store)
profile = await extractor.extract("AcmeCorp", [Path("brand_guide.pdf"), Path("logo.png")])

# 2. Check a new creative for brand compliance
retriever = BrandRetriever(gemini_client=client, vector_store=store)
chain = ComplianceChain(gemini_client=client, brand_retriever=retriever)
report = await chain.check_compliance("ad_001", Path("new_ad.jpg"))
print(f"Compliant: {report.passed} (score: {report.score})")

# 3. Estimate causal effect of a creative feature (no API keys needed)
data = pd.read_csv("campaign_data.csv")
estimator = DMLEstimator(cv=5, n_estimators=50)
ate = estimator.estimate_ate(data, "fast_pacing", "ctr", ["platform", "audience_segment", "budget"])
print(f"ATE: {ate.ate:+.3f} (p={ate.p_value:.4f})")

Or start the API server:

uvicorn omni_proof.api.app:create_app --factory --reload
curl localhost:8000/health  # {"status": "ok"}
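
The same health check can be scripted from Python, for example in a deployment smoke test. The sketch below uses only the standard library and assumes the server listens on localhost:8000 and that /health returns the JSON body shown in the curl comment. `check_health` is a hypothetical helper, not part of OmniProof.

```python
import json
import urllib.error
import urllib.request


def check_health(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers with {"status": "ok"}.

    Hypothetical helper; the response shape is assumed from the README.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Server down, connection refused, timeout, HTTP error, or non-JSON body.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"
```

Calling `check_health()` after starting uvicorn should return `True` once the app is up; any network or parsing failure is reported as `False` rather than raised.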

How It Works

  Upload creatives (video, image, PDF, audio)
          |
          v
  +---------------------------+
  |  Gemini Embedding 2       |  3072-dim multimodal embeddings
  |  Gemini 3.1 Flash Lite    |  Structured feature extraction
  +------------+--------------+
               |
        +------+------+
        v             v
    Pinecone       SQL DB
    (vectors)      (metadata + outcomes)
        |             |
        +------+------+
               v
  +---------------------------+
  |  Causal Engine            |  DAG -> Identify -> DML -> Refute
  |  (DoWhy + EconML)         |  DICE-DML for visual embeddings
  +------------+--------------+
               |
        +------+-----------+
        v                  v
  Brand Compliance     Creative Generation
  (RAG retrieval)      (causal-informed prompts)

Configuration

Set environment variables with the OMNI_PROOF_ prefix, or pass them programmatically:

OMNI_PROOF_GEMINI_API_KEY=AIza...
OMNI_PROOF_PINECONE_API_KEY=pcsk_...
OMNI_PROOF_PINECONE_INDEX_HOST=https://my-index-abc123.svc.pinecone.io
OMNI_PROOF_DATABASE_URL=sqlite+aiosqlite:///./omni_proof.db  # default

| Variable                       | Required For            | Where to Get             |
|--------------------------------|-------------------------|--------------------------|
| OMNI_PROOF_GEMINI_API_KEY      | Embeddings + extraction | Google AI Studio         |
| OMNI_PROOF_PINECONE_API_KEY    | Vector storage          | Pinecone Console         |
| OMNI_PROOF_PINECONE_INDEX_HOST | Vector storage          | Pinecone Console         |
| OMNI_PROOF_DATABASE_URL        | Relational storage      | PostgreSQL or SQLite URI |
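
To illustrate how prefixed environment variables can map onto settings fields, here is a minimal standard-library sketch. It is not OmniProof's `Settings` class: `EnvSettings` and `load_settings` are hypothetical names, and only the variable names and the SQLite default are taken from the listing above.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping, Optional

PREFIX = "OMNI_PROOF_"


@dataclass(frozen=True)
class EnvSettings:
    """Hypothetical stand-in for the library's Settings object."""

    gemini_api_key: Optional[str] = None
    pinecone_api_key: Optional[str] = None
    pinecone_index_host: Optional[str] = None
    database_url: str = "sqlite+aiosqlite:///./omni_proof.db"  # default from the README


def load_settings(environ: Optional[Mapping[str, str]] = None) -> EnvSettings:
    """Build settings from OMNI_PROOF_-prefixed variables, keeping defaults otherwise."""
    env = os.environ if environ is None else environ
    values = {}
    for field in fields(EnvSettings):
        raw = env.get(PREFIX + field.name.upper())
        if raw:  # unset or empty variables fall back to the default
            values[field.name] = raw
    return EnvSettings(**values)
```

With only `OMNI_PROOF_GEMINI_API_KEY` set, `load_settings()` returns an object whose Pinecone fields are `None` and whose `database_url` is the SQLite default, which matches the note that the causal layer can run without API keys.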

Architecture

OmniProof is organized into five layers. Each can be used independently:

| Layer         | Module                   | Key Classes                                                        |
|---------------|--------------------------|--------------------------------------------------------------------|
| Ingestion     | omni_proof.ingestion     | GeminiClient, AssetPreprocessor, IngestPipeline                    |
| Storage       | omni_proof.storage       | PineconeVectorStore, InMemoryVectorStore, RelationalStore          |
| Causal        | omni_proof.causal        | CausalDAGBuilder, DMLEstimator, CausalRefuter, VisualDMLEstimator  |
| Orchestration | omni_proof.orchestration | ComplianceChain, InsightSynthesizer, BrandExtractor                |
| API           | omni_proof.api           | FastAPI app, routes, GenerativePromptBuilder                       |

Key Abstractions

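| Interface         | Purpose                              | Implementations                          |
|-------------------|--------------------------------------|------------------------------------------|
| EmbeddingProvider | Generate embeddings from any content | GeminiClient                             |
| VectorStore       | Store and search vectors             | PineconeVectorStore, InMemoryVectorStore |
| Estimator         | Estimate causal effects              | DMLEstimator                             |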
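
The value of an interface such as VectorStore is that calling code does not care which backend it talks to. The sketch below shows that idea with a structural `Protocol` and a dictionary-backed store ranked by cosine similarity. All names and method signatures here (`VectorStoreLike`, `upsert`, `search`, `TinyMemoryStore`) are illustrative assumptions, not OmniProof's actual interface.

```python
import math
from typing import Dict, List, Optional, Protocol, Tuple


class VectorStoreLike(Protocol):
    """Hypothetical shape of a vector store; method names are illustrative."""

    def upsert(self, item_id: str, vector: List[float]) -> None: ...

    def search(self, query: List[float], top_k: int = 3) -> List[Tuple[str, float]]: ...


def _cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyMemoryStore:
    """Dictionary-backed store that ranks items by cosine similarity."""

    def __init__(self) -> None:
        self._vectors: Dict[str, List[float]] = {}

    def upsert(self, item_id: str, vector: List[float]) -> None:
        self._vectors[item_id] = list(vector)

    def search(self, query: List[float], top_k: int = 3) -> List[Tuple[str, float]]:
        scored = [(item_id, _cosine(query, vec)) for item_id, vec in self._vectors.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


def nearest_id(store: VectorStoreLike, query: List[float]) -> Optional[str]:
    """Works with any object that satisfies the protocol, in-memory or remote."""
    hits = store.search(query, top_k=1)
    return hits[0][0] if hits else None
```

For example, after `store.upsert("blue_ad", [1.0, 0.0, 0.0])` and `store.upsert("red_ad", [0.0, 1.0, 0.0])`, the call `nearest_id(store, [0.9, 0.1, 0.0])` picks `"blue_ad"`. A remote-backed store could be swapped in without touching `nearest_id`.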

Advanced Usage

Causal pipeline with DAG + refutation

import pandas as pd

from omni_proof import DMLEstimator
from omni_proof.causal.dag_builder import CausalDAGBuilder
from omni_proof.causal.identifier import CausalIdentifier
from omni_proof.causal.refuter import CausalRefuter

data = pd.read_csv("campaign_data.csv")
confounders = ["platform", "audience_segment", "budget"]

# Model: build the DAG from treatment, outcome, and confounders
dag = CausalDAGBuilder()
model = dag.build_dag(data, treatment="fast_pacing", outcome="ctr", confounders=confounders)

# Identify: derive the estimand from the DAG
estimand = CausalIdentifier().identify_effect(model)

# Estimate: average treatment effect via DML
ate = DMLEstimator().estimate_ate(data, "fast_pacing", "ctr", confounders)

# Refute: re-run the analysis with a placebo treatment
refuter = CausalRefuter()
placebo = refuter.placebo_test(data, "fast_pacing", "ctr", confounders)
print(f"Effect: {ate.ate:+.3f}, Placebo passed: {placebo.passed}")
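
For intuition about what a placebo test checks, the standalone sketch below shuffles the treatment column so that any real link to the outcome is broken, then re-estimates the effect. It uses a plain OLS slope on synthetic data and ignores confounders, so it is a simplified illustration of the permutation idea and not the logic inside `CausalRefuter.placebo_test`. The helper names `slope` and `placebo_check` are hypothetical.

```python
import random


def slope(x, y):
    """OLS slope of y on x (single regressor with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def placebo_check(treatment, outcome, n_shuffles=50, seed=0):
    """Return (real slope, mean slope over shuffled placebo treatments)."""
    rng = random.Random(seed)
    real = slope(treatment, outcome)
    placebo_effects = []
    for _ in range(n_shuffles):
        fake = treatment[:]
        rng.shuffle(fake)  # permuting breaks any true link to the outcome
        placebo_effects.append(slope(fake, outcome))
    return real, sum(placebo_effects) / n_shuffles


# Synthetic data with a true effect of 0.5 (assumed for illustration only).
rng = random.Random(42)
t = [rng.gauss(0, 1) for _ in range(2000)]
y = [0.5 * ti + rng.gauss(0, 1) for ti in t]
real, placebo = placebo_check(t, y)
print(f"real effect: {real:+.3f}, mean placebo effect: {placebo:+.3f}")
```

A placebo effect close to zero alongside a clearly non-zero real effect is the pattern a passing test looks for; a sizeable placebo effect would suggest the estimate is picking up something other than the treatment.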

Brand extraction with conflict detection

# Reuses `extractor` and `Path` from Quick Start; `await` needs an async context
profile = await extractor.extract("AcmeCorp", [
    Path("brand_guide.pdf"), Path("approved_ad.jpg"),
    Path("brand_video.mp4"), Path("jingle.mp3"),
])
print(f"Colors: {profile.visual_style.dominant_colors}")
print(f"Voice: {profile.voice.formality}, {profile.voice.emotional_register}")
print(f"Rules: {len(profile.rules)}")

# Update with new assets -- detects conflicts
updated, conflicts = await extractor.update(profile, [Path("new_campaign.jpg")])
for c in conflicts:
    print(f"  CONFLICT [{c.severity}] {c.dimension}: {c.existing_value} -> {c.new_value}")

Creative generation from causal insights

from omni_proof.api.generative_loop import GenerativePromptBuilder

builder = GenerativePromptBuilder()
prompt = builder.build_prompt(
    cate_insights=[{"treatment": "fast_pacing", "effect": 0.12}],
    brand_rules=[{"description": "Use blue (#004E89) as primary color"}],
    target_segment="18-24",
    objective="conversion",
    constraints=["16:9 aspect ratio", "max 15 seconds"],
)

CATE by segment + insight synthesis

from omni_proof import InsightSynthesizer

# Reuses `estimator` and `data` from Quick Start
cate = estimator.estimate_cate(data, "fast_pacing", "ctr",
                                confounder_cols=["platform", "budget"],
                                segment_col="audience_segment")
for segment, effect in cate.segments.items():
    print(f"  {segment}: {effect.effect:+.3f} (CI: {effect.ci_lower:.3f} to {effect.ci_upper:.3f})")

brief = InsightSynthesizer(p_value_threshold=0.05).synthesize(cate)
print(f"{brief.finding} -> {brief.recommendation}")
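
As a rough picture of what a per-segment effect with a confidence interval means, the sketch below computes a treated-minus-control difference in means for each segment on synthetic data. It assumes the treatment was randomly assigned, so no confounder adjustment is needed. That is a deliberate simplification: `estimate_cate` adjusts for confounders, and nothing here reflects its internals. `segment_effects` is a hypothetical helper.

```python
import math
import random
from collections import defaultdict


def segment_effects(rows, z=1.96):
    """Per-segment difference in means with a normal-approximation 95% CI.

    rows: iterable of (segment, treated in {0, 1}, outcome).
    Returns {segment: (effect, ci_lower, ci_upper)}.
    """
    groups = defaultdict(lambda: {0: [], 1: []})
    for segment, treated, outcome in rows:
        groups[segment][treated].append(outcome)

    results = {}
    for segment, arms in groups.items():
        stats = {}
        for arm, values in arms.items():
            n = len(values)
            mean = sum(values) / n
            var = sum((v - mean) ** 2 for v in values) / (n - 1)
            stats[arm] = (mean, var / n)  # mean and squared standard error
        effect = stats[1][0] - stats[0][0]
        se = math.sqrt(stats[1][1] + stats[0][1])
        results[segment] = (effect, effect - z * se, effect + z * se)
    return results


# Synthetic data: the lift exists for 18-24 only (true effects 0.30 and 0.00, assumed).
rng = random.Random(7)
true_effect = {"18-24": 0.30, "25-34": 0.00}
rows = []
for segment, lift in true_effect.items():
    for _ in range(2000):
        treated = rng.randint(0, 1)
        rows.append((segment, treated, lift * treated + rng.gauss(0, 0.5)))

effects = segment_effects(rows)
for segment, (effect, lo, hi) in effects.items():
    print(f"  {segment}: {effect:+.3f} (CI: {lo:.3f} to {hi:.3f})")
```

A segment whose interval sits well away from zero is a candidate for a targeted recommendation, while an interval straddling zero gives no evidence of an effect in that segment.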

API Reference

| Method | Endpoint                           | Description                                |
|--------|------------------------------------|--------------------------------------------|
| GET    | /health                            | Health check                               |
| POST   | /api/v1/brand/extract              | Extract brand profile from uploaded assets |
| POST   | /api/v1/brand/update/{id}          | Update brand profile with new assets       |
| GET    | /api/v1/brand/profile/{id}         | Retrieve a brand profile                   |
| POST   | /api/v1/compliance/check           | Check creative for brand compliance        |
| GET    | /api/v1/compliance/reports         | Historical compliance reports              |
| POST   | /api/v1/causal/analyze             | Trigger causal analysis                    |
| GET    | /api/v1/causal/effects             | List estimated causal effects              |
| GET    | /api/v1/causal/effects/{treatment} | CATE breakdown by segment                  |
| GET    | /api/v1/insights/briefs            | Design briefs from causal data             |
| GET    | /api/v1/insights/segments          | Effects by audience segment                |
| POST   | /api/v1/generative/prompt          | Generate optimized creative prompt         |

Causal Methodology

OmniProof implements a four-stage causal pipeline:

  1. Model -- Build a DAG mapping treatments, outcomes, and confounders
  2. Identify -- Apply the backdoor criterion to find valid adjustment sets
  3. Estimate -- Double Machine Learning (Neyman orthogonalization) via EconML
  4. Refute -- Placebo tests, subset validation, and random confounder checks
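
The Estimate stage can be illustrated with a toy version of the partialling-out idea behind Double Machine Learning: predict the treatment and the outcome from the confounders on held-out folds, then regress the outcome residuals on the treatment residuals. The sketch below uses a single confounder and linear nuisance models on synthetic data, purely as an assumption to keep it dependency-free. OmniProof delegates this step to EconML, and the tech stack lists LightGBM for ML models, so this is not the library's implementation.

```python
import random


def fit_line(x, y):
    """Return (slope, intercept) of the OLS line y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sxx
    return b, my - b * mx


def residuals_cross_fit(x, target, n_folds=2):
    """Out-of-fold residuals: each point is predicted by a model fit on the other folds."""
    n = len(x)
    res = [0.0] * n
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        b, a = fit_line([x[i] for i in train], [target[i] for i in train])
        for i in range(k, n, n_folds):
            res[i] = target[i] - (a + b * x[i])
    return res


def dml_ate(x, t, y):
    """Partialling-out estimate: slope of outcome residuals on treatment residuals."""
    t_res = residuals_cross_fit(x, t)
    y_res = residuals_cross_fit(x, y)
    return sum(a * b for a, b in zip(t_res, y_res)) / sum(a * a for a in t_res)


# Synthetic campaign: confounder x drives both treatment and outcome; true effect is 0.5.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(4000)]
t = [0.8 * xi + rng.gauss(0, 1) for xi in x]
y = [0.5 * ti + 1.5 * xi + rng.gauss(0, 1) for ti, xi in zip(t, x)]

naive, _ = fit_line(t, y)
estimate = dml_ate(x, t, y)
print(f"naive slope: {naive:.3f}  residual-on-residual estimate: {estimate:.3f}")
```

The naive slope absorbs the confounder's influence and lands far above 0.5, while the residual-on-residual estimate recovers a value close to the true effect. Fitting the nuisance models out of fold (cross-fitting) is what keeps flexible learners from leaking overfitting bias into the final estimate.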

For visual embeddings where treatment and confounders are entangled, DICE-DML generates counterfactual pairs, isolates treatment fingerprints via vector subtraction, and applies orthogonal projection before estimation.
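
One way to read the three steps in that sentence (counterfactual pairs, vector subtraction, orthogonal projection) is sketched below on toy 4-dimensional vectors. The README does not specify how DICE-DML orders or combines these steps, so the functions here are hypothetical and show only the linear algebra, under the assumption that each pair differs solely in the treatment.

```python
import math


def subtract(a, b):
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def treatment_direction(pairs):
    """Average (treated - counterfactual) difference, scaled to unit length."""
    diffs = [subtract(treated, control) for treated, control in pairs]
    dim = len(diffs[0])
    mean = [sum(d[i] for d in diffs) / len(diffs) for i in range(dim)]
    norm = math.sqrt(dot(mean, mean))
    return [m / norm for m in mean]


def project_out(embedding, direction):
    """Remove the component of `embedding` along the unit vector `direction`."""
    scale = dot(embedding, direction)
    return [e - scale * d for e, d in zip(embedding, direction)]


# Toy embeddings: within each pair only the first two axes (the "treatment") change.
pairs = [
    ([1.0, 0.5, 0.2, 0.7], [0.0, 0.0, 0.2, 0.7]),
    ([1.0, 0.5, 0.9, 0.1], [0.0, 0.0, 0.9, 0.1]),
]
direction = treatment_direction(pairs)

embedding = [1.0, 0.5, 0.4, 0.3]
treatment_score = dot(embedding, direction)          # strength of the treatment fingerprint
confounder_part = project_out(embedding, direction)  # what remains after removing it
```

After projection, `confounder_part` is orthogonal to the treatment direction, so it can serve as a control representation that no longer encodes the treatment, while `treatment_score` summarizes the treatment signal as a scalar.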

Gemini Embedding 2

All modalities map to the same 3072-dimensional semantic space via Gemini Embedding 2:

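| Modality | Limit                                                  |
|----------|--------------------------------------------------------|
| Text     | 8,192 tokens                                           |
| Images   | 6 per request                                          |
| Video    | 80s (with audio) / 120s (without)                      |
| Audio    | 80s                                                    |
| PDF      | 1 document, 6 pages                                    |
| Output   | 3,072 dims (Matryoshka: truncate to 1536 / 768 / 128)  |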
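
Matryoshka-style truncation means keeping only the leading components of the vector. A minimal sketch follows, assuming (as is common practice for truncated embeddings) that the shortened vector should be rescaled to unit length before cosine or dot-product comparisons. `truncate_embedding` is a hypothetical helper and the input is a stand-in, not a real embedding.

```python
import math


def truncate_embedding(vector, dims):
    """Keep the first `dims` components and rescale to unit length."""
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    head = list(vector[:dims])
    norm = math.sqrt(sum(v * v for v in head))
    return [v / norm for v in head] if norm else head


# Stand-in 3072-dim vector; a real one would come from the embedding API.
full = [math.sin(i + 1) for i in range(3072)]
small = truncate_embedding(full, 768)
small_norm = math.sqrt(sum(v * v for v in small))
```

Smaller vectors cut storage and search cost in the vector store, at the price of some retrieval quality; the right size depends on the workload.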

Tech Stack

| Component             | Technology                   |
|-----------------------|------------------------------|
| Embeddings            | Gemini Embedding 2           |
| Structured extraction | Gemini 3.1 Flash Lite        |
| Vector DB             | Pinecone Serverless          |
| Relational DB         | PostgreSQL / SQLite          |
| Causal inference      | DoWhy + EconML               |
| API                   | FastAPI                      |
| ML models             | LightGBM                     |
| Schemas               | Pydantic v2 + SQLAlchemy 2.0 |

Testing

pytest tests/unit/ -v               # 140 unit tests
pytest tests/integration/ -v        # 17 integration tests
pytest tests/ -v                    # All 157 tests
ruff check src/ tests/              # Lint

Contributing

See CONTRIBUTING.md for development setup, testing, and PR guidelines.

License

MIT -- OmniProof Contributors

Download files

| File                              | Type                          | Size     |
|-----------------------------------|-------------------------------|----------|
| omni_proof-0.0.1.tar.gz           | Source distribution           | 169.2 kB |
| omni_proof-0.0.1-py3-none-any.whl | Built distribution (Python 3) | 44.4 kB  |

Both files were uploaded via twine/6.1.0 (CPython/3.13.7) using Trusted Publishing. Attestation bundles were made for both files. Publisher: release.yml on navidgh66/omni_proof.

Hashes for omni_proof-0.0.1.tar.gz

| Algorithm   | Hash digest                                                      |
|-------------|------------------------------------------------------------------|
| SHA256      | 8a5642f51165286a200febbc3e0a1814540a3fb7c3eb08d79d54aa1cf17116ec |
| MD5         | 23c95eaa40e3626394210ca6ea6b0447                                 |
| BLAKE2b-256 | e87c271fbb8a7193d8ab7a16019a6b1ea05ef5f6d2f7c3390eb8df85a2fdcf28 |

Hashes for omni_proof-0.0.1-py3-none-any.whl

| Algorithm   | Hash digest                                                      |
|-------------|------------------------------------------------------------------|
| SHA256      | e8b6c87ece3bda4bc93a2371bbf316dee74ea5a5455062d1846b1cfcb569d756 |
| MD5         | 47712553925c6e85308478e1532eddb3                                 |
| BLAKE2b-256 | ae26e6c3f97ef5b632fa0cab684e1a7996e78f726f3aff5fd06709fc8d8ffd61 |