Exploratory data analysis and transformation toolkit for Marketing Mix Modeling (MMM)

🦉 OwlMix

OwlMix is a comprehensive Python package for Exploratory Data Analysis (EDA) and data transformation tailored for Marketing Mix Modeling (MMM) workflows. It provides automated report generation, statistical analysis, and data transformation utilities to accelerate MMM projects.

🚀 Key Features

📊 Data Analysis & Reporting

Automated EDA Reports: Generate professional HTML and JSON reports with comprehensive statistics and visualizations
Correlation Analysis: Matrix correlations, lag correlations, and ACF/PACF analysis for time series
VIF Calculation: Variance Inflation Factor (VIF) calculation for multicollinearity assessment
Causality Testing: Granger causality tests to check whether one series helps predict another
Categorical Analysis: Distribution analysis for categorical variables
KPI vs Features: Analyze relationships between KPIs and marketing features by time period
Time Series Decomposition: Seasonal decomposition and trend analysis
Outlier Detection: Visual identification and analysis of outliers

🔧 Data Transformation

Adstock Effect: Apply advertising carryover effects to media spend data
Lag Generation: Create lagged features for time series modeling
Saturation Transformation: Apply saturation curves (Hill, Logistic, Logit) to media variables
Data Cleanup: Automated data quality checks and handling (missing values, duplicates, etc.)
Transformation Pipeline: Chainable pipeline for complex data workflows

🎨 Visual & Export Options

Multiple HTML Templates: Light and dark theme templates for reports
Interactive Charts: Distribution plots, time series, correlation heatmaps, outlier charts
JSON Export: Raw report data for programmatic access
Chart Storage: Automatic chart generation and storage in outputs/charts/

⚙️ Flexible Configuration

Fine-grained control over analyses to include/exclude
Customizable precision, date formats, and aggregation frequencies
Column-specific configurations for targeted analysis
Template customization support

📦 Installation

pip install owl-mix

Requirements:

Python >= 3.12
pandas >= 1.5
matplotlib >= 3.7
seaborn >= 0.12
statsmodels >= 0.14.6
scipy >= 1.10
scikit-learn >= 1.8.0
Jinja2 >= 3.1

⚡ Quick Start

Basic EDA Report Generation

import pandas as pd
from owlmix.report import OwlMixReport

# Load your data
df = pd.read_csv("your_data.csv")

# Create and generate report
report = OwlMixReport(
    df=df,
    target="sales",              # Target variable for analysis
    date_column="date",          # Date column for time series analysis
    template_name="custom_eda_template.html"  # Optional: use "custom_eda_template_dark.html" for dark theme
)

# Generate HTML and JSON reports
report.run(
    json_file_name="eda_report.json",
    html_file_name="eda_report.html"
)

Output:

eda_report.json: Structured analysis data in JSON format
eda_report.html: Interactive HTML report with charts and statistics
outputs/charts/: Generated visualization files

🛠️ Advanced Configuration

Handling Categorical Variables

To get the most out of your analysis, it is essential to explicitly define your categorical columns. If these are not set, OwlMix will not generate categorical distribution charts in the final report.

Why This Is Important

Explicitly defining categorical variables ensures that the OwlMix engine:

Generates Visualizations: Triggers the creation of frequency and distribution charts in the HTML output.
Ensures Data Integrity: Correctly interprets columns as discrete categories (e.g., store_id or product_code) even if they contain numerical values.

Usage

Use the update_categorical_columns_config method after initializing your report object, but before calling .run().

import pandas as pd
from owlmix.report import OwlMixReport

# Load your data
df = pd.read_csv("data.csv")

# Initialize the report
report = OwlMixReport(
    df=df,
    target="sales",
    date_column="date"
)

# Define your categorical features
cat_cols = ["color", "smartphone", "car_model", "language"]

# Update the configuration
# Without this line, distribution charts for these columns will be skipped
report.config.update_categorical_columns_config(columns=cat_cols)

# Run the report
report.run(html_file_name="report.html")

Note: If you find that specific charts are missing from your HTML report, double-check that the column names in your list exactly match the headers in your DataFrame.

Customising Report Charts (Include, Exclude, & Reorder)

You can control exactly which visualisations appear in your report and the order in which they are displayed using the summary_builder attributes. This is useful for removing noise or prioritising the most important insights for your stakeholders.

Chart Management Options

Exclude: Remove specific charts you don't need (e.g., removing Correlation if it's not relevant).
Include: Explicitly whitelist only the charts you want to see.
Reorder: Define a custom sequence for the charts in the HTML output.

Usage

Use the ChartID enum to specify which charts to modify. These settings must be applied to report.summary_builder before calling .run().

import pandas as pd
from owlmix.report import OwlMixReport
from owlmix.typing.enums import ChartID

# Load your data
df = pd.read_csv("your_data.csv")

report = OwlMixReport(df=df, target="sales", date_column="date")

# 1. Exclude specific charts
report.summary_builder.exclude_charts = [
    ChartID.CORRELATION_CHART,
    ChartID.COMPARISON_CHART
]

# 2. OR Include ONLY specific charts (Whitelisting)
# report.summary_builder.include_charts = [
#     ChartID.CORRELATION_CHART,
#     ChartID.ACF_PACF_CHART
# ]

# 3. Reorder charts
# The report will follow the exact order of the list provided.
# Only charts that are not excluded above are listed here.
report.summary_builder.reorder_charts = [
    ChartID.DISTRIBUTION_CHART,
    ChartID.TIME_SERIES_CHART,
    ChartID.OUTLIERS_CHART
]

report.run(save_json=True)

Key Rules

Precedence: If include_charts is set, OwlMix treats that list as the whitelist, and exclude_charts entries for charts outside it have no effect.
Enum Usage: Always use the ChartID enum to reference charts to avoid string typos and ensure compatibility with future updates.

Time-based comparison table and chart

⚠️ Important Notes

Week-level YoY comparisons can be tricky:
- Some years have 53 weeks, others have 52
- ISO week numbering does not perfectly align with calendar dates
- The same week number across years may represent slightly different date ranges
- This can lead to minor inconsistencies in YoY week comparisons
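
The 52/53-week issue can be checked with Python's standard library alone. The sketch below is independent of OwlMix, and the helper name iso_weeks_in_year is made up for illustration. It relies on the ISO 8601 rule that December 28 always falls in the last ISO week of its year.

```python
from datetime import date


def iso_weeks_in_year(year: int) -> int:
    """Return the number of ISO weeks (52 or 53) in the given ISO year."""
    # December 28 always belongs to the last ISO week of its own year.
    return date(year, 12, 28).isocalendar()[1]


for year in (2020, 2021):
    print(year, iso_weeks_in_year(year))

# A calendar date can belong to the previous ISO year:
# January 1, 2021 is part of ISO week 53 of 2020.
iso_year, iso_week, _ = date(2021, 1, 1).isocalendar()
print(iso_year, iso_week)
```

Because 2020 has an ISO week 53 and 2021 does not, a yoy_week comparison for that week has no counterpart in the following year.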

📊 Supported Comparison Types

yoy_year
- Granularity: Year
- Comparison: Current year vs previous year
mom
- Granularity: Month (YYYY-MM)
- Comparison: Current month vs previous month
wow
- Granularity: Week (week start date)
- Comparison: Current week vs previous week
qoq
- Granularity: Quarter (YYYYQX)
- Comparison: Current quarter vs previous quarter
yoy_month
- Granularity: Month
- Comparison: Same month across years (e.g., Jan 2024 vs Jan 2023)
yoy_quarter
- Granularity: Quarter
- Comparison: Same quarter across years (e.g., Q1 2024 vs Q1 2023)
yoy_week
- Granularity: ISO Week
- Comparison: Same week number across years
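
To make the comparison logic concrete, here is a minimal standard-library sketch of a mom-style comparison: values are summed per month (YYYY-MM), then each month is compared with the previous one. The sample data and the helper name month_over_month are invented for illustration. This is not OwlMix's implementation, which may differ in aggregation, rounding, and handling of missing periods.

```python
from collections import defaultdict
from datetime import date


def month_over_month(rows: list[tuple[date, float]]) -> list[dict]:
    """Sum values per month, then compute the % change vs the previous month."""
    totals: dict[str, float] = defaultdict(float)
    for day, value in rows:
        totals[day.strftime("%Y-%m")] += value

    result = []
    previous = None
    for month in sorted(totals):
        current = totals[month]
        # The first month has no predecessor, and a zero base has no defined % change.
        pct = None if not previous else round((current - previous) / previous * 100, 2)
        result.append({"month": month, "value": current, "pct_change": pct})
        previous = current
    return result


sample = [
    (date(2024, 1, 15), 100.0),
    (date(2024, 1, 20), 50.0),
    (date(2024, 2, 10), 300.0),
    (date(2024, 3, 5), 150.0),
]
comparison = month_over_month(sample)
for row in comparison:
    print(row)
```

Note that this sketch compares each month with the previous month present in the data, so a gap in the series would need separate handling.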

OwlMix Configuration API Reference

OwlMix provides a comprehensive suite of update_* methods to fine-tune your analysis. These methods allow you to modify statistical parameters, chart aesthetics, and data processing logic.

Implementation Pattern

All configuration updates must be performed on the report.config object after initialization and before calling report.run().

report = OwlMixReport(df=df, target="sales", date_column="date")

# Example: Chaining configuration updates (each update_* method returns self)
(
    report.config
    .update_categorical_columns_config(columns=["brand", "store_id"])
    .update_correlation_config(columns=["tv", "digital", "radio"])
    .update_acf_pacf_config(n_lags=40)
)

Available Update Methods & Parameters

Method	Description	Parameters (Keyword Arguments)
`update_categorical_columns_config`	Essential: Defines columns for categorical analysis.	`columns`
`update_time_series_config`	Configures the primary time series visualization.	`columns`, `model`, `period`
`update_time_aggregator_config`	Controls how data is grouped and aggregated.	`date_column`, `value_columns`, `agg_func`, `precision`, `freq`
`update_time_comparison_config`	Defines logic for PoP or YoY comparisons.	`date_column`, `value_columns`, `comparison_type`, `agg_func`, `precision`, `freq`
`update_time_comparison_chart_config`	Adjusts the visual layout of comparison charts.	`date_column`, `value_columns`, `comparison_type`, `agg_func`, `mode`
`update_correlation_config`	Sets parameters for correlation analysis.	`columns`
`update_correlation_chart_layout_config`	Customizes the heatmap UI and labels.	`columns`
`update_lag_corr_chart_config`	Configures cross-correlation with time lags.	`column` (required), `lag`
`update_acf_pacf_config`	Adjusts lags and markers for ACF/PACF plots.	`columns`, `n_lags`, `acf_marker`, `pacf_marker`, `acf_stem`, `pacf_stem`, `acf_conf`, `pacf_conf`
`update_distribution_chart_config`	Sets binning logic and chart grid layout.	`columns`, `max_charts_per_row`
`update_kpi_vs_feature_config`	Configures analysis of Target vs Features.	`target_column`, `columns`, `period`, `date_column`, `agg_func`
`update_causality_test_config`	Fine-tunes Granger causality test parameters.	`target_column`, `columns`, `max_lag`, `error_threshold`, `p_value_weight`, `mape_weight`
`update_vif_config`	Configures multicollinearity detection.	`target_column`, `features`, `precision`, `color_thresholds`
`update_outlier_chart_layout_config`	Adjusts outlier detection and visual markers.	`columns`, `max_cols_per_chart`, `single_image`

Quick Usage Tip

When passing values to these methods, ensure you use the argument names exactly as listed. For example:

report.config.update_vif_config(
    features=["price", "inventory", "promotion"],
    precision=2
)

Pro-Tips

Method Chaining: These methods return self, so you can chain multiple updates together for cleaner code.
Validation: If a chart is missing from your report, verify that its corresponding update_* method has been called with the correct column names.
Enums: For methods that accept an enum-backed parameter, such as comparison_type in update_time_comparison_config, it is recommended to use the built-in owlmix.typing.enums to ensure parameter validity.

Custom VIF Color Thresholds

The Variance Inflation Factor (VIF) is used to detect multicollinearity among features. To make the report more intuitive, OwlMix allows you to define Rule-Based Coloring. This feature applies specific colors to VIF bars based on their numerical value.

Why Customize Thresholds?

Different industries have different tolerances for multicollinearity. While a VIF of 5 is often considered "high," some models require stricter thresholds (e.g., 2.5) or allow for more leniency. Custom colors help stakeholders instantly identify "Safe," "Warning," or "Critical" variables.

Usage

You can pass a list of tuples to the color_thresholds parameter within update_vif_config. Each tuple should follow the format: (upper_bound, "color_name_or_hex").

# 1. Define your rules (value, color)
# The rule applies if the VIF value is less than or equal to the threshold
vif_color_rules = [
    (2, "blue"),              # Safe: VIF <= 2
    (5, "green"),             # Moderate: 2 < VIF <= 5
    (6, "yellow"),            # Warning: 5 < VIF <= 6
    (10, "red"),              # High: 6 < VIF <= 10
    (float("inf"), "darkred") # Critical: VIF > 10
]

# 2. Update the VIF configuration
report.config.update_vif_config(color_thresholds=vif_color_rules)

# 3. Run the report
report.run(html_file_name="vif_analysis.html")

Key Rules

Order Matters: List your thresholds in ascending order. OwlMix evaluates these rules sequentially.
Infinity: Use float("inf") as the final threshold to catch all values exceeding your last defined limit.
Color Support: You can use standard CSS color names (e.g., "red", "orange") or Hex codes (e.g., "#FF5733").

Default Behavior: If no custom thresholds are provided, OwlMix uses a standard internal color palette to differentiate VIF levels.
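
The sequential evaluation described in the key rules can be pictured with the following sketch. The function pick_vif_color is hypothetical and only mirrors the documented behavior (ascending thresholds, and the first rule with VIF <= upper bound wins). OwlMix's internal logic may differ.

```python
def pick_vif_color(vif: float, rules: list[tuple[float, str]]) -> str:
    """Return the color of the first rule whose upper bound is >= vif."""
    # Sorting makes the sketch robust even if rules are passed out of order.
    for upper_bound, color in sorted(rules, key=lambda rule: rule[0]):
        if vif <= upper_bound:
            return color
    raise ValueError("No rule matched; add a float('inf') catch-all rule.")


vif_color_rules = [
    (2, "blue"),
    (5, "green"),
    (6, "yellow"),
    (10, "red"),
    (float("inf"), "darkred"),
]

for value in (1.4, 5.0, 7.3, 42.0):
    print(value, pick_vif_color(value, vif_color_rules))
```

Without the final float("inf") rule, a VIF above 10 would match nothing, which is why the catch-all entry is recommended.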

Working with Enums

OwlMix uses Enums (Enumerations) to standardize configuration values. Using these instead of raw strings prevents typos and ensures your code remains compatible with future versions.

Key Enum Reference Table

Enum Class	Purpose	All Available Values
`ChartID`	Controlling chart visibility and order.	`VIF_CHART`, `ACF_PACF_CHART`, `KPI_VS_FEATURE_CHART`, `DISTRIBUTION_CHART`, `CATEGORICAL_DISTRIBUTION_CHART`, `CORRELATION_CHART`, `LAG_CORRELATION_CHART`, `TIME_SERIES_CHART`, `OUTLIERS_CHART`, `COMPARISON_CHART`
`Period`	Defining data aggregation frequency.	`DAILY`, `WEEKLY`, `MONTHLY`, `YEARLY`
`ComparisonType`	Setting logic for time-based comparisons.	`YoY`, `QoQ`, `MoM`, `WoW`, `YoY_MONTH`, `YoY_QUARTER`, `YoY_WEEK`
`PlotMode`	Choosing the visual axis/unit style.	`ABSOLUTE`, `PCT_CHANGE`, `DUAL`

Implementation Guide

Basic Usage

Always import the Enum classes from owlmix.typing.enums.

from owlmix.typing.enums import ChartID, ComparisonType, PlotMode

# Example: Filtering and Reordering
report.summary_builder.reorder_charts = [
    ChartID.TIME_SERIES_CHART,
    ChartID.COMPARISON_CHART,
    ChartID.CORRELATION_CHART
]

# Example: Setting Comparison Logic
report.config.update_time_comparison_config(
    comparison_type=ComparisonType.YoY_MONTH
)

Inspecting Enum Data

Since all Enums inherit from BaseEnum, you can programmatically inspect them if you are unsure of the underlying values or labels.

from owlmix.typing.enums import ComparisonType, Period

# Returns a list of strings: ['DAILY', 'WEEKLY', 'MONTHLY', 'YEARLY']
print(Period.names())

# Returns a list of raw values: ['daily', 'weekly', 'monthly', 'yearly']
print(Period.values())

# Returns a formatted JSON string of IDs, Names, and Labels
print(ComparisonType.pretty_options())

Pro Tip: Use .label if you need the human-readable version for your own custom logs or UI (e.g., ComparisonType.YoY.label returns "Year over Year").
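
If it helps to picture how helpers such as names() and values() can work, here is a standard-library sketch. The classes SketchBaseEnum and SketchPeriod are stand-ins written for this example, and the string values are taken from the comments in the snippet above. The real BaseEnum in owlmix.typing.enums may be implemented differently.

```python
from enum import Enum


class SketchBaseEnum(Enum):
    """Stand-in base class exposing list-style helpers."""

    @classmethod
    def names(cls) -> list[str]:
        # Member names in definition order, e.g. "DAILY".
        return [member.name for member in cls]

    @classmethod
    def values(cls) -> list[str]:
        # Raw values in definition order, e.g. "daily".
        return [member.value for member in cls]


class SketchPeriod(SketchBaseEnum):
    DAILY = "daily"
    WEEKLY = "weekly"
    MONTHLY = "monthly"
    YEARLY = "yearly"


print(SketchPeriod.names())
print(SketchPeriod.values())
# Looking up by raw value returns the same member object.
print(SketchPeriod("weekly") is SketchPeriod.WEEKLY)
```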

Configuration Management with File Resolver

The ConfigFileResolver utility simplifies managing configuration files by automatically resolving file references in JSON configs. This is useful for keeping configuration data organized across multiple files.

from owlmix.file_resolver import ConfigFileResolver

# Create a resolver with a JSON config file
resolver = ConfigFileResolver(config="config.json")

# Resolve *_file keys to their actual content
resolved_config = resolver.resolve()

# Save the resolved config
resolver.save("resolved_config.json")

# Get as Python dictionary string
python_dict_string = resolver.to_python_string()
print(python_dict_string)

# Print formatted output
resolver.print()

How it works:

Any JSON key ending with _file is automatically resolved to the file's content
Supports any file type (HTML, TXT, MD, JSON, etc.)
Works recursively through nested dictionaries and lists
Includes built-in caching for efficiency

Example Configuration:

{
    "report_template": {
        "description_file": "templates/report_description.html",
        "title": "Analysis Report",
        "metadata_file": "config/metadata.json"
    }
}

After resolution, the description_file key becomes description and holds the HTML file's content, and the metadata_file key becomes metadata and holds the JSON content.
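
The resolution rule can be sketched in a few lines of standard-library Python. The function resolve_file_keys below is a hypothetical illustration of the behavior described above, not the ConfigFileResolver source. It assumes that .json files are parsed and every other file type is kept as raw text, and it omits caching.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def resolve_file_keys(node: Any, base_dir: Path) -> Any:
    """Recursively replace `<name>_file` keys with `<name>` holding the file content."""
    if isinstance(node, dict):
        resolved = {}
        for key, value in node.items():
            if key.endswith("_file") and isinstance(value, str):
                path = base_dir / value
                text = path.read_text(encoding="utf-8")
                # Assumption: JSON files are parsed, everything else stays raw text.
                content = json.loads(text) if path.suffix == ".json" else text
                resolved[key.removesuffix("_file")] = content
            else:
                resolved[key] = resolve_file_keys(value, base_dir)
        return resolved
    if isinstance(node, list):
        return [resolve_file_keys(item, base_dir) for item in node]
    return node


# Demo with throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "description.html").write_text("<p>Weekly sales EDA</p>", encoding="utf-8")
    (base / "metadata.json").write_text('{"owner": "analytics"}', encoding="utf-8")
    config = {
        "report_template": {
            "description_file": "description.html",
            "title": "Analysis Report",
            "metadata_file": "metadata.json",
        }
    }
    resolved = resolve_file_keys(config, base)

print(json.dumps(resolved, indent=4))
```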

📊 Report Sections

The generated HTML report includes the following sections:

Section	Description
Dataset Overview	Basic information, data types, missing values, memory usage
Summary Statistics	Descriptive statistics (mean, std, min, max, quantiles)
Data Quality	Missing value patterns, duplicate analysis
Distributions	Histograms and density plots for all numeric variables
Outlier Analysis	Box plots and outlier identification
Correlation Matrix	Pairwise correlations with heatmap visualization
Lag Correlations	Time-lagged correlation analysis for time series
VIF Analysis	Multicollinearity detection using Variance Inflation Factor
ACF/PACF	Autocorrelation and partial autocorrelation for seasonality detection
Causality Tests	Granger causality tests of whether features help predict the target
Time Comparisons	Period-over-period comparisons (YoY, MoM)
KPI vs Features	Relationship between target and marketing features over time
Categorical Distributions	Distribution analysis for categorical variables

🔧 Core Modules

`owlmix.eda`

Exploratory Data Analysis module with:

SummaryBuilder: Comprehensive summary generation
OwlMixEDA: Main EDA orchestrator

Features:

Correlation analysis (matrix, lag, causality)
VIF calculation for multicollinearity
ACF/PACF analysis for seasonality
Categorical and distribution analysis
Outlier detection and visualization

`owlmix.transform`

Data transformation module for MMM preprocessing:

adstock(): Apply advertising carryover effects
create_lags(): Generate lagged features
saturation(): Apply saturation curves (Hill, Logistic, Logit)
cleanup_data(): Data quality utilities
MMMTransformPipeline: Chainable pipeline for complex workflows
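
For readers new to these transformations, the sketch below shows common textbook forms of geometric adstock, lagging, and Hill saturation in plain Python. The function names and parameters (decay_rate, half_saturation, slope) are chosen for this example. OwlMix's adstock(), create_lags(), and saturation() may use different formulas, signatures, and defaults.

```python
from typing import Optional


def geometric_adstock(spend: list[float], decay_rate: float) -> list[float]:
    """Carry over part of each period's effect: a[t] = x[t] + decay_rate * a[t-1]."""
    carried = 0.0
    result = []
    for value in spend:
        carried = value + decay_rate * carried
        result.append(carried)
    return result


def lag(values: list[float], periods: int) -> list[Optional[float]]:
    """Shift a series forward by `periods` (>= 0), padding the start with None."""
    padded: list[Optional[float]] = [None] * periods + list(values)
    return padded[: len(values)]


def hill_saturation(x: float, half_saturation: float, slope: float = 1.0) -> float:
    """Hill curve x^s / (x^s + k^s); equals 0.5 when x == half_saturation."""
    if x <= 0:
        return 0.0
    return x**slope / (x**slope + half_saturation**slope)


tv_spend = [100.0, 0.0, 0.0, 40.0]
adstocked = geometric_adstock(tv_spend, decay_rate=0.5)
print(adstocked)                    # [100.0, 50.0, 25.0, 52.5]
print(lag(tv_spend, periods=1))     # [None, 100.0, 0.0, 0.0]
print([round(hill_saturation(v, half_saturation=50.0, slope=2.0), 3) for v in adstocked])
```

Adstock is usually applied before saturation, so that diminishing returns act on the carried-over exposure and not on raw spend.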

`owlmix.report`

Report generation module:

OwlMixReport: Main report generator
HTML template rendering with customizable themes
JSON data export
Chart generation and storage

📈 Example Use Cases

Marketing Mix Modeling Workflow

import pandas as pd
from owlmix.report import OwlMixReport
from owlmix.transform import MMMTransformPipeline

# Load raw data
df = pd.read_csv("mmm_data.csv")

# Step 1: Transform data
pipeline = MMMTransformPipeline(df, date_column="date")
pipeline.adstock(columns=["tv", "digital", "radio"], decay_rate=0.5)
pipeline.create_lags(columns=["sales"], lags=[1, 4, 13])
df_transformed = pipeline.get_data()

# Step 2: Analyze with EDA
report = OwlMixReport(
    df=df_transformed,
    target="sales",
    date_column="date"
)
report.config.update_vif_config(
    features=["tv", "digital", "radio"],
    precision=3
)
report.run(
    json_file_name="mmm_eda.json",
    html_file_name="mmm_eda.html"
)

💡 Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues on GitHub.

📄 License

MIT License - see LICENSE file for details

Author: Sarbadal Pal (sarbadal@gmail.com)

Repository: github.com/sarbadal/owl-mix

📚 Documentation

Detailed documentation is available in the docs/ folder:

docs/eda.md → EDA module details
docs/transform.md → Data transformation features
docs/saturation.md → Saturation modeling
docs/include_exclude_reorder_charts.md → Include Exclude and Reorder charts

🧠 Use Case: Marketing Mix Modeling

OwlMix is designed for MMM workflows where you need to:

Explore relationships between marketing spend and sales
Identify multicollinearity issues with VIF
Analyze time-based patterns and correlations
Generate professional reports for stakeholders

Perfect for preprocessing data before building MMM models!

OwlMix is particularly useful for:

Preprocessing marketing data
Feature engineering for MMM
Understanding lagged media effects
Generating EDA reports before modeling

🔧 Roadmap

Planned enhancements:

Automated MMM diagnostics
CLI support

⭐ Support

If you find this project useful, consider giving it a star ⭐ on GitHub!

Release history

Version	Release date
0.2.0rc4 (pre-release, this version)	May 6, 2026
0.2.0rc3 (pre-release)	May 6, 2026
0.2.0rc2 (pre-release)	May 5, 2026
0.2.0rc1 (pre-release)	Apr 28, 2026
0.1.9	Apr 24, 2026
0.1.8	Apr 24, 2026
0.1.7	Apr 24, 2026
0.1.6	Apr 23, 2026
0.1.5	Apr 23, 2026
0.1.4	Apr 18, 2026
0.1.3	Apr 16, 2026
0.1.2	Apr 16, 2026
0.1.1	Apr 10, 2026
