
An agentic harness for marketing measurement: adstock, saturation, attribution, and budget allocation as typed agent tools with an eval suite.


MixPilot

An agentic harness for marketing measurement.

Everyone is building general-purpose agents. MixPilot is the opposite bet: a small, real agent runtime whose entire action space is the marketing-science toolkit — adstock, saturation, multi-touch attribution, budget optimisation, and a marketing-aware data-quality audit — wired as typed tools with structured results, an approval gate, and an eval suite that scores the agent's method-selection judgment.

The thesis: in a domain agent, the tools are the product. A model can sound confident about marketing data; the hard part is knowing which method the data can actually support — and refusing the ones it can't. That judgment is what MixPilot encodes and tests.

pip install mixpilot           # tools only
pip install "mixpilot[agent]"  # + the Anthropic-backed agent loop

What's inside

| Component | What it does | The lesson |
| --- | --- | --- |
| audit_data_quality | nulls, negative spend, zero-variance, multicollinearity | The audit decides whether any later number is trustworthy at all |
| adstock_transform | geometric carryover with half-life | Carryover before saturation, always |
| fit_saturation_curve | Hill curve (β, α, γ) + R² | A low-R² curve is a warning, not a result |
| run_attribution_model | last-touch / linear / Markov removal-effect | Match the model to the data shape, never overreach |
| allocate_budget | constrained optimisation over saturation curves | Move money toward unsaturated marginal return |
| ToolResult contract | summary + next_actions + recovery_hint on every call | A tool result is the next observation, not a log line |
| Approval gate | mutating actions clear a policy gate outside the prompt | Safety lives in code, not prose |
| Agent loop | turn budget + loop detection + stop conditions | The loop is a control system, not a while-tool-calls toy |
| Eval suite | data shape → required/forbidden method | Test the judgment, not the prose |
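
To make "carryover before saturation" concrete, here is a minimal standalone sketch of the two transforms. It is not MixPilot's implementation: the function names are hypothetical, the adstock is the plain unnormalised geometric form (the packaged version may normalise), and the reading of β as the ceiling, α as the shape, and γ as the half-saturation point is an assumption about the parameterisation.

```python
def geometric_adstock(spend, half_life):
    """Carry part of each period's effect into later periods (unnormalised)."""
    decay = 0.5 ** (1.0 / half_life)  # share of the effect that survives one period
    carried, out = 0.0, []
    for x in spend:
        carried = x + decay * carried
        out.append(carried)
    return out


def hill(x, beta, alpha, gamma):
    """Hill response: beta * x^alpha / (gamma^alpha + x^alpha) (assumed form)."""
    if x <= 0:
        return 0.0
    return beta * x**alpha / (gamma**alpha + x**alpha)


spend = [100.0, 0.0, 0.0]

# Step 1: carryover. With a half-life of one period the effect halves each period.
adstocked = geometric_adstock(spend, half_life=1.0)  # [100.0, 50.0, 25.0]

# Step 2: saturation is applied to the adstocked series, not to raw spend.
response = [hill(a, beta=1000.0, alpha=2.0, gamma=50.0) for a in adstocked]
# [800.0, 500.0, 200.0]
```

The order matters: saturating raw spend first would give zero response in the two periods with no spend, which hides the carryover entirely.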

The judgment, made concrete

Same data, two attribution models:

from mixpilot.tools.attribution import run_attribution_model

paths  = [{"path": ["Social", "Search"], "converted": True}] * 40
paths += [{"path": ["Social"],           "converted": False}] * 20

run_attribution_model(paths, "last_touch").artifacts["share"]
# {'Search': 1.0, 'Social': 0.0}   <- the assist is thrown away
run_attribution_model(paths, "markov").artifacts["share"]
# {'Search': 0.5, 'Social': 0.5}   <- both channels are necessary

Last-touch hands Social nothing. Markov's removal effect sees that every conversion needed Social first, and splits the credit. That gap — assist value — is the entire reason multi-touch attribution exists.
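
For readers who want to see where the 0.5 / 0.5 split comes from, the sketch below computes a first-order Markov removal effect by hand. It is an illustration under stated assumptions, not the code inside run_attribution_model: the state labels, the helper names, and the fixed-sweep solver are all hypothetical choices.

```python
from collections import defaultdict

START, CONV, NULL = "(start)", "(conversion)", "(null)"


def transition_probs(paths):
    """Estimate first-order transition probabilities from observed paths."""
    counts = defaultdict(lambda: defaultdict(int))
    for p in paths:
        states = [START, *p["path"], CONV if p["converted"] else NULL]
        for a, b in zip(states, states[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(row.values()) for b, n in row.items()}
            for a, row in counts.items()}


def conversion_prob(probs, removed=None, sweeps=200):
    """P(reach conversion from start); a removed channel acts as a dead end."""
    value = defaultdict(float)
    value[CONV] = 1.0
    for _ in range(sweeps):  # repeated sweeps propagate value back to START
        for state, row in probs.items():
            if state == removed:
                continue
            value[state] = sum(p * value[nxt] for nxt, p in row.items()
                               if nxt != removed)
    return value[START]


def markov_shares(paths):
    """Normalise each channel's removal effect into a credit share."""
    probs = transition_probs(paths)
    base = conversion_prob(probs)
    channels = sorted({c for p in paths for c in p["path"]})
    if base == 0:
        return {c: 0.0 for c in channels}
    effects = {c: 1 - conversion_prob(probs, removed=c) / base for c in channels}
    total = sum(effects.values())
    return {c: e / total for c, e in effects.items()}


paths = [{"path": ["Social", "Search"], "converted": True}] * 40
paths += [{"path": ["Social"], "converted": False}] * 20
shares = markov_shares(paths)  # {'Search': 0.5, 'Social': 0.5}
```

Removing either channel drops the conversion probability from 2/3 to zero, so both removal effects are 1.0 and the credit splits evenly.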

Running the agent

from mixpilot import Agent
from mixpilot.agent.client import AnthropicClient

agent = Agent(AnthropicClient())   # needs ANTHROPIC_API_KEY
result = agent.run(
    "Here are 12 weeks of spend and sales for one channel ... "
    "what's the saturation point and should we spend more?"
)
print(result.final_text)

The client is injected into the loop, so the same loop runs against a real model in production and against a scripted client in evals. No network is needed to test the harness.
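
The injection pattern can be sketched in a few lines. Everything below is hypothetical and independent of MixPilot's actual classes: Step, Client, ScriptedClient, and run_loop are illustrative names, and the loop shows only a turn budget and a naive repeated-call check as stand-ins for the real stop conditions.

```python
from dataclasses import dataclass, field
from typing import Optional, Protocol


@dataclass
class Step:
    tool: Optional[str] = None  # None means "this is the final answer"
    args: dict = field(default_factory=dict)
    text: str = ""


class Client(Protocol):
    def next_step(self, transcript: list) -> Step: ...


class ScriptedClient:
    """Replays a fixed list of steps; no network, fully deterministic."""

    def __init__(self, steps):
        self._steps = iter(steps)

    def next_step(self, transcript):
        return next(self._steps)


def run_loop(client: Client, tools: dict, prompt: str, max_turns: int = 8) -> str:
    transcript, seen = [prompt], set()
    for _ in range(max_turns):  # turn budget
        step = client.next_step(transcript)
        if step.tool is None:  # stop condition: the model answered
            return step.text
        call = (step.tool, repr(sorted(step.args.items())))
        if call in seen:  # loop detection: identical call repeated
            return "stopped: repeated tool call"
        seen.add(call)
        transcript.append(str(tools[step.tool](**step.args)))
    return "stopped: turn budget exhausted"


tools = {"double": lambda x: 2 * x}
script = [Step(tool="double", args={"x": 21}), Step(text="The answer is 42.")]
answer = run_loop(ScriptedClient(script), tools, "Double 21.")
```

Because run_loop only depends on the next_step method, a model-backed client and a scripted one are interchangeable from the loop's point of view.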

Evals: scoring method selection

python evals/run_evals.py          # offline, deterministic
python evals/run_evals.py --live   # against a real model

Each case pairs a data shape with the methods it does and does not support. A case passes only if the agent uses every required tool and avoids every forbidden one — e.g. it must not run Markov attribution on single-touch data, and must audit before claiming per-channel effects on collinear spend.
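
The pass rule reduces to two set checks. The sketch below is a hypothetical rendering of it, not the code in evals/run_evals.py: EvalCase, passes, and the "tool:method" labels are assumptions made for illustration, and the case contents are made up to mirror the single-touch example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalCase:
    name: str
    required: frozenset   # methods the agent must use
    forbidden: frozenset  # methods the data shape cannot support


def passes(case, methods_used):
    """Pass only if every required method is used and no forbidden one is."""
    used = set(methods_used)
    return case.required <= used and used.isdisjoint(case.forbidden)


# Illustrative case: single-touch paths cannot support Markov attribution.
single_touch = EvalCase(
    name="single-touch paths",
    required=frozenset({"audit_data_quality"}),
    forbidden=frozenset({"run_attribution_model:markov"}),
)

good_run = ["audit_data_quality", "run_attribution_model:last_touch"]
bad_run = ["audit_data_quality", "run_attribution_model:markov"]

results = (passes(single_touch, good_run), passes(single_touch, bad_run))
```

Scoring on the set of methods used, rather than on the final text, is what lets the offline run stay deterministic.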

Tests

python -m pytest -q     # 10 passing — domain math + harness contracts

The tests never call a model. They check that adstock conserves mass, the Hill fit recovers known parameters, the Markov removal effect credits assists, the optimiser moves budget toward unsaturated channels, the audit catches multicollinearity, and the approval gate blocks mutating actions. This is harness reliability that has nothing to do with model intelligence.
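
As a rough picture of what a gate contract can look like, here is a self-contained sketch. The field names summary, next_actions, recovery_hint, and artifacts come from this README; the ok flag, the call_tool wrapper, and the apply_budget_plan tool are hypothetical and may not match the package.

```python
from dataclasses import dataclass, field


@dataclass
class ToolResult:
    ok: bool  # hypothetical success flag
    summary: str
    next_actions: list = field(default_factory=list)
    recovery_hint: str = ""
    artifacts: dict = field(default_factory=dict)


MUTATING_TOOLS = {"apply_budget_plan"}  # hypothetical tool name


def call_tool(name, fn, approved=False, **kwargs):
    """Policy check lives here, in code, before the tool ever runs."""
    if name in MUTATING_TOOLS and not approved:
        return ToolResult(
            ok=False,
            summary=f"{name} blocked: approval required",
            next_actions=["request_approval"],
            recovery_hint="Ask a human to approve the plan, then retry.",
        )
    return fn(**kwargs)


def apply_budget_plan(plan):
    return ToolResult(
        ok=True,
        summary=f"applied plan for {len(plan)} channels",
        artifacts={"plan": plan},
    )


plan = {"Search": 600, "Social": 400}
blocked = call_tool("apply_budget_plan", apply_budget_plan, plan=plan)
allowed = call_tool("apply_budget_plan", apply_budget_plan, approved=True, plan=plan)
```

A test of this shape needs no model at all: it asserts that the blocked result carries a recovery hint and that the approved call returns the plan in its artifacts.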

Scope

This is deliberately small. No context compaction, no MCP, no subagents — those are solved problems in general harnesses. The point here is the opposite: how far you get when the action space is a real domain and the tools enforce the judgment.

MIT licensed.
