Dynamic MDF synthetic market data generator
market-wave
Fast, lightweight synthetic market data from a Dynamic Market Distribution Function.
market-wave is a Python library for generating synthetic market paths from
market-wide entry and exit intent. It does not create individual participants.
Instead, it models aggregate buy/sell entry intent, position exits,
order-book depth, cancellations, taker flow, and execution-driven price movement
from probability mass over relative ticks.
It is not a forecasting model. It is a lightweight simulation primitive for experiments, visualization, teaching, and strategy-environment prototyping.
Why market-wave?
- Aggregate intent, not agents: market participants are represented by probability mass over relative ticks, not by individual objects.
- Dynamic MDF: entry and exit intent live in four stateful MDF(relative_tick) fields that evolve from the previous step.
- Pluggable score model: swap the MDF score function with DynamicMDFModel or a custom MDFModel.
- Separated shape and size: MDFs decide where intent sits; intensity decides how much order flow appears.
- Execution-driven prices: prices stay flat unless trades execute.
- Batch generation: generate many reproducible synthetic paths without keeping every path in market.history.
- Inspectable state: every step returns a StepInfo snapshot with MDFs, volumes, order book state, position mass, VWAP, spread, and imbalance.
- Built-in plotting: matplotlib is included, with a clean light chart style by default.
Install
pip install market-wave
For dataframe export:
pip install "market-wave[dataframe]"
For local development:
git clone https://github.com/smturtle2/market-wave.git
cd market-wave
uv sync --extra dev
Python >=3.10 is supported.
Quickstart
from market_wave import Market
market = Market(
    initial_price=10_000,
    gap=10,
    popularity=1.0,
    seed=42,
)
steps = market.step(500)
last = steps[-1]
print(last.price_before, "->", last.price_after)
print("entry:", round(sum(last.entry_volume_by_price.values()), 3))
print("executed:", round(last.total_executed_volume, 3))
print("resting bid/ask:", round(sum(last.orderbook_after.bid_volume_by_price.values()), 3), round(sum(last.orderbook_after.ask_volume_by_price.values()), 3))
print("imbalance:", round(last.order_flow_imbalance, 3))
Market.step(n) always returns list[StepInfo] and, by default, appends the same
objects to market.history.
For high-volume generation, skip in-memory history:
steps = market.step(512, keep_history=False)
for step in market.stream(512, keep_history=False):
    consume(step)
For simple export workflows, use step.to_dict(), step.to_json(), or
market.history_records().
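As a rough sketch of such an export loop, the helper below writes one JSON object per line. It assumes each record is a plain, JSON-serializable dict of the kind step.to_dict() is described as returning; the helper name write_jsonl and the sample records are hypothetical stand-ins, not part of market-wave.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records, path):
    """Write an iterable of JSON-serializable dicts as JSON Lines; return the row count."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
            count += 1
    return count


# Stand-in records; in practice these would come from step.to_dict().
sample_records = [
    {"price_before": 10000.0, "price_after": 10010.0, "total_executed_volume": 0.9},
    {"price_before": 10010.0, "price_after": 10010.0, "total_executed_volume": 0.0},
]

out_path = Path(tempfile.gettempdir()) / "market_wave_steps.jsonl"
rows_written = write_jsonl(sample_records, out_path)
print(rows_written, "rows written to", out_path)
```

Writing line by line pairs naturally with keep_history=False, since no step has to stay in memory after it is written.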
Example output with seed=42:
10020.0 -> 10010.0
entry: 2.955
executed: 0.976
resting bid/ask: 24.662 25.83
imbalance: 0.484
Smoke Matrix
The simulator is deterministic for a fixed seed, so it is easy to run the same invariants across different market conditions:
from market_wave import Market
cases = [
    ("baseline", dict(initial_price=10_000, gap=10, popularity=1.0, seed=42), 500),
    ("busy", dict(initial_price=10_000, gap=10, popularity=2.5, seed=7), 500),
    ("thin", dict(initial_price=500, gap=5, popularity=0.25, seed=123), 500),
    ("low_price", dict(initial_price=1, gap=1, popularity=3.0, seed=17), 500),
    ("trend_up", dict(initial_price=10_000, gap=10, popularity=1.0, seed=42, regime="trend_up"), 500),
    ("high_vol", dict(initial_price=10_000, gap=10, popularity=1.0, seed=7, regime="high_vol"), 500),
    ("inactive", dict(initial_price=100, gap=1, popularity=0.0, seed=9), 100),
]
for name, kwargs, steps_count in cases:
    market = Market(**kwargs)
    steps = market.step(steps_count)
    prices = [step.price_after for step in steps]
    move_steps = sum(step.price_change != 0 for step in steps)
    exec_steps = sum(step.total_executed_volume > 0 for step in steps)
    print(name, min(prices), max(prices), move_steps, exec_steps, market.state.price)
Recent verification on the current implementation:
baseline range= 9900.0- 10080.0 unique= 19 moves=369 exec_steps=500 final= 10010.0
busy range= 9890.0- 10030.0 unique= 15 moves=388 exec_steps=500 final= 9910.0
thin range= 455.0- 535.0 unique= 17 moves=318 exec_steps=500 final= 500.0
low_price range= 2.0- 24.0 unique= 23 moves=380 exec_steps=500 final= 19.0
trend_up range= 10000.0- 10320.0 unique= 33 moves=388 exec_steps=500 final= 10320.0
high_vol range= 9960.0- 10040.0 unique= 9 moves=405 exec_steps=500 final= 9970.0
inactive range= 100.0- 100.0 unique= 1 moves= 0 exec_steps= 0 final= 100.0
Those runs also checked that current-state MDF projections stay aligned with
state.price_grid, MDFs remain normalized, prices never fall below one tick,
order book and position mass stay non-negative, and price changes only occur on
steps with executed volume. Dynamic MDF acceptance also runs seeds 10..19 at
mdf_temperature=1.0 and checks that every MDF remains finite, non-negative,
normalized, and broad enough not to collapse to a single price.
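The MDF checks listed above can be approximated with a small helper. This is only a sketch: it assumes an MDF is available as a plain list of masses, and it interprets "broad enough" as "at least two ticks carry mass". The function name check_mdf and its thresholds are hypothetical and are not the project's actual test code.

```python
import math


def check_mdf(mdf, tol=1e-9, min_support=2):
    """Return a list of violated invariants for one MDF given as a list of masses."""
    problems = []
    if not all(math.isfinite(p) for p in mdf):
        problems.append("non-finite mass")
        return problems
    if any(p < 0 for p in mdf):
        problems.append("negative mass")
    if abs(sum(mdf) - 1.0) > tol:
        problems.append("not normalized")
    # "Broad enough" is interpreted here as: at least `min_support` ticks carry mass.
    if sum(p > tol for p in mdf) < min_support:
        problems.append("collapsed to a single tick")
    return problems


healthy = [0.1, 0.2, 0.4, 0.2, 0.1]
collapsed = [0.0, 0.0, 1.0, 0.0, 0.0]
print(check_mdf(healthy))
print(check_mdf(collapsed))
```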
Diagnostic note for 0.4.1: the simulator still has no anchor price or stored
target that pulls paths back to the initial price. Seeded mood, trend,
volatility, microstructure activity, cancellation pressure, and event pressure
evolve each step and reshape the MDFs and visible book. Prices remain
execution-driven, with a small flow-implied price-discovery component when
executed flow reveals one-sided pressure. Treat these ranges, move counts, and
execution counts as regression diagnostics, not claims that generated paths
match any specific real market.
Entry MDF prices are treated as incoming order prices. Buy entry orders arrive as bids, sell entry orders arrive as asks, and they execute only when they overlap existing opposite-side quotes. Executions print at the resting quote price. Unfilled volume remains in the book at the sampled MDF price. Exit flow is cohort-conditioned, so exit orders carry the originating cohort id and still route through visible order-book liquidity.
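The matching rule for a single buy entry order can be illustrated with a toy book. This sketch assumes the book is a plain dict from price to resting volume; match_buy_order is a hypothetical helper written for this README, not the engine's implementation, and it ignores cohorts, cancellations, and lot kinds.

```python
def match_buy_order(price, volume, asks):
    """Match an incoming buy limit order against resting asks.

    `asks` maps ask price -> resting volume and is mutated in place.
    Returns (fills, unfilled_volume), where fills is a list of
    (price, volume) pairs printed at the resting quote price.
    """
    fills = []
    remaining = volume
    # Walk the asks from the best (lowest) price upward while they overlap the bid.
    for ask_price in sorted(asks):
        if ask_price > price or remaining <= 0:
            break
        traded = min(remaining, asks[ask_price])
        fills.append((ask_price, traded))
        remaining -= traded
        asks[ask_price] -= traded
        if asks[ask_price] <= 0:
            del asks[ask_price]
    return fills, remaining


asks = {101: 2.0, 102: 3.0}
bids = {}
fills, unfilled = match_buy_order(price=101, volume=5.0, asks=asks)
if unfilled > 0:
    # Unfilled volume rests in the book at the order's own price.
    bids[101] = bids.get(101, 0.0) + unfilled
print(fills, unfilled, asks, bids)
```

Here the bid at 101 overlaps only the ask at 101, so 2.0 units print at 101 and the remaining 3.0 units rest as a bid at 101. A sell entry order would mirror this against the bid side.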
MDF note for 0.4.1: the default entry MDF now uses a side-relative
reservation-price mixture. Buy entry intent is spread across deep value,
passive bid, arrival, and small chase zones; sell entry intent mirrors that on
the ask side. This keeps passive limit interest in the MDF itself instead of
adding synthetic fixed walls, while reducing excessive buy-ask and sell-bid
tail mass.
Microstructure note for 0.4.0: order-book replenishment now includes
regime-specific depth shape, resiliency, wall memory by absolute tick,
event-driven volume bursts, dry-up after cancellation pressure, trend
exhaustion, and squeeze pressure derived from short crowding plus recent
one-sided flow. Live order-book and position totals remain cached by price/side,
lots are coalesced by price/kind, and position inventory is kept in bounded
entry-price cohort buckets.
Visualization
from market_wave import Market
market = Market(
    initial_price=10_000,
    gap=10,
    popularity=1.0,
    seed=42,
)
market.step(500)
fig, ax = market.plot(last=220, orderbook_depth=12)
The default market_wave style uses a light multi-panel chart: price/VWAP,
bid and ask orderbook depth heatmaps by simple level, executed volume, and
order-flow imbalance. To keep the legacy three-panel view, pass
orderbook=False.
Dark overlay mode is still available:
fig, ax = market.plot(layout="overlay", style="market_wave_dark")
Synthetic Data
from market_wave import compute_metrics, generate_paths
paths = generate_paths(
    n_paths=100,
    horizon=512,
    config_sampler=lambda path_id: {
        "initial_price": 10_000,
        "gap": 10,
        "popularity": 1.0,
        "seed": 10_000 + path_id,
    },
)
metrics = compute_metrics(paths)
print(metrics.return_std, metrics.volume_mean, metrics.max_drawdown)
print(paths[0].metadata.config_hash)
GeneratedPath.metadata stores seed, config_hash, package version,
regime, and augmentation_strength so synthetic runs can be traced. Pandas is
optional: install market-wave[dataframe] to use to_dataframe().
ValidationMetrics.volatility_clustering_score is computed within each generated
path and aggregated, so independent path boundaries do not affect the diagnostic.
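To see why per-path computation matters, here is one way such a diagnostic could be built. The exact formula behind volatility_clustering_score is not stated in this README, so the sketch assumes a common proxy: the lag-1 autocorrelation of absolute returns, computed inside each path and then averaged. Both function names are hypothetical.

```python
from statistics import mean


def abs_return_autocorr(prices, lag=1):
    """Lag autocorrelation of absolute simple returns within one path."""
    returns = [abs(b / a - 1.0) for a, b in zip(prices, prices[1:])]
    if len(returns) <= lag + 1:
        return 0.0
    mu = mean(returns)
    var = sum((r - mu) ** 2 for r in returns)
    if var == 0:
        return 0.0
    cov = sum((x - mu) * (y - mu) for x, y in zip(returns, returns[lag:]))
    return cov / var


def clustering_score(paths, lag=1):
    """Average the per-path score so no return ever spans two paths."""
    return mean(abs_return_autocorr(path, lag) for path in paths)


calm_then_wild = [100, 100.1, 100.0, 100.1, 100.0, 103, 99, 104, 98, 105]
flat = [100.0] * 10
print(round(clustering_score([calm_then_wild, flat]), 3))
```

Concatenating the paths before computing returns would create an artificial return at every path boundary; scoring each path separately avoids that.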
Pluggable MDF
from market_wave import Market
class CenterSeekingMDF:
    def scores(self, side, intent, relative_ticks, context, signals=None):
        del side, intent, context, signals
        return [-abs(tick) for tick in relative_ticks]
market = Market(initial_price=100, gap=1, mdf_model=CenterSeekingMDF(), seed=7)
step = market.step(1)[0]
print(step.relative_ticks)
print(step.buy_entry_mdf)
Custom MDF models return scores, not probabilities. Treat each score as
log-growth evidence: additive score differences become multiplicative changes
to the previous MDF. Market applies those scores through the stabilized MDF
update described below.
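The "log-growth evidence" reading can be checked numerically. The sketch below deliberately strips the update down to its core, with no persistence, temperature, diffusion, or floor mixing, so it is an illustration of the idea rather than what Market does internally. apply_scores is a hypothetical helper.

```python
import math


def apply_scores(prev_mdf, scores):
    """Multiply each prior mass by exp(score) and renormalize (no smoothing)."""
    top = max(scores)  # subtract the max score for numerical stability
    weights = [p * math.exp(s - top) for p, s in zip(prev_mdf, scores)]
    total = sum(weights)
    return [w / total for w in weights]


relative_ticks = [-2, -1, 0, 1, 2]
prev_mdf = [0.2] * 5
scores = [-abs(tick) for tick in relative_ticks]  # same shape as CenterSeekingMDF
next_mdf = apply_scores(prev_mdf, scores)

# A score gap of 1 between tick 0 and tick 1 scales their mass ratio by e.
ratio_before = prev_mdf[2] / prev_mdf[3]
ratio_after = next_mdf[2] / next_mdf[3]
print(ratio_after / ratio_before, math.e)
```

Because only score differences matter, adding a constant to every score leaves the resulting distribution unchanged.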
Core Concepts
At every step, the market builds relative ticks around the current price:
relative_tick = (price - current_price) / tick_size
relative_ticks = [-grid_radius, ..., 0, ..., +grid_radius]
The simulator maintains four Market Distribution Functions on that relative grid:
- buy_entry_mdf
- sell_entry_mdf
- long_exit_mdf
- short_exit_mdf
Each MDF is normalized. It is not recreated from scratch each step; it evolves from the previous MDF:
logits = persistence * log(MDF_prev(tick) + eps)
+ score(tick) / effective_temperature
proposal = softmax(clamp(logits - max(logits), -50, 0))
MDF_next = Normalize((1 - floor_mix) * Diffuse(proposal) + floor_mix * Uniform)
score(tick) can include placement shape, trend, liquidity attraction, memory,
risk, and order-book imbalance. mdf_temperature controls how sharply scores
reshape the distribution. The effective temperature also includes current volatility, so
high-volatility regimes soften score updates instead of letting one tick absorb
all mass. Persistence, diffusion, and uniform floor mixing prevent repeated
small score advantages from collapsing the MDF into a single tick.
Those relative MDFs are projected onto the pre-trade grid
price_grid = price_before +/- k * gap for order-book formation.
StepInfo.mdf_price_basis records that pre-trade price basis.
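A minimal sketch of that projection follows. It assumes the MDF is a list aligned with relative_ticks and that prices below one tick are dropped with the remaining mass renormalized; that clipping rule and the function name project_mdf are assumptions made for illustration, not a description of the library's internals.

```python
def project_mdf(relative_ticks, mdf, price_before, gap, min_price=None):
    """Map relative-tick masses onto absolute prices around the pre-trade price.

    Prices below `min_price` (default: one tick, i.e. `gap`) are dropped and
    the remaining mass is renormalized. That clipping rule is an assumption.
    """
    floor = gap if min_price is None else min_price
    by_price = {}
    for tick, mass in zip(relative_ticks, mdf):
        price = price_before + tick * gap
        if price >= floor:
            by_price[price] = mass
    total = sum(by_price.values())
    if total == 0:
        return {}
    return {price: mass / total for price, mass in by_price.items()}


relative_ticks = [-2, -1, 0, 1, 2]
mdf = [0.1, 0.2, 0.4, 0.2, 0.1]
print(project_mdf(relative_ticks, mdf, price_before=10_000, gap=10))
print(project_mdf(relative_ticks, mdf, price_before=1, gap=1))
```

The second call shows the low-price case: ticks that would land at or below zero are discarded, so only prices 1, 2, and 3 keep mass.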
low temperature -> sharper, concentrated MDF
high temperature -> wider, smoother MDF
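The update formula and the temperature effect can be sketched in a few lines of plain Python. The formula above does not specify the Diffuse kernel or any default values, so this sketch assumes a simple three-point kernel with reflecting edges and uses a plain temperature in place of the volatility-adjusted effective temperature. update_mdf and all of its defaults are hypothetical.

```python
import math


def update_mdf(prev, scores, temperature=1.0, persistence=0.9,
               diffusion=0.1, floor_mix=0.01, eps=1e-12):
    """One stabilized MDF update following the formula above.

    `diffusion` spreads a fraction of each tick's mass to its neighbours;
    that kernel and all default values are assumptions of this sketch.
    """
    n = len(prev)
    logits = [persistence * math.log(p + eps) + s / temperature
              for p, s in zip(prev, scores)]
    top = max(logits)
    # clamp(logits - max, -50, 0) keeps exp() in a safe range.
    weights = [math.exp(max(-50.0, min(0.0, x - top))) for x in logits]
    total = sum(weights)
    proposal = [w / total for w in weights]

    # Diffuse: keep (1 - diffusion) in place, send half of the rest each way.
    diffused = [0.0] * n
    for i, p in enumerate(proposal):
        diffused[i] += (1.0 - diffusion) * p
        diffused[max(i - 1, 0)] += 0.5 * diffusion * p
        diffused[min(i + 1, n - 1)] += 0.5 * diffusion * p

    # Mix in a uniform floor, then renormalize.
    mixed = [(1.0 - floor_mix) * d + floor_mix / n for d in diffused]
    norm = sum(mixed)
    return [m / norm for m in mixed]


ticks = [-3, -2, -1, 0, 1, 2, 3]
scores = [-abs(t) for t in ticks]
uniform = [1.0 / len(ticks)] * len(ticks)
sharp = update_mdf(uniform, scores, temperature=0.25)
smooth = update_mdf(uniform, scores, temperature=4.0)
print("peak mass, low temperature: ", round(max(sharp), 3))
print("peak mass, high temperature:", round(max(smooth), 3))
```

With the same scores, the low-temperature update concentrates more mass at the center tick than the high-temperature one, while the uniform floor keeps every tick strictly positive.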
MDFs generate aggregate intent. Intensity controls total size. The order book and execution layer then turn that intent into limit flow, taker flow, cancellations, exits, matched volume, and price changes.
Execution Guarantee
Price movement is execution-driven:
- If a step has no executed volume, price_after == price_before.
- If trades execute, price_after is derived from that step's execution statistics. Random quote jitter is bounded and cannot move the price by itself when executions print at the previous price.
- seed makes the simulation reproducible for the same version and inputs.
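The first guarantee is easy to assert over any sequence of step records. The sketch below uses SimpleNamespace stand-ins that carry the StepInfo field names used in this README, so it runs without the package installed; non_execution_moves is a hypothetical helper.

```python
from types import SimpleNamespace


def non_execution_moves(steps):
    """Return indices of steps whose price moved without any executed volume."""
    return [
        index
        for index, step in enumerate(steps)
        if step.total_executed_volume == 0 and step.price_after != step.price_before
    ]


# Stand-in records with the StepInfo field names used in this README.
steps = [
    SimpleNamespace(price_before=100.0, price_after=100.0, total_executed_volume=0.0),
    SimpleNamespace(price_before=100.0, price_after=101.0, total_executed_volume=1.5),
    SimpleNamespace(price_before=101.0, price_after=101.0, total_executed_volume=0.7),
]
print(non_execution_moves(steps))
```

An empty list means the guarantee held for every step. The same function should work on the list returned by market.step(n), assuming those objects expose the same three attributes.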
This is a simulator, not a market data replay engine and not financial advice.
API Overview
from market_wave import (
    Market,
    DynamicMDFModel,
    generate_paths,
    compute_metrics,
    MarketState,
    IntensityState,
    LatentState,
    MDFContext,
    MDFSignals,
    MDFModel,
    RelativeMDFComponent,
    MDFState,
    OrderBookState,
    PositionMassState,
    StepInfo,
)
Useful StepInfo fields include:
- price_before, price_after, price_change
- tick_before, tick_after, tick_change, relative_ticks
- mdf_price_basis, price_grid
- buy_entry_mdf, sell_entry_mdf, long_exit_mdf, short_exit_mdf
- buy_entry_mdf_by_price, sell_entry_mdf_by_price
- entry_volume_by_price, exit_volume_by_price
- buy_volume_by_price, sell_volume_by_price
- executed_volume_by_price, total_executed_volume, trade_count
- market_buy_volume, market_sell_volume
- vwap_price, best_bid_before, best_ask_before, spread_after
- orderbook_before, orderbook_after
- position_mass_before, position_mass_after
buy_volume_by_price and sell_volume_by_price are submitted side-intent maps
keyed by sampled order price, not executed or resting liquidity. market_*
volume fields report the executed incoming buy/sell volume. Unfilled incoming
volume rests in orderbook_after; legacy residual_market_* and
crossed_market_volume fields remain compatibility zeroes in the current
order-book-first engine.
The *_mdf_by_price fields are pre-trade MDF projections keyed by
mdf_price_basis; current Market.state.mdf.*_by_price is reprojected to the
post-trade state price. Examples and public APIs use MDF names only; stale PMF
examples from earlier prototypes should be considered obsolete.
Public Contract and Snapshot Policy
The public import surface is the package __all__: Market, generate_paths,
compute_metrics, generated-path metadata, MDF model/protocol types, metrics,
and the state dataclasses shown above. The entrypoints are intentionally small,
but the observation contract is broad because StepInfo and MarketState
expose detailed simulator diagnostics.
During the current alpha line, existing public names and existing StepInfo /
state fields are kept compatible where practical. New diagnostic fields may be
added in alpha releases. MDF names are the supported public distribution names;
stale PMF names from earlier prototypes are obsolete.
Snapshot mutability: state dataclasses are frozen=True at the attribute level,
but nested dict and list fields are plain mutable containers so to_dict()
and JSON export remain simple. Treat Market.state, StepInfo, and
GeneratedPath.hidden_states as read-only observations. Use Market.snapshot()
when downstream code needs a mutation-safe deep copy of the current state.
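The difference between attribute-level freezing and container mutability is standard Python dataclass behavior and can be shown with a stand-in class. BookSnapshot below is hypothetical and only mimics the shape of a state dataclass; copy.deepcopy stands in for the mutation-safe copy that Market.snapshot() is described as providing.

```python
import copy
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class BookSnapshot:
    """Hypothetical stand-in for a frozen state dataclass with a dict field."""
    best_bid: float
    bid_volume_by_price: dict = field(default_factory=dict)


live = BookSnapshot(best_bid=99.0, bid_volume_by_price={99.0: 2.0})

try:
    live.best_bid = 98.0  # attribute assignment is blocked by frozen=True
except FrozenInstanceError:
    print("attribute assignment rejected")

safe = copy.deepcopy(live)              # independent copy, nested dict included
safe.bid_volume_by_price[99.0] = 0.0    # mutating the copy is harmless
print(live.bid_volume_by_price[99.0])   # the live object is unchanged
```

Writing to live.bid_volume_by_price directly would succeed silently, which is exactly why the live state should be treated as read-only.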
Compatibility note: Market.state remains available as the live current-state
attribute for the alpha line. Future releases may add a more explicit read-model
API or deprecation path for code that mutates state containers in place.
Development
uv sync --extra dev --extra dataframe
uv run ruff check .
uv run pytest
uv build
License
MIT