Configurable causal DAG simulator for synthetic mixed-type data and CI test benchmarks

dagsampler

License: MIT

Documentation · Changelog

What it provides

  • CausalDataGenerator class for configurable simulation
  • Support for custom and random DAGs
  • Mixed continuous/binary/categorical nodes (configurable categorical cardinality)
  • Structural forms: linear, polynomial, interaction, sigmoid, cos, sin, stratum_means
  • Optional element-wise post_transform (tanh, sin, cos, exp_neg_abs, sqrt_abs, relu, sign)
  • Cross-type mechanisms:
    • continuous -> categorical (categorical_model.name = "threshold")
    • categorical -> continuous (functional_form.name = "stratum_means", including mixed-parent cases with metric_weights)
  • Noise models:
    • additive (gaussian, student_t, gamma, exponential, laplace, cauchy, uniform)
    • multiplicative (gaussian, student_t, gamma, exponential)
    • heteroskedastic (abs_first_parent, abs_parent_plus_const, mean_abs_plus_const)
  • Random weight sampling controls (including exclusion band around zero)
  • force_uniform_marginals for balanced exogenous binary / categorical draws
  • Template helpers (chain_config, fork_config, collider_config, independence_config)
  • Reproducibility via seed_structure and seed_data (or single seed)
  • Optional d-separation CI oracle output (store_ci_oracle=true)
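The d-separation oracle in the last bullet answers, from the graph alone, whether two nodes are independent given a conditioning set. Below is a minimal standalone sketch of one standard way to compute this, the moralized ancestral graph criterion. The function name d_separated is illustrative, and this is not necessarily how dagsampler produces its oracle output.

```python
from itertools import combinations


def d_separated(edges, x, y, given=()):
    """Return True if x and y are d-separated by `given` in the DAG `edges`.

    `edges` is a list of [parent, child] pairs. Illustrative helper only,
    not part of the dagsampler API.
    """
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)

    # 1. Keep only x, y, the conditioning set, and all of their ancestors.
    keep, stack = set(), [x, y, *given]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))

    # 2. Moralize: connect each node to its parents and link co-parents.
    adj = {node: set() for node in keep}
    for child in keep:
        pa = parents.get(child, set())
        for p in pa:
            adj[p].add(child)
            adj[child].add(p)
        for a, b in combinations(pa, 2):
            adj[a].add(b)
            adj[b].add(a)

    # 3. Remove the conditioning nodes and test whether y is reachable from x.
    blocked = set(given)
    seen, stack = set(), [x]
    while stack:
        node = stack.pop()
        if node in seen or node in blocked:
            continue
        seen.add(node)
        stack.extend(adj[node])
    return y not in seen


collider = [["X", "Z1"], ["Y", "Z1"]]
chain = [["A", "B"], ["B", "C"]]
```

On the collider, X and Y are d-separated marginally but not once Z1 is conditioned on. On the chain, the pattern is reversed.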

Installation

From PyPI:

pip install dagsampler

Or with uv:

uv venv
source .venv/bin/activate
uv pip install dagsampler

From GitHub (latest main):

uv pip install "dagsampler @ git+https://github.com/averinpa/dagsampler.git"

Random weights away from zero

To guarantee a minimum signal strength on every edge, so that randomly sampled weights cannot effectively mute a parent, configure:

{
  "simulation_params": {
    "random_weight_low": -1.5,
    "random_weight_high": 1.5,
    "random_weight_min_abs": 0.1
  }
}

With these settings, random structural weights are sampled from [-1.5, -0.1] ∪ [0.1, 1.5].
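As an illustration of what sampling from such a union of intervals involves, here is a standalone sketch using only the standard library. The function sample_weight is hypothetical and the side-selection rule (proportional to interval length, so the draw is uniform over the union) is an assumption; the package's own sampler may differ.

```python
import random


def sample_weight(rng, low=-1.5, high=1.5, min_abs=0.1):
    """Draw one weight uniformly from [low, -min_abs] U [min_abs, high].

    Hypothetical helper for illustration; not the dagsampler implementation.
    """
    neg_hi = min(-min_abs, high)   # upper end of the negative interval
    pos_lo = max(min_abs, low)     # lower end of the positive interval
    neg_len = max(0.0, neg_hi - low)
    pos_len = max(0.0, high - pos_lo)
    if neg_len + pos_len == 0.0:
        raise ValueError("exclusion band leaves no admissible weights")
    # Choose a side with probability proportional to its length.
    if rng.random() * (neg_len + pos_len) < neg_len:
        return rng.uniform(low, neg_hi)
    return rng.uniform(pos_lo, high)


rng = random.Random(0)
weights = [sample_weight(rng) for _ in range(1000)]
```

Every value in weights has magnitude of at least 0.1, so no edge receives a coefficient close to zero.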

By default, categorical parents cannot be combined with metric functional forms (linear, polynomial, interaction). To redirect those cases to stratum_means automatically, set:

  • "categorical_parent_metric_form_policy": "stratum_means"
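To show why a categorical parent calls for a stratum-based form rather than a metric one, here is a standalone sketch of the general idea: each category level gets its own mean for the child, instead of the category code being multiplied by a weight. The means, noise scale, and names below are made up for illustration and do not reflect the library's internals.

```python
import random

rng = random.Random(0)

# Hypothetical per-category means for a 3-level categorical parent.
stratum_means = {0: -2.0, 1: 0.0, 2: 3.0}

# Categorical parent, then a continuous child centered on its stratum mean.
parent = [rng.randrange(3) for _ in range(3000)]
child = [stratum_means[c] + rng.gauss(0.0, 0.5) for c in parent]


def stratum_average(level):
    """Empirical mean of the child within one category of the parent."""
    values = [y for c, y in zip(parent, child) if c == level]
    return sum(values) / len(values)
```

The empirical averages per level recover the chosen means, and nothing forces them to be ordered or evenly spaced the way a linear form on the category code would.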

Quick start (Python API)

from dagsampler import CausalDataGenerator

config = {
    "simulation_params": {"n_samples": 200, "seed": 42},
    "graph_params": {
        "type": "custom",
        "nodes": ["X", "Y", "Z1"],
        "edges": [["X", "Z1"], ["Y", "Z1"]],
    },
}

result = CausalDataGenerator(config).simulate()
data = result["data"]
dag = result["dag"]
params = result["parametrization"]
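The graph in this example is a collider (X -> Z1 <- Y), the classic case a CI test benchmark needs to get right. The standalone sketch below does not use dagsampler; it simulates a linear Gaussian collider by hand (the coefficients and noise scale are assumptions) and shows that X and Y are roughly uncorrelated marginally but strongly dependent once Z1 is conditioned on.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def partial_corr(a, b, c):
    """Partial correlation of a and b given a single variable c."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac**2) * (1 - r_bc**2))


rng = random.Random(42)
n = 2000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Collider: Z1 depends on both X and Y plus independent noise.
z1 = [xi + yi + rng.gauss(0.0, 0.5) for xi, yi in zip(x, y)]

marginal = pearson(x, y)              # close to zero
conditional = partial_corr(x, y, z1)  # clearly negative
```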

CLI

The package exposes dagsampler-generate.

dagsampler-generate \
  --config config.json \
  --output dataset.csv \
  --params-out params.json \
  --edges-out edges.json

config.json must have the same structure as the config dictionary passed to CausalDataGenerator.
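One way to produce such a file is to serialize the Quick start dictionary with the standard json module. The sketch below writes it to a temporary directory; the location is arbitrary and chosen only to keep the example self-contained.

```python
import json
import tempfile
from pathlib import Path

# Same configuration as in the Quick start example.
config = {
    "simulation_params": {"n_samples": 200, "seed": 42},
    "graph_params": {
        "type": "custom",
        "nodes": ["X", "Y", "Z1"],
        "edges": [["X", "Z1"], ["Y", "Z1"]],
    },
}

# Write the file that would be passed to dagsampler-generate via --config.
config_path = Path(tempfile.mkdtemp()) / "config.json"
config_path.write_text(json.dumps(config, indent=2))
```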

For heteroskedastic noise, use noise_model.func from:

  • abs_first_parent
  • abs_parent_plus_const
  • mean_abs_plus_const
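The source does not define these three functions, so the sketch below is only one plausible reading of their names: each maps the parent values to a noise scale that then multiplies a standard Gaussian draw. The helper noise_scale, the const parameter, and its default are all assumptions, not the package's API.

```python
import random


def noise_scale(func, parents, const=1.0):
    """Assumed interpretation of the heteroskedastic func names.

    `parents` holds the parent values for one sample. Hypothetical helper;
    the actual dagsampler definitions may differ.
    """
    if func == "abs_first_parent":
        return abs(parents[0])
    if func == "abs_parent_plus_const":
        return abs(parents[0]) + const
    if func == "mean_abs_plus_const":
        return sum(abs(p) for p in parents) / len(parents) + const
    raise ValueError(f"unknown heteroskedastic func: {func}")


rng = random.Random(0)
parents = [-2.0, 1.0]
signal = 0.8 * parents[0] - 0.5 * parents[1]  # made-up structural part
# Noise whose standard deviation depends on the parent values.
child = signal + noise_scale("mean_abs_plus_const", parents) * rng.gauss(0.0, 1.0)
```

Under this reading, abs_first_parent can yield zero noise when the first parent is zero, while the two plus_const variants keep the scale bounded away from zero.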

Development

uv pip install -e ".[dev]"
pytest -q
