Synthetic tabular data generator for causal modeling

dagzoo

dagzoo generates reproducible synthetic tabular corpora from sampled causal structure. The stable adoption layer is a small set of named recipe packs plus stable artifact contracts. Repo-internal authoring under configs/ remains available for advanced work, but it is not the primary public entry point.

Start

Use the packaged CLI when you want the public workflow without a repo checkout:

uv tool install dagzoo
dagzoo recipe list
dagzoo generate --config recipe:default-baseline --num-datasets 25 --out data/default_baseline
dagzoo generate --config recipe:tabpfn-v1-prior-approx --num-datasets 25 --out data/tabpfn_prior
dagzoo filter --in data/default_baseline --out data/default_baseline_filter
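
When the same generate call has to be scripted for several recipes, it can help to assemble the argument list in one place. The sketch below is illustrative only: generate_command is a hypothetical helper (not part of dagzoo), and it uses only the subcommand and flags shown in the commands above.

```python
import shlex


def generate_command(recipe: str, num_datasets: int, out_dir: str) -> list[str]:
    """Build the argv for `dagzoo generate` from a recipe name.

    Hypothetical helper: it only mirrors the flags shown in the README
    (--config, --num-datasets, --out) and does not call dagzoo itself.
    """
    if num_datasets < 1:
        raise ValueError("num_datasets must be at least 1")
    return [
        "dagzoo", "generate",
        "--config", f"recipe:{recipe}",
        "--num-datasets", str(num_datasets),
        "--out", out_dir,
    ]


cmd = generate_command("default-baseline", 25, "data/default_baseline")
# Print the command exactly as it would be typed in a shell.
print(shlex.join(cmd))
```

With dagzoo installed and on PATH, the returned list can be passed to subprocess.run(cmd, check=True).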

Use a repo checkout when you want to edit configs, run docs tooling, or work on the codebase:

./scripts/dev bootstrap
source .venv/bin/activate
./scripts/dev verify quick

For in-process training loops, use the same recipe references through the PyTorch bridge:

from dagzoo import build_dataloader

loader = build_dataloader(
    "recipe:default-baseline",
    num_datasets=10,
    seed=7,
    device="cpu",
)
sample = next(iter(loader))
print(sample["X_train"].shape)

Public Surface

  • dagzoo recipe list shows the curated public catalog.
  • dagzoo generate --config recipe:<name> is the primary reproducible CLI path.
  • build_dataloader("recipe:<name>", ...) is the programmatic equivalent.
  • recipes/*.yaml are the published recipe sources behind those stable names.
  • configs/*.yaml remain useful for advanced/internal authoring, but they move faster than the named recipe surface.

Download files

Source Distribution

dagzoo-0.14.4.tar.gz (568.6 kB)

Uploaded Source

Built Distribution

dagzoo-0.14.4-py3-none-any.whl (245.4 kB)

Uploaded Python 3

File details

Details for the file dagzoo-0.14.4.tar.gz.

File metadata

  • Download URL: dagzoo-0.14.4.tar.gz
  • Upload date:
  • Size: 568.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dagzoo-0.14.4.tar.gz
Algorithm    Hash digest
SHA256       58f901321b77137ac1994ebca8110ff974e1d7c311728a8746a7630288a0cae3
MD5          bd0b11073b2c6d78b9ac358f3f29d351
BLAKE2b-256  67996f94bcbc657eb95f3bc2f4f66b53e02eb9a845615c86e999a2f7efee3021
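
A downloaded file can be checked against the published SHA256 digest with the Python standard library. This is a minimal sketch, not part of dagzoo: sha256_hex and matches_published are hypothetical helper names, and the local file name is an assumption about where the sdist was saved.

```python
import hashlib
from pathlib import Path


def sha256_hex(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published hex digest (case-insensitive)."""
    return sha256_hex(path) == expected_hex.strip().lower()


# Digest copied from the table above for dagzoo-0.14.4.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "58f901321b77137ac1994ebca8110ff974e1d7c311728a8746a7630288a0cae3"
)

# Assumed local path; the check is skipped if the file is not present.
sdist = Path("dagzoo-0.14.4.tar.gz")
if sdist.exists():
    print("sha256 matches:", matches_published(sdist, EXPECTED_SDIST_SHA256))
```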

Provenance

The following attestation bundles were made for dagzoo-0.14.4.tar.gz:

Publisher: package.yml on bensonlee5/dagzoo

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file dagzoo-0.14.4-py3-none-any.whl.

File metadata

  • Download URL: dagzoo-0.14.4-py3-none-any.whl
  • Upload date:
  • Size: 245.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for dagzoo-0.14.4-py3-none-any.whl
Algorithm    Hash digest
SHA256       28fa2b816c2bee1c4a2dbb5dba69470f0a3c8da2286b43bc26f56ce13000ef3d
MD5          c9d59c1b8c682d03d939a82eb83dc4b2
BLAKE2b-256  ad2d6875eb69e72feed1819979b10bc4bdad8da23d35abc7ee01d26948f927ac

Provenance

The following attestation bundles were made for dagzoo-0.14.4-py3-none-any.whl:

Publisher: package.yml on bensonlee5/dagzoo

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
