A Protocol for data-generating processes; minimal interface for analog-estimation toolkits.
A minimal Python Protocol for data-generating processes (DGPs).
What this is
A Protocol (DataGeneratingProcess) with two members – data (a
frozen property returning the observed realization) and
draw(size=..., *, rng=...) (a method returning a fresh realization) –
plus a small set of composition primitives (TwoStageDGP, with_data)
and thin convenience wrappers (EmpiricalDGP, ParametricDGP) for
working with DGPs as first-class objects.
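As a rough illustration of the two-member contract described above, the sketch below declares a structural interface with `typing.Protocol` and a toy class that satisfies it without inheriting from it. The names `DGPLike` and `ToyEmpirical` are hypothetical and are not part of the package; the `draw` signature follows the description above, and the use of the standard-library `random.Random` in place of a NumPy Generator is an assumption made to keep the sketch dependency-free.

```python
import random
from typing import Optional, Protocol, Sequence, runtime_checkable


@runtime_checkable
class DGPLike(Protocol):
    """Hypothetical stand-in for the package's DataGeneratingProcess."""

    @property
    def data(self) -> Sequence[float]:
        """The frozen observed realization."""
        ...

    def draw(
        self, size: Optional[int] = None, *, rng: Optional[random.Random] = None
    ) -> Sequence[float]:
        """Return a fresh realization."""
        ...


class ToyEmpirical:
    """Toy DGP: resamples its stored observations with replacement."""

    def __init__(self, observation: Sequence[float], seed: int = 0) -> None:
        self._data = tuple(observation)  # a tuple keeps the realization immutable
        self._rng = random.Random(seed)

    @property
    def data(self) -> Sequence[float]:
        return self._data

    def draw(self, size=None, *, rng=None):
        rng = rng if rng is not None else self._rng
        n = len(self._data) if size is None else size
        return tuple(rng.choices(self._data, k=n))


toy = ToyEmpirical([1.0, 2.0, 3.0, 4.0], seed=1)
is_dgp = isinstance(toy, DGPLike)  # structural check: no inheritance involved
resample = toy.draw()              # same length as the observed data by default
```

Because the check is structural, any consumer class exposing `data` and `draw` would pass it, which is the sense in which a Protocol lets independent packages interoperate.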
The package is not a library of working DGPs. Concrete DGPs live in
consumer packages; ManifoldGMM, for example, ships its own moment-side
DGPs. The role of DGP_Protocol is to define the contract that lets such
consumers interoperate.
Conceptual lineage
The Protocol promotes the stand-in distribution from Manski's analog
estimation framework (Manski 1988, Analog Estimation Methods in
Econometrics) to a first-class Python object. In that framework, an
estimator is defined by a population functional plus a sample-based
stand-in for the population; DataGeneratingProcess is that stand-in.
Different stand-ins yield different analog estimators:
- The empirical distribution -> nonparametric plug-in estimators.
- A parametric family fitted to the data -> MLE-style estimators.
- A bootstrap distribution -> bootstrap inference.
- A null-imposed restriction -> constrained estimators.
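To make the list above concrete, here is a minimal sketch that applies one population functional, the tail probability P(X > c), to two different stand-ins built from the same sample. The function names are hypothetical, the normal family is chosen purely for illustration, and only the standard library is used. Note that `NormalDist.from_samples` uses the sample standard deviation, so the parametric version is close to, but not exactly, the normal MLE.

```python
import random
from statistics import NormalDist


def tail_prob_empirical(sample, c):
    """Plug-in estimate: the functional applied to the empirical distribution."""
    return sum(x > c for x in sample) / len(sample)


def tail_prob_parametric(sample, c):
    """Same functional applied to a normal distribution fitted to the sample."""
    fitted = NormalDist.from_samples(sample)
    return 1.0 - fitted.cdf(c)


rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]

emp = tail_prob_empirical(sample, 1.0)   # nonparametric plug-in estimator
par = tail_prob_parametric(sample, 1.0)  # MLE-style estimator
```

The functional is identical in both cases; only the stand-in for the population changes, and with it the estimator.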
Installation
pip install DGP_Protocol
The import path is PEP-8 lowercase:
from dgp_protocol import DataGeneratingProcess, EmpiricalDGP, TwoStageDGP
Minimal example
import numpy as np
from dgp_protocol import EmpiricalDGP
data = np.random.default_rng(0).standard_normal(size=(100, 3))
# The DGP owns its own RNG. Pass `seed` for reproducibility;
# `draw()` itself takes no `rng` argument.
dgp = EmpiricalDGP(observation=data, seed=1)
print(dgp.data.shape) # (100, 3) -- the frozen realization
print(dgp.draw().shape) # (100, 3) -- a fresh bootstrap resample
# Rebind to a different realization while keeping the distributional
# structure. The child gets an independent (spawned) Generator.
fresh = dgp.with_data(np.random.default_rng(2).standard_normal(size=(50, 3)))
print(fresh.data.shape) # (50, 3)
For more substantial examples – parametric DGPs, two-stage composition (hierarchical sampling), cluster-block bootstrap – see the test suite under tests/.
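Independently of the test suite, the following sketch suggests how a consumer might use nothing but `draw()` to get a parametric-bootstrap standard error. `ToyNormalDGP` and `bootstrap_se` are hypothetical names, not the package's `ParametricDGP` or any function it ships. The toy class follows the convention in the example above that the DGP owns its RNG, and uses the standard library rather than NumPy.

```python
import random
from statistics import NormalDist, mean, stdev


class ToyNormalDGP:
    """Hypothetical parametric stand-in: a normal fitted to the observation."""

    def __init__(self, observation, seed=0):
        self._data = tuple(observation)
        self._fit = NormalDist.from_samples(self._data)
        self._rng = random.Random(seed)  # the DGP owns its RNG

    @property
    def data(self):
        return self._data

    def draw(self, size=None):
        # Default to a realization the same size as the observed data.
        n = len(self._data) if size is None else size
        return tuple(
            self._rng.gauss(self._fit.mean, self._fit.stdev) for _ in range(n)
        )


def bootstrap_se(dgp, statistic, reps=200):
    """Standard deviation of `statistic` across `reps` fresh draws."""
    return stdev([statistic(dgp.draw()) for _ in range(reps)])


source = random.Random(0)
observed = [source.gauss(5.0, 2.0) for _ in range(100)]

dgp = ToyNormalDGP(observed, seed=1)
se_mean = bootstrap_se(dgp, mean)  # roughly sigma / sqrt(n) for this setup
```

`bootstrap_se` never inspects the concrete class, so swapping in a resampling DGP would turn the same call into a nonparametric bootstrap.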
Design
The design is intentionally minimal: data + draw are the only
required members. Composition primitives (TwoStageDGP, with_data)
take DGPs and return DGPs without expanding the Protocol.
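The idea of composition without widening the interface can be sketched as follows, assuming a hierarchical scheme in which a first stage resamples clusters and a second stage resamples observations within each drawn cluster. `ToyClusterDGP` and `ToyTwoStage` are hypothetical; the constructor and behavior of the package's actual `TwoStageDGP` are not shown here and may differ.

```python
import random


class ToyClusterDGP:
    """First stage: resample whole clusters with replacement."""

    def __init__(self, clusters, seed=0):
        self._data = tuple(tuple(c) for c in clusters)
        self._rng = random.Random(seed)

    @property
    def data(self):
        return self._data

    def draw(self, size=None):
        n = len(self._data) if size is None else size
        return tuple(self._rng.choices(self._data, k=n))


class ToyTwoStage:
    """Hypothetical composition: wraps a DGP and is itself a DGP."""

    def __init__(self, first_stage, seed=0):
        self._first = first_stage
        self._rng = random.Random(seed)

    @property
    def data(self):
        return self._first.data

    def draw(self, size=None):
        drawn = self._first.draw(size)
        # Second stage: resample observations within each drawn cluster.
        return tuple(tuple(self._rng.choices(c, k=len(c))) for c in drawn)


clusters = [[1, 2, 3], [10, 20], [100, 200, 300, 400]]
two_stage = ToyTwoStage(ToyClusterDGP(clusters, seed=1), seed=2)
sample = two_stage.draw()  # three clusters, each resampled internally
```

The wrapper exposes the same `data` and `draw` members as the object it wraps, so it can be passed anywhere a DGP is expected, including into another wrapper.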
The design note that motivated this package lives in the sibling
ManifoldGMM repo at docs/design/dgp.org; DGP_Protocol was extracted
from that design conversation. See also AGENTS.md for the package's
scope discipline and the list of intentionally deferred features.
How to cite
If you use DGP_Protocol in academic work, please cite it. The
repository's CITATION.cff is recognised by GitHub and provides
one-click citation export in APA, BibTeX, and other formats from the
repo's main page.
A BibTeX entry suitable for paper drafts:
@software{ligon_dgp_protocol_2026,
author = {Ligon, Ethan},
title = {DGP\_Protocol: A Protocol for data-generating processes},
year = {2026},
publisher = {GitHub},
url = {https://github.com/ligon/DGP_Protocol},
version = {0.1.0a0},
license = {BSD-3-Clause},
}
License
BSD 3-Clause (BSD-3-Clause). See the LICENSE file at the root of
this repository. In short: permissive use including commercial,
modification, and redistribution; preserve the copyright notice and
license text in redistributions; no use of the author's name to endorse
derived products.
Author
Ethan Ligon, UC Berkeley.