
Measure-theoretic Generalized Method of Moments estimation; estimation via E_mu.

#+TITLE: emu-gmm
#+SUBTITLE: Measure-theoretic GMM; estimation via $\mathbb{E}_\mu$
#+AUTHOR: Ethan Ligon
#+OPTIONS: toc:nil num:nil

=emu-gmm= is a JAX-native framework for Generalized Method of Moments estimation. The framework is named for the operator at its centre: =emu= reads as $\mathbb{E}_\mu$, the expectation under a measure $\mu$. One operator interface --- implemented against empirical, analytical, or synthetic measures --- drives sample estimation, identification analysis, and simulation-based inference through a single computational pipeline.
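
As a rough illustration of the single-operator idea (not the package's actual classes), the sketch below defines a hypothetical =Measure= protocol with one =expect= method and three toy implementations in plain NumPy: an empirical mean over data, a Monte Carlo mean over simulated standard-normal draws, and Gauss-Hermite quadrature against the standard normal. Every name here (=Measure=, =Empirical=, =Synthetic=, =Analytical=, =expect=) is an assumption made for the example; the real interfaces are the ones specified in =docs/design.org=.

#+begin_src python
from typing import Callable, Protocol

import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# psi(x, theta): residual evaluated at points x (hypothetical scalar-moment signature)
Residual = Callable[[np.ndarray, float], np.ndarray]


class Measure(Protocol):
    """Hypothetical interface: anything that can integrate a residual."""

    def expect(self, psi: Residual, theta: float) -> float: ...


class Empirical:
    """E_mu as a sample mean over observed data."""

    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)

    def expect(self, psi, theta):
        return float(psi(self.data, theta).mean())


class Synthetic:
    """E_mu as a Monte Carlo mean over simulated N(0, 1) draws."""

    def __init__(self, seed, n_sim):
        self.draws = np.random.default_rng(seed).standard_normal(n_sim)

    def expect(self, psi, theta):
        return float(psi(self.draws, theta).mean())


class Analytical:
    """E_mu under N(0, 1) by Gauss-Hermite quadrature (probabilists' weight)."""

    def __init__(self, n_nodes=5):
        nodes, weights = hermegauss(n_nodes)
        self.nodes = nodes
        # hermegauss weights integrate exp(-x^2 / 2); normalise to a density.
        self.weights = weights / np.sqrt(2.0 * np.pi)

    def expect(self, psi, theta):
        return float(self.weights @ psi(self.nodes, theta))


def psi(x, theta):
    """Toy residual: E[x^2] - theta = 0 identifies theta as the second moment."""
    return x**2 - theta


theta = 0.7
measures = [Empirical([-1.0, 0.0, 1.0, 2.0]), Synthetic(seed=0, n_sim=200_000), Analytical()]
values = {type(m).__name__: m.expect(psi, theta) for m in measures}
for name, value in values.items():
    print(f"{name:>10}: E_mu[psi] = {value:.4f}")
#+end_src

The same =psi= is handed to all three objects; only the measure behind =expect= changes, which is the pattern the paragraph above describes.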

The most important architectural commitment is that variance construction is orthogonal to integration: $\mathbb{E}_\mu[\psi]$ is a property of the measure; $V_\mu(\theta)$ is a property of a separate =CovarianceStrategy= that can be swapped (iid vs cluster-robust vs replicate-weight) without changing what is being integrated.
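
A minimal NumPy sketch of that separation, under stated assumptions: =expect=, =iid_covariance=, and =cluster_covariance= are hypothetical helper names, not the package's =CovarianceStrategy= classes, and the formulas are the textbook iid and cluster-robust estimators for the covariance of a moment mean. The point is only that the moment vector is computed once and the covariance rule is swapped independently of it.

#+begin_src python
import numpy as np


def psi(x, theta):
    """Toy two-moment residual: u = y - theta, instrumented by 1 and z."""
    y, z = x[:, 0], x[:, 1]
    u = y - theta
    return np.column_stack([u, u * z])


def expect(psi, x, theta):
    """E_mu[psi] under the empirical measure: a per-coordinate mean."""
    return psi(x, theta).mean(axis=0)


def iid_covariance(psi, x, theta):
    """Covariance of the moment mean, treating observations as independent."""
    r = psi(x, theta)
    d = r - r.mean(axis=0)
    n = r.shape[0]
    return d.T @ d / n**2


def cluster_covariance(psi, x, theta, clusters):
    """Cluster-robust variant: sum deviations within each cluster first."""
    r = psi(x, theta)
    d = r - r.mean(axis=0)
    n = r.shape[0]
    sums = np.stack([d[clusters == c].sum(axis=0) for c in np.unique(clusters)])
    return sums.T @ sums / n**2


rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
clusters = np.repeat(np.arange(50), 4)  # 50 clusters of 4 observations
theta = 0.1

g_bar = expect(psi, x, theta)  # what is integrated: unchanged below
V_iid = iid_covariance(psi, x, theta)
V_cl = cluster_covariance(psi, x, theta, clusters)
# Sanity check: one observation per cluster collapses to the iid formula.
V_singleton = cluster_covariance(psi, x, theta, np.arange(200))
#+end_src

Swapping =iid_covariance= for =cluster_covariance= changes the standard errors but never touches =g_bar=.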

* Status

v2 (v0.2.0): the v1 pipeline plus Riemannian-manifold parameter geometry and the stratified / design-aware covariance ladder. The implemented menu --- everything importable and composable today --- is tabulated in =docs/design.org= under "Contract surface (implemented)", the single source of truth for status claims; the green gate is =make check= (ruff + black + mypy + the full pytest suite). The three measure paths (synthetic, analytical, empirical) all run end-to-end against the bundled multi-asset Euler example, and the empirical path additionally has a real-data acceptance test (=tests/test_estimator_realdata.py=). The architecture and theoretical scope are pinned in =docs/=.

* Quickstart

Bootstrap and try the worked example:

#+begin_src shell
make setup # poetry install + .venv/
poetry run python examples/run_euler.py # multi-asset Hansen-Singleton demo
#+end_src

The minimal estimation surface in code (synthetic-data variant):

#+begin_src python
import jax
from emu_gmm import estimate, SyntheticMeasure, SyntheticCovariance
from emu_gmm.examples.euler import (
    EulerParams, euler_residual, euler_sampler_factory,
)

measure = SyntheticMeasure(
    key=jax.random.PRNGKey(0),
    n_sim=5000,
    sampler=euler_sampler_factory(5000),
)

result = estimate(
    model=euler_residual,
    measure=measure,
    covariance=SyntheticCovariance(),
    theta_init=EulerParams(beta=0.9, gamma=1.0),
)

print(f"beta = {result.theta_hat.beta:.4f} (truth 0.96)")
print(f"gamma = {result.theta_hat.gamma:.4f} (truth 2.00)")
print(f"J-stat = {result.J_stat:.3f} dof={result.J_dof} p={result.J_pvalue:.3f}")
print(result.to_pandas()["Sigma_theta"])
#+end_src

The same =euler_residual= drives the analytical and empirical variants of the demo; only the =Measure= / =CovarianceStrategy= change. See =docs/api-sketch.org= Section 5 for the three side-by-side, and =examples/run_euler.py= for the runnable script.

* Documents

- =docs/howto.org= --- *start here.* Integration HOWTO: the architecture
at the level you need to wire an application to =estimate()= --- what
code you write (the moment function, the parameter container, the
measure/covariance choice) and what the framework hands back.
- =docs/design.org= --- architecture specification (four review rounds; stable). Its "Contract surface (implemented)" section is the single source of truth for the implemented menu.
- =docs/api-sketch.org= --- the v1 API surface, retained as the v1 design record; superseded on status by =design.org='s contract-surface section.
- =docs/implementation-plan.org= --- phased task list (Phases 1-7 complete, Phase 8 polish underway; the v2 roadmap is Section 13).
- =docs/mcar-asymptotics.org= --- companion theoretical note; consistency, asymptotic normality, PD properties of the pairwise-overlap estimator under MCAR.
- =docs/refs.bib= --- project-local bibliography (entries not in =~/bibtex/main.bib=).

* Setup

Requires Python >= 3.11 and [[https://python-poetry.org/][Poetry]].

#+begin_src shell
make setup
#+end_src

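This runs =poetry install= and creates =.venv/= in the project root. Activate with =direnv allow= or =poetry shell= (on Poetry 2.x, =poetry shell= requires the =poetry-plugin-shell= plugin; =poetry env activate= prints the activation command instead).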

* Development

#+begin_src shell
make check # ruff + black + mypy + full pytest
make quick-check # same, skipping slow tests
make test # pytest only
#+end_src

To install pre-commit hooks (ruff + black on every commit):

#+begin_src shell
poetry run pre-commit install
#+end_src

* Layout

#+begin_example
docs/                    design specs and theoretical notes
src/emu_gmm/
  __init__.py            public API re-exports
  types.py               protocols + EstimationResult / Diagnostics
  estimator.py           estimate() / build_estimator() entry points
  measures/              SyntheticMeasure, AnalyticalMeasure, EmpiricalMeasure
  covariance/            the CovarianceStrategy ladder (iid, clustered,
                         stratified / design-aware, sum, analytical, synthetic)
  weighting.py           Identity, Fixed, IteratedWeighting, ContinuouslyUpdated
  regularization.py      DiagonalTikhonov
  penalty.py             TikhonovPenalty
  optimizer.py           optimistix_lm, scipy_lm, linear_solver
  manifolds/             parameter geometry (Euclidean, Positive, PSDFixedRank,
                         Product, ManifoldLeaf) + riemannian_lm
  parameter_space.py     ParameterSpace / on field-to-manifold declarations
  inference/             j_test, k_statistic, k_confidence_set, bootstraps
  numerics/              ridge_inverse
  studies/               Monte Carlo driver (subpackage-only; not re-exported)
  diagnostics.py         build_diagnostics, log_to_stdout
  examples/              shared example models (euler.py)
  _internal/             axes, params, cholesky, labels (private)
tests/                   test suite mirroring src/emu_gmm/
examples/                runnable demo scripts
Makefile                 build automation
pyproject.toml           Poetry configuration
poetry.lock              pinned dependencies (committed)
#+end_example

* Public API

Everything user-facing is re-exported at the package top level
(=from emu_gmm import ...=), with one deliberate exception: the Monte
Carlo =studies= subpackage is subpackage-only (import from
=emu_gmm.studies= directly). The implemented menu --- entry points,
measures, covariance strategies, weighting, regularization, penalty,
optimisers, manifold geometry, inference helpers, and result types ---
is tabulated, one line of semantics and a module pointer per row, in
=docs/design.org= under "Contract surface (implemented)". That table is
the single source of truth this README defers to;
=sorted(emu_gmm.__all__)= is the export list it tracks.

* Using emu-gmm correctly (and reporting problems)

=emu-gmm= is intended as a /spare set of correct interfaces/. Your only
modelling input is a per-observation residual =psi(x_i, theta) -> R^M=;
the package owns the moment expectation, the design-aware covariance
=V_X=, the criterion, the \(J\)-statistic, standard errors, and
\(p\)-values. Two rules keep estimation correct:

1. *Read results off the package, never recompute them by hand.* Take the
criterion, =J_stat=, =standard_errors=, and \(p\)-values from
=EstimationResult= / =k_statistic= / the inference helpers. A
hand-rolled criterion is how subtle scaling bugs get reintroduced ---
notably the textbook habit of scaling moments by \(\sqrt{N}\) and
weighting by \(V_X\), which is correct only when every moment has the
/same/ number of observations. =emu-gmm= keeps moments as
per-coordinate means and folds all the per-moment \(N_j\) bookkeeping
into \(V_X\) (the \(1/(N_j N_k)\) normalisation); see =docs/design.org=
"Scaling convention" and =docs/mcar-asymptotics.org=.

2. *Express per-moment observability only through the =mask=.* Different
moments may be observed for different numbers of units; the \((N, M)\)
=mask= on =EmpiricalMeasure= is the single place that lives. Do not
work around it by pre-aggregating or by assuming a common \(N\).
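
To make the two rules concrete, here is a plain-NumPy sketch of per-coordinate means with a pairwise \(1/(N_j N_k)\) covariance normalisation. It is an illustration only: =masked_moments= is a hypothetical helper, not the package's code, and the exact estimator =emu-gmm= uses (centring, overlap handling) is the one documented in =docs/mcar-asymptotics.org=. With a full mask the criterion built from means and \(V_X\) coincides with the textbook \(\sqrt{N}\)-scaled one; with a partial mask there is no common \(N\) to scale by, which is why the counts are folded into \(V_X\).

#+begin_src python
import numpy as np


def masked_moments(psi_vals, mask):
    """Per-coordinate moment means and a pairwise-normalised covariance.

    psi_vals: (N, M) residuals; entries where mask is False are ignored.
    mask:     (N, M) boolean, True where moment j is observed for unit i.
    """
    m = mask.astype(float)
    r = np.where(mask, psi_vals, 0.0)   # unobserved entries (even NaN) drop out
    n_j = m.sum(axis=0)                 # per-moment counts N_j
    g_bar = r.sum(axis=0) / n_j         # per-coordinate means
    d = m * (r - g_bar)                 # deviations, zero where unobserved
    V = (d.T @ d) / np.outer(n_j, n_j)  # 1 / (N_j N_k) normalisation
    return g_bar, V, n_j


rng = np.random.default_rng(0)
N, M = 500, 2
psi_vals = rng.normal(size=(N, M)) + 0.05

# Full mask: every moment observed for every unit.
full = np.ones((N, M), dtype=bool)
g_full, V_full, _ = masked_moments(psi_vals, full)
S = np.cov(psi_vals, rowvar=False, bias=True)            # textbook per-observation covariance
J_means = g_full @ np.linalg.solve(V_full, g_full)       # means weighted by V^{-1}
J_textbook = N * (g_full @ np.linalg.solve(S, g_full))   # sqrt(N)-scaled moments

# Partial mask: the second moment is observed for the first 300 units only.
mask = full.copy()
mask[300:, 1] = False
psi_partial = np.where(mask, psi_vals, np.nan)
g_part, V_part, n_part = masked_moments(psi_partial, mask)
print("N_j =", n_part, " g_bar =", g_part)
#+end_src

In the full-mask case the two criteria agree; in the partial-mask case only the mean-plus-\(V_X\) form is well defined, because =J_textbook= has no single \(N\) to use.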

If an interface is missing a knob, or you believe it gives a wrong
answer: *file an issue against =emu-gmm=, do not reimplement the
statistic in your own project.* A local reimplementation forfeits the
"single correct implementation" guarantee for everyone downstream. The
fix belongs in the shared package.

* License

[[file:LICENSE.org][CC-BY-NC-SA-4.0]].
