RegLabels

Project description

ValidMLInference

This repository hosts the code for the ValidMLInference package, which implements the bias-correction methods described in Battaglia, Christensen, Hansen & Sacher (2024). The two core functions are:

ols_bca

This procedure first computes the standard OLS estimator on a design matrix (Xhat), the first column of which contains AI/ML-generated binary labels, and then applies an additive correction based on an estimate (fpr) of the false-positive rate computed externally. The method also adjusts the variance estimator with a finite-sample correction term to account for the uncertainty in the bias estimation.

Parameters
----------
Y : array_like, shape (n,)
    Response variable vector.
Xhat : array_like, shape (n, d)
    Design matrix, the first column of which contains the AI/ML-generated binary labels.
fpr : float
    Externally computed estimate of the classifier's false-positive rate, used to correct the OLS estimates.
m : int or float
    Size of the external sample used to estimate the classifier's false-positive rate. Can be set to 'inf' when the false-positive rate is known exactly.

Returns
-------
b : ndarray, shape (d,)
    Bias-corrected regression coefficient estimates.
V : ndarray, shape (d, d)
    Adjusted variance-covariance matrix for the bias-corrected estimator.
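
The additive correction described above can be illustrated with a small NumPy simulation. This is only a sketch under stated assumptions, not the package's implementation: the helper `ols_bias_correction_sketch` is hypothetical, the correction is assumed to take the first-order form `b + fpr * Gamma @ b`, `fpr` is taken to be the share of observations that are false positives, the labels are flipped symmetrically, and the variance adjustment involving `m` is omitted.

```python
import numpy as np

def ols_bias_correction_sketch(Y, Xhat, fpr):
    """Hypothetical helper (not the package's ols_bca): OLS followed by a
    first-order additive correction driven by the first column of Xhat."""
    n, d = Xhat.shape
    sXX = Xhat.T @ Xhat / n                       # sample second-moment matrix
    b_ols = np.linalg.solve(sXX, Xhat.T @ Y / n)  # standard OLS estimate
    A = np.zeros((d, d))
    A[0, 0] = 1.0                                 # picks out the generated-label column
    Gamma = np.linalg.solve(sXX, A)
    b_bca = b_ols + fpr * Gamma @ b_ols           # assumed form of the correction
    return b_ols, b_bca

# Simulated data: a true binary regressor and a noisy generated label.
rng = np.random.default_rng(0)
n = 200_000
X_true = rng.binomial(1, 0.5, size=n)
flip = rng.random(n) < 0.10                       # symmetric misclassification
X_label = np.where(flip, 1 - X_true, X_true)
Y = 1.0 + 2.0 * X_true + rng.normal(size=n)       # true slope is 2.0

# Generated labels go in the first column, the intercept in the second.
Xhat = np.column_stack([X_label, np.ones(n)])

# Here the true labels are known, so the false-positive share is computed
# directly; in practice it would come from an external labeled sample.
fpr = np.mean((X_label == 1) & (X_true == 0))

b_ols, b_bca = ols_bias_correction_sketch(Y, Xhat, fpr)
print("OLS slope:      ", b_ols[0])
print("corrected slope:", b_bca[0])
```

In this simulation the uncorrected slope is attenuated toward zero, and the corrected slope lies closer to the true value of 2.0. Because the correction is first-order, it does not remove the bias entirely.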

one_step_unlabeled

This method jointly estimates the upstream (measurement) and downstream (regression) models using only the unlabeled likelihood. Leveraging JAX for automatic differentiation and optimization, it minimizes the negative log-likelihood to obtain the regression coefficients. The variance is then approximated via the inverse Hessian at the optimum.

Parameters
----------
Y : array_like, shape (n,)
    Response variable vector.
Xhat : array_like, shape (n, d)
    Design matrix constructed from AI/ML-generated regressors.
homoskedastic : bool, optional (default: False)
    If True, assumes a common error variance; otherwise, separate error variances are estimated.
distribution : callable, optional (default: Normal(0,1))
    Distribution of the error terms. A custom distribution can be passed as any JAX-compatible PDF function that takes inputs (x, loc, scale).

Returns
-------
b : ndarray, shape (d,)
    Estimated regression coefficients extracted from the optimized parameter vector.
V : ndarray, shape (d, d)
    Estimated variance-covariance matrix for the regression coefficients, computed as the inverse 
    of the Hessian of the objective function.
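
The inverse-Hessian variance step can be sketched with JAX on a deliberately simplified problem. The example below is an assumption-laden illustration, not the package's joint likelihood: it fits a plain regression `Y = X @ b + e` with a known unit error scale, uses hypothetical names (`make_nll`, `b_hat`), and replaces whatever optimizer the package uses with a few Newton steps. It does show a PDF with the `(x, loc, scale)` signature being passed in, here `jax.scipy.stats.norm.pdf`.

```python
import jax
import jax.numpy as jnp
import numpy as np
from jax.scipy.stats import norm

jax.config.update("jax_enable_x64", True)  # double precision for a stable Hessian

def make_nll(Y, X, pdf):
    """Build the negative log-likelihood of Y = X @ b + e, where the errors
    follow pdf(x, loc, scale) with loc=0 and scale=1 (a simplification)."""
    def nll(b):
        resid = Y - X @ b
        return -jnp.sum(jnp.log(pdf(resid, 0.0, 1.0)))
    return nll

# Simulated data, for illustration only.
rng = np.random.default_rng(0)
n = 2_000
X = jnp.asarray(np.column_stack([rng.binomial(1, 0.4, n), np.ones(n)]))
Y = X @ jnp.array([2.0, 1.0]) + jnp.asarray(rng.normal(size=n))

nll = make_nll(Y, X, norm.pdf)
grad_fn = jax.grad(nll)      # gradient via automatic differentiation
hess_fn = jax.hessian(nll)   # Hessian via automatic differentiation

# Newton iterations; for a Gaussian likelihood the first step already
# lands on the optimum, the remaining steps leave it unchanged.
b_hat = jnp.zeros(X.shape[1])
for _ in range(5):
    b_hat = b_hat - jnp.linalg.solve(hess_fn(b_hat), grad_fn(b_hat))

# Variance approximated by the inverse Hessian at the optimum.
V = jnp.linalg.inv(hess_fn(b_hat))
print("b_hat:", b_hat)
print("standard errors:", jnp.sqrt(jnp.diag(V)))
```

With Gaussian errors and unit scale, the Hessian of this objective is `X.T @ X`, so `V` coincides with the textbook OLS variance for a unit error variance. A different error distribution can be tried by swapping in another PDF with the same signature, although the Newton loop may then need more iterations or a line search.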
