RegLabels

ValidMLInference

This repository hosts the code for the ValidMLInference package, which implements the bias-correction methods described in Battaglia, Christensen, Hansen & Sacher (2024). The two core functions are:

ols_bca

This procedure first computes the standard OLS estimator on a design matrix (Xhat), the first column of which contains AI/ML-generated binary labels. It then applies an additive correction based on an externally computed estimate (fpr) of the classifier's false-positive rate. The variance estimator is also adjusted with a finite-sample correction term that accounts for the uncertainty in the estimated bias.

Parameters
----------
Y : array_like, shape (n,)
    Response variable vector.
Xhat : array_like, shape (n, d)
    Design matrix, the first column of which contains the AI/ML-generated binary covariates.
fpr : float
    Estimated false-positive rate of the classifier that produced the labels, used to correct the OLS estimates.
m : int or float
    Size of the external sample used to estimate the classifier's false-positive rate. Can be set to 'inf' when the false-positive rate is known exactly.

Returns
-------
b : ndarray, shape (d,)
    Bias-corrected regression coefficient estimates.
V : ndarray, shape (d, d)
    Adjusted variance-covariance matrix for the bias-corrected estimator.
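
The additive correction described above can be illustrated with a short NumPy sketch. This is not the package's implementation: the function name additive_bias_correction_sketch is hypothetical, and the correction formula (OLS estimate plus fpr times the inverse second-moment matrix applied to the first coefficient) is an assumption about the general form of such a correction. The sketch also assumes that fpr is the joint probability that a label is 1 while the true value is 0, and that false positives and false negatives are equally likely. The variance adjustment is not covered.

```python
import numpy as np

def additive_bias_correction_sketch(Y, Xhat, fpr):
    """Hypothetical helper: OLS followed by an assumed additive correction."""
    Y = np.asarray(Y, dtype=float)
    Xhat = np.asarray(Xhat, dtype=float)
    n, d = Xhat.shape

    # Standard OLS on the design matrix with generated labels.
    sXX = Xhat.T @ Xhat / n
    b_ols = np.linalg.solve(sXX, Xhat.T @ Y / n)

    # Assumed form of the correction: only the first column (the labels)
    # is measured with error, so A selects the first coefficient.
    A = np.zeros((d, d))
    A[0, 0] = 1.0
    Gamma = fpr * np.linalg.solve(sXX, A)
    b_corrected = b_ols + Gamma @ b_ols
    return b_ols, b_corrected

# Simulated data: true binary regressor, labels flipped 10% of the time.
rng = np.random.default_rng(0)
n = 100_000
X_true = rng.binomial(1, 0.5, size=n)
flip = rng.random(n) < 0.1
X_label = np.where(flip, 1 - X_true, X_true)
Y = 2.0 * X_true + 1.0 + rng.normal(size=n)

# First column: generated labels; second column: intercept.
Xhat = np.column_stack([X_label, np.ones(n)])

# Joint probability of (label = 1, truth = 0) in this simulation: 0.5 * 0.1.
fpr = 0.05

b_ols, b_corrected = additive_bias_correction_sketch(Y, Xhat, fpr)
print("OLS estimate:      ", b_ols)
print("Corrected estimate:", b_corrected)
```

In this simulation the true slope is 2.0. The plain OLS slope is attenuated by the label noise, and the corrected slope should land closer to the true value.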

one_step_unlabeled

This method jointly estimates the upstream (measurement) and downstream (regression) models using only the unlabeled likelihood. Leveraging JAX for automatic differentiation and optimization, it minimizes the negative log-likelihood to obtain the regression coefficients. The variance is then approximated via the inverse Hessian at the optimum.

Parameters
----------
Y : array_like, shape (n,)
    Response variable vector.
Xhat : array_like, shape (n, d)
    Design matrix constructed from AI/ML-generated regressors.
homoskedastic : bool, optional (default: False)
    If True, assumes a common error variance; otherwise, separate error variances are estimated.
distribution : callable, optional (default: Normal(0, 1))
    Distribution of the error terms. A custom distribution can be passed as any JAX-compatible PDF function that takes inputs (x, loc, scale).

Returns
-------
b : ndarray, shape (d,)
    Estimated regression coefficients extracted from the optimized parameter vector.
V : ndarray, shape (d, d)
    Estimated variance-covariance matrix for the regression coefficients, computed as the inverse 
    of the Hessian of the objective function.
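
The last step, approximating the variance by the inverse Hessian of the negative log-likelihood, can be sketched on a toy problem. The example below is not the package's joint likelihood: it fits a plain linear model with a unit-variance normal error, and it uses NumPy finite differences in place of JAX automatic differentiation to keep dependencies light. All function names are hypothetical. The PDF follows the (x, loc, scale) signature described above; a PDF passed to the package would presumably need to be written with jax.numpy operations instead.

```python
import numpy as np

def normal_pdf(x, loc, scale):
    """Error density with the (x, loc, scale) signature."""
    z = (x - loc) / scale
    return np.exp(-0.5 * z ** 2) / (scale * np.sqrt(2.0 * np.pi))

def neg_log_lik(b, Y, X, pdf=normal_pdf):
    """Negative log-likelihood of a linear model with unit error scale."""
    return -np.sum(np.log(pdf(Y, X @ b, 1.0)))

def numerical_gradient(f, b, h=1e-5):
    """Central finite-difference gradient (stand-in for autodiff)."""
    g = np.zeros(b.size)
    for i in range(b.size):
        step = np.zeros(b.size)
        step[i] = h
        g[i] = (f(b + step) - f(b - step)) / (2.0 * h)
    return g

def numerical_hessian(f, b, h=1e-4):
    """Central finite-difference Hessian, symmetrized."""
    d = b.size
    H = np.zeros((d, d))
    for i in range(d):
        step = np.zeros(d)
        step[i] = h
        H[:, i] = (numerical_gradient(f, b + step)
                   - numerical_gradient(f, b - step)) / (2.0 * h)
    return 0.5 * (H + H.T)

# Toy data: one regressor plus an intercept.
rng = np.random.default_rng(1)
n = 500
X = np.column_stack([rng.normal(size=n), np.ones(n)])
Y = X @ np.array([1.0, -0.5]) + rng.normal(size=n)

def objective(b):
    return neg_log_lik(b, Y, X)

# Minimize the negative log-likelihood with a few Newton steps.
b_hat = np.zeros(2)
for _ in range(5):
    H = numerical_hessian(objective, b_hat)
    g = numerical_gradient(objective, b_hat)
    b_hat = b_hat - np.linalg.solve(H, g)

# Variance approximation: inverse Hessian at the optimum.
V_hat = np.linalg.inv(numerical_hessian(objective, b_hat))
print("Coefficients:", b_hat)
print("Variance estimate:\n", V_hat)
```

For this toy model the Hessian equals X'X, so V_hat should match the textbook OLS variance with unit error variance, which gives a simple check on the logic.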
