Package for Robust Statistics


delphi.ai package
=================

Install via pip: ``pip install delphi.ai``

This library holds a collection of algorithms that can be used to debias models that have been affected by truncation or missing data. A few projects using the library can be found here:

* `Code for Efficient Truncated Linear Regression with Unknown Noise Variance <https://github.com/pstefanou12/Truncated-Regression-With-Unknown-Noise-Variance-NeurIPS-2021>`_

We demonstrate how to use the library in a set of walkthroughs and our API reference. The functionality provided by the library is listed in the contents below.

For best results with the package, the data should be standardized to mean 0 and variance 1.
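
A minimal sketch of such standardization with plain ``torch`` operations (``X`` here is a hypothetical data matrix, not a library object):

.. code-block:: python

  import torch as ch

  X = ch.randn(1000, 5) * 3.0 + 2.0   # hypothetical raw data
  # standardize each feature to mean 0 and variance 1
  X_std = (X - X.mean(dim=0)) / X.std(dim=0)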

Before running PSGD, the library uses an internal function to check that all of the required arguments for running the procedure have been provided. All other hyperparameters can be provided by the user, or their default values will be used. The current default hyperparameters can be seen in the ``delphi.utils.defaults`` module.
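
For instance, a hedged sketch of overriding a couple of these defaults through a ``Parameters`` object (the values here are illustrative only, not recommendations):

.. code-block:: python

  from delphi import oracle
  from delphi.utils.helpers import Parameters

  phi = oracle.Left_Distribution(0.0)   # membership oracle, as in the examples below
  alpha = 0.9                           # illustrative survival probability

  # phi and alpha are the required arguments; anything not listed
  # falls back to the defaults in delphi.utils.defaults
  train_kwargs = Parameters({'phi': phi,
                             'alpha': alpha,
                             'epochs': 30,     # override the default epoch budget
                             'lr': 1e-2,       # override the default learning rate
                             'early_stopping': True})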

For logging experiment information, we use MadryLab's `cox <https://github.com/MadryLab/cox>`_ library. For more information and tutorials on how to use the logging framework, check out the link.

Contents:

* `distributions <#distributions>`__: the distributions module includes algorithms for learning from censored (known truncation) and truncated (unknown truncation; unsupervised learning) distributions

  * `CensoredNormal <#CensoredNormal>`__
  * `CensoredMultivariateNormal <#CensoredMultivariateNormal>`__
  * `TruncatedNormal <#TruncatedNormal>`__
  * `TruncatedMultivariateNormal <#TruncatedMultivariateNormal>`__
  * `TruncatedBernoulli <#TruncatedBernoulli>`__

* `stats <#stats>`__: the stats module includes models for regression and classification from truncated samples

  * `TruncatedLinearRegression <#TruncatedLinearRegression>`__
  * `TruncatedLassoRegression <#TruncatedLassoRegression>`__
  * `TruncatedRidgeRegression <#TruncatedRidgeRegression>`__
  * `TruncatedElasticNetRegression <#TruncatedElasticNetRegression>`__
  * `TruncatedLogisticRegression <#TruncatedLogisticRegression>`__
  * `TruncatedProbitRegression <#TruncatedProbitRegression>`__

distributions
=============

CensoredNormal:
---------------

``CensoredNormal`` learns censored normal distributions by maximizing the truncated log likelihood. The algorithm that we use for this procedure is described in the following paper `Efficient Statistics in High Dimensions from Truncated Samples <https://arxiv.org/abs/1809.03986>`_.

When fitting censored normal distributions, the user needs three things: an oracle, a ``Callable`` that indicates whether a sample falls within the truncation set; the model's ``alpha``, the survival probability; and the ``CensoredNormal`` module. The ``CensoredNormal`` module accepts a parameters object that the user can define for running the PSGD procedure.

Parameters:
-----------

* ``args`` (delphi.utils.Parameters): parameters object that holds hyperparameters for experiment. Possible hyperparameters include:

  * ``phi`` (Callable): required argument; callable class that receives a num_samples by 1 input ``torch.Tensor`` and returns a num_samples by 1 ``Tensor`` of ``0``/``1`` values indicating membership in ``S``.
  * ``alpha`` (float): required argument; survival probability of the truncation set
  * ``variance`` (float): the distribution's variance; if the variance is given, only the mean is estimated
  * ``epochs`` (int): maximum number of times to iterate over dataset
  * ``trials`` (int): maximum number of trials to perform PSGD; after trials, model with smallest loss on the dataset is returned
  * ``val`` (float): percentage of dataset to use for validation set; default .2
  * ``lr`` (float): initial learning rate to use for regression weights; default 1e-1
  * ``step_lr`` (int): number of gradient steps to take before adjusting learning rate by value ``step_lr_gamma``; default 100
  * ``step_lr_gamma`` (float): amount to adjust learning rate, every ``step_lr`` steps ``new_lr = curr_lr * step_lr_gamma``
  * ``custom_lr_multiplier`` (str): `cosine` or `cyclic` for cosine annealing learning rate scheduling or cyclic learning rate scheduling; default None
  * ``momentum`` (float): momentum; default 0.0
  * ``adam`` (bool): use adam adaptive learning rate optimizer; default False
  * ``eps`` (float): epsilon denominator for gradients (ie. to prevent divide by zero calculations); default 1e-5
  * ``r`` (float): initial projection set radius; default 1.0
  * ``rate`` (float): at the end of each trial, the projection set radius is increased at rate `rate`; default 1.5
  * ``batch_size`` (int): the number of samples to use for each gradient step; default 50
  * ``tol`` (float): if using early stopping, threshold for when to stop; default 1e-3
  * ``workers`` (int): number of workers to use for procedure; default 1
  * ``num_samples`` (int): number of samples to sample from the distribution in the gradient for each sample in the batch (ie. if batch size is 10, and num_samples is 100, then each gradient step will sample 100 * 10 samples from a gaussian distribution); default 50
  * ``early_stopping`` (bool): whether to check loss for convergence; compares the best avg validation loss at the end of an epoch with the current avg epoch loss estimate; if :math:`best_loss - curr_loss < tol` for `n_iter_no_change` epochs, the procedure terminates; default False
  * ``n_iter_no_change`` (int): number of iterations to check for change before declaring convergence; default 5
  * ``verbose`` (bool): whether to print a verbose output with loss logs, etc.; default False

* ``store`` (cox.store.Store): logging object to keep track of the distribution's train and validation losses

Attributes:

* ``loc_`` (torch.Tensor): distribution's estimated mean
* ``variance_`` (torch.Tensor): distribution's estimated variance

In the following code block, we show an example of how to use the censored normal distribution module:
   
.. code-block:: python

  from delphi.distributions.censored_normal import CensoredNormal
  from delphi import oracle
  from delphi.utils.helpers import Parameters
  from cox.store import Store

  OUT_DIR = 'PATH_TO_EXPERIMENT_LOGGING_DIRECTORY'
  store = Store(OUT_DIR)

  # left truncate 0 (ie. S = {x >= 0 for all x in S})
  phi = oracle.Left_Distribution(0.0)
  # pass algorithm parameters in through Parameters object
  train_kwargs = Parameters({'phi': phi, 
                              'alpha': alpha})
  # define censored normal distribution object
  censored = CensoredNormal(train_kwargs, store=store)
  # fit to dataset
  censored.fit(S)
  # close store 
  store.close()
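
The example assumes a censored dataset ``S`` and a survival probability ``alpha``. As a hedged sketch (synthetic values for illustration, not part of the library's API), such left-censored data could be generated as follows:

.. code-block:: python

  import torch as ch

  # draw from a ground-truth normal distribution N(1, 4)
  samples = 1.0 + 2.0 * ch.randn(10000, 1)
  # keep only the samples that fall in S = {x >= 0}
  S = samples[samples >= 0].unsqueeze(1)
  # empirical survival probability of the truncation set
  alpha = S.size(0) / samples.size(0)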

CensoredMultivariateNormal:
---------------------------
``CensoredMultivariateNormal`` learns censored multivariate normal distributions, by maximizing the truncated log likelihood.
The algorithm that we use for this procedure is described in the following
paper `Efficient Statistics in High Dimensions from Truncated Samples <https://arxiv.org/abs/1809.03986>`_.

When fitting censored multivariate normal distributions, the user needs three things: an oracle, a ``Callable`` that 
indicates whether a sample falls within the truncation set; the model's ``alpha``, the survival probability; and the ``CensoredMultivariateNormal`` module. The ``CensoredMultivariateNormal`` module accepts 
a parameters object that the user can define for running the PSGD procedure.

Parameters:
-----------

* ``args`` (delphi.utils.Parameters): parameters object that holds hyperparameters for experiment. Possible hyperparameters include:

  * ``phi`` (Callable): required argument; callable class that receives a num_samples by 1 input ``torch.Tensor`` and returns a num_samples by 1 ``Tensor`` of ``0``/``1`` values indicating membership in ``S``.
  * ``alpha`` (float): required argument; survival probability of the truncation set
  * ``covariance_matrix`` (torch.Tensor): the distribution's covariance matrix; if the covariance matrix is given, only the mean vector is estimated
  * ``epochs`` (int): maximum number of times to iterate over dataset
  * ``trials`` (int): maximum number of trials to perform PSGD; after trials, model with smallest loss on the dataset is returned
  * ``val`` (float): percentage of dataset to use for validation set; default .2
  * ``lr`` (float): initial learning rate to use for regression weights; default 1e-1
  * ``step_lr`` (int): number of gradient steps to take before adjusting learning rate by value ``step_lr_gamma``; default 100
  * ``step_lr_gamma`` (float): amount to adjust learning rate, every ``step_lr`` steps ``new_lr = curr_lr * step_lr_gamma``
  * ``custom_lr_multiplier`` (str): `cosine` or `cyclic` for cosine annealing learning rate scheduling or cyclic learning rate scheduling; default None
  * ``momentum`` (float): momentum; default 0.0 
  * ``adam`` (bool): use adam adaptive learning rate optimizer; default False
  * ``eps`` (float): epsilon denominator for gradients (ie. to prevent divide by zero calculations); default 1e-5
  * ``r`` (float): initial projection set radius; default 1.0
  * ``rate`` (float): at the end of each trial, the projection set radius is increased at rate `rate`; default 1.5
  * ``batch_size`` (int): the number of samples to use for each gradient step; default 50
  * ``tol`` (float): if using early stopping, threshold for when to stop; default 1e-3
  * ``workers`` (int): number of workers to use for procedure; default 1
  * ``num_samples`` (int): number of samples to sample from the distribution in the gradient for each sample in the batch (ie. if batch size is 10, and num_samples is 100, then each gradient step will sample 100 * 10 samples from a gaussian distribution); default 50
  * ``early_stopping`` (bool): whether to check loss for convergence; compares the best avg validation loss at the end of an epoch with the current avg epoch loss estimate; if :math:`best_loss - curr_loss < tol` for `n_iter_no_change` epochs, the procedure terminates; default False
  * ``n_iter_no_change`` (int): number of iterations to check for change before declaring convergence; default 5
  * ``verbose`` (bool): whether to print a verbose output with loss logs, etc.; default False 

* ``store`` (cox.store.Store): logging object to keep track of the distribution's train and validation losses

Attributes:

* ``loc_`` (torch.Tensor): distribution's estimated mean
* ``covariance_matrix_`` (torch.Tensor): distribution's estimated covariance matrix

In the following code block, we show an example of how to use the censored multivariate normal distribution module:

.. code-block:: python

  from torch import Tensor
  from delphi.distributions.censored_multivariate_normal import CensoredMultivariateNormal
  from delphi import oracle
  from delphi.utils.helpers import Parameters
  from cox.store import Store

  OUT_DIR = 'PATH_TO_EXPERIMENT_LOGGING_DIRECTORY'
  store = Store(OUT_DIR)

  # left truncate at 0 (ie. S = {x >= 0 for all x in S})
  phi = oracle.Left_Distribution(Tensor([0.0, 0.0]))
  # pass algorithm parameters in through Parameters object
  train_kwargs = Parameters({'phi': phi, 
                             'alpha': alpha})
  # define censored multivariate normal distribution object
  censored = CensoredMultivariateNormal(train_kwargs, store=store)
  # fit to dataset
  censored.fit(S)
  # close store
  store.close()

TruncatedNormal:
----------------

``TruncatedNormal`` learns truncated normal distributions, with unknown truncation, by maximizing the truncated log likelihood. The algorithm that we use for this procedure is described in the following paper `Efficient Truncated Statistics with Unknown Truncation <https://arxiv.org/abs/1908.01034>`_.

When fitting truncated normal distributions, the user needs to import the ``TruncatedNormal`` module. The ``TruncatedNormal`` module accepts a parameters object that the user can define for running the PSGD procedure. When *debiasing* truncated normal distributions, we don't require a membership oracle, as the truncation set is unknown. However, after running our procedure, we are able to provide an approximation of the truncation set. Since the user would input a membership oracle through the ``args`` object when the truncation set is known, we add the learned membership oracle to the ``args`` object as well.

**NOTE:** when learning truncation sets, the user should not pass a ``Parameters`` object literal directly into the ``TruncatedNormal`` constructor; define it as a variable first, because otherwise they will not be able to access the ``Parameters`` object (and its learned oracle) afterwards.

Parameters:
-----------

* ``args`` (delphi.utils.Parameters): parameters object that holds hyperparameters for experiment. Possible hyperparameters include:

  * ``alpha`` (float): required argument; survival probability of the truncation set
  * ``variance`` (float): the distribution's variance; if the variance is given, only the mean is estimated
  * ``epochs`` (int): maximum number of times to iterate over dataset
  * ``trials`` (int): maximum number of trials to perform PSGD; after trials, model with smallest loss on the dataset is returned
  * ``val`` (float): percentage of dataset to use for validation set; default .2
  * ``lr`` (float): initial learning rate to use for regression weights; default 1e-1
  * ``step_lr`` (int): number of gradient steps to take before adjusting learning rate by value ``step_lr_gamma``; default 100
  * ``step_lr_gamma`` (float): amount to adjust learning rate, every ``step_lr`` steps ``new_lr = curr_lr * step_lr_gamma``
  * ``custom_lr_multiplier`` (str): `cosine` or `cyclic` for cosine annealing learning rate scheduling or cyclic learning rate scheduling; default None
  * ``momentum`` (float): momentum; default 0.0
  * ``adam`` (bool): use adam adaptive learning rate optimizer; default False
  * ``eps`` (float): epsilon denominator for gradients (ie. to prevent divide by zero calculations); default 1e-5
  * ``r`` (float): initial projection set radius; default 1.0
  * ``rate`` (float): at the end of each trial, the projection set radius is increased at rate `rate`; default 1.5
  * ``batch_size`` (int): the number of samples to use for each gradient step; default 50
  * ``tol`` (float): if using early stopping, threshold for when to stop; default 1e-3
  * ``workers`` (int): number of workers to use for procedure; default 1
  * ``num_samples`` (int): number of samples to sample from the distribution in the gradient for each sample in the batch (ie. if batch size is 10, and num_samples is 100, then each gradient step will sample 100 * 10 samples from a gaussian distribution); default 50
  * ``early_stopping`` (bool): whether to check loss for convergence; compares the best avg validation loss at the end of an epoch with the current avg epoch loss estimate; if :math:`best_loss - curr_loss < tol` for `n_iter_no_change` epochs, the procedure terminates; default False
  * ``n_iter_no_change`` (int): number of iterations to check for change before declaring convergence; default 5
  * ``verbose`` (bool): whether to print a verbose output with loss logs, etc.; default False
  * ``d`` (int): degree of expansion to use for Hermite polynomial when learning truncation set; default 100

* ``store`` (cox.store.Store): logging object to keep track of the distribution's train and validation losses

Attributes:

* ``loc_`` (torch.Tensor): distribution's estimated mean
* ``variance_`` (torch.Tensor): distribution's estimated variance

In the following code block, we show an example of how to fit the truncated normal distribution module:
   
.. code-block:: python

  from delphi.distributions.truncated_normal import TruncatedNormal
  from delphi import oracle
  from delphi.utils.helpers import Parameters
  from cox.store import Store

  OUT_DIR = 'PATH_TO_EXPERIMENT_LOGGING_DIRECTORY'
  store = Store(OUT_DIR)

  # left truncate 0 (ie. S = {x >= 0 for all x in S})
  phi = oracle.Left_Distribution(0.0)
  # pass algorithm parameters in through Parameters object
  train_kwargs = Parameters({'phi': phi, 
                              'alpha': alpha, 
                              'd': 100})
  # define truncated normal distribution object
  truncated = TruncatedNormal(train_kwargs, store=store)
  # fit to dataset
  truncated.fit(S)
  # close store 
  store.close()

After fitting the distribution, we now have a membership oracle learned through a Hermite polynomial. In the following code block, 
we show an example of how to use the membership oracle:

.. code-block:: python

  import torch as ch
  from torch.distributions.multivariate_normal import MultivariateNormal 

  # generate samples from a standard multivariate normal distribution
  M = MultivariateNormal(ch.zeros(1,), ch.eye(1))
  samples = M.rsample([1000,])
  # filter samples with the learned membership oracle
  filtered = train_kwargs.phi(samples)

TruncatedMultivariateNormal:
----------------------------
``TruncatedMultivariateNormal`` learns truncated multivariate normal distributions, with unknown truncation, by maximizing the truncated log likelihood.
The algorithm that we use for this procedure is described in the following
paper `Efficient Truncated Statistics with Unknown Truncation <https://arxiv.org/abs/1908.01034>`_.

When fitting truncated multivariate normal distributions, the user needs to import the ``TruncatedMultivariateNormal`` module. The ``TruncatedMultivariateNormal`` module accepts 
a parameters object that the user can define for running the PSGD procedure. When *debiasing* truncated multivariate normal distributions, we don't require a membership 
oracle, as the truncation set is unknown. However, after running our procedure, we are able to provide an approximation of the truncation set. Since the user 
would input a membership oracle through the ``args`` object when the truncation set is known, we add the learned membership oracle to the ``args`` object as well.


**NOTE:** when learning truncation sets, the user should not pass a ``Parameters`` object literal directly into the ``TruncatedMultivariateNormal`` constructor; define it 
as a variable first, because otherwise they will not be able to access the ``Parameters`` object (and its learned oracle) afterwards.

Parameters:
-----------

* ``args`` (delphi.utils.Parameters): parameters object that holds hyperparameters for experiment. Possible hyperparameters include:

  * ``phi`` (Callable): required argument; callable class that receives a num_samples by 1 input ``torch.Tensor`` and returns a num_samples by 1 ``Tensor`` of ``0``/``1`` values indicating membership in ``S``.
  * ``alpha`` (float): required argument; survival probability of the truncation set
  * ``covariance_matrix`` (torch.Tensor): the distribution's covariance matrix; if the covariance matrix is given, only the mean vector is estimated
  * ``epochs`` (int): maximum number of times to iterate over dataset
  * ``trials`` (int): maximum number of trials to perform PSGD; after trials, model with smallest loss on the dataset is returned
  * ``val`` (float): percentage of dataset to use for validation set; default .2
  * ``lr`` (float): initial learning rate to use for regression weights; default 1e-1
  * ``step_lr`` (int): number of gradient steps to take before adjusting learning rate by value ``step_lr_gamma``; default 100
  * ``step_lr_gamma`` (float): amount to adjust learning rate, every ``step_lr`` steps ``new_lr = curr_lr * step_lr_gamma``
  * ``custom_lr_multiplier`` (str): `cosine` or `cyclic` for cosine annealing learning rate scheduling or cyclic learning rate scheduling; default None
  * ``momentum`` (float): momentum; default 0.0 
  * ``adam`` (bool): use adam adaptive learning rate optimizer; default False
  * ``eps`` (float): epsilon denominator for gradients (ie. to prevent divide by zero calculations); default 1e-5
  * ``r`` (float): initial projection set radius; default 1.0
  * ``rate`` (float): at the end of each trial, the projection set radius is increased at rate `rate`; default 1.5
  * ``batch_size`` (int): the number of samples to use for each gradient step; default 50
  * ``tol`` (float): if using early stopping, threshold for when to stop; default 1e-3
  * ``workers`` (int): number of workers to use for procedure; default 1
  * ``num_samples`` (int): number of samples to sample from the distribution in the gradient for each sample in the batch (ie. if batch size is 10, and num_samples is 100, then each gradient step will sample 100 * 10 samples from a gaussian distribution); default 50
  * ``early_stopping`` (bool): whether to check loss for convergence; compares the best avg validation loss at the end of an epoch with the current avg epoch loss estimate; if :math:`best_loss - curr_loss < tol` for `n_iter_no_change` epochs, the procedure terminates; default False
  * ``n_iter_no_change`` (int): number of iterations to check for change before declaring convergence; default 5
  * ``verbose`` (bool): whether to print a verbose output with loss logs, etc.; default False 
  * ``d`` (int): degree of expansion to use for Hermite polynomial when learning truncation set; default 100

* ``store`` (cox.store.Store): logging object to keep track of the distribution's train and validation losses

Attributes:

* ``loc_`` (torch.Tensor): distribution's estimated mean
* ``covariance_matrix_`` (torch.Tensor): distribution's estimated covariance matrix

In the following code block, we show an example of how to use the truncated multivariate normal distribution module:

.. code-block:: python

  from torch import Tensor
  from delphi.distributions.truncated_multivariate_normal import TruncatedMultivariateNormal
  from delphi.utils.helpers import Parameters
  from delphi import oracle
  from cox.store import Store

  OUT_DIR = 'PATH_TO_EXPERIMENT_LOGGING_DIRECTORY'
  store = Store(OUT_DIR)

  # left truncate at 0 (ie. S = {x >= 0 for all x in S})
  phi = oracle.Left_Distribution(Tensor([0.0, 0.0]))
  # pass algorithm parameters in through Parameters object
  train_kwargs = Parameters({'phi': phi, 
                             'alpha': alpha, 
                             'd': 100})
  # define truncated multivariate normal distribution object
  truncated = TruncatedMultivariateNormal(train_kwargs, store=store)
  # fit to dataset
  truncated.fit(S)
  # close store
  store.close()

After fitting the distribution, we now have a membership oracle learned through a Hermite polynomial. In the following code block, we show an example of how to use the membership oracle:

.. code-block:: python

  import torch as ch
  from torch.distributions.multivariate_normal import MultivariateNormal

  # generate samples from a standard multivariate normal distribution
  M = MultivariateNormal(ch.zeros(2,), ch.eye(2))
  samples = M.rsample([1000,])
  # filter samples with the learned membership oracle
  filtered = train_kwargs.phi(samples)

TruncatedBernoulli:
-------------------

``TruncatedBernoulli`` learns truncated Boolean product distributions by maximizing the truncated log likelihood. The algorithm that we use for this procedure is described in the following paper `Efficient Parameter Estimation of Truncated Boolean Product Distributions <https://arxiv.org/abs/2007.02392>`_.

When fitting truncated Boolean product distributions, the user needs to import the ``TruncatedBernoulli`` module. The ``TruncatedBernoulli`` module accepts a parameters object that the user can define for running the PSGD procedure.

Parameters:
-----------

* ``args`` (delphi.utils.Parameters): parameters object that holds hyperparameters for experiment. Possible hyperparameters include:

  * ``phi`` (Callable): required argument; callable class that receives a num_samples by 1 input ``torch.Tensor`` and returns a num_samples by 1 ``Tensor`` of ``0``/``1`` values indicating membership in ``S``.
  * ``alpha`` (float): required argument; survival probability of the truncation set
  * ``epochs`` (int): maximum number of times to iterate over dataset
  * ``trials`` (int): maximum number of trials to perform PSGD; after trials, model with smallest loss on the dataset is returned
  * ``val`` (float): percentage of dataset to use for validation set; default .2
  * ``lr`` (float): initial learning rate to use for regression weights; default 1e-1
  * ``step_lr`` (int): number of gradient steps to take before adjusting learning rate by value ``step_lr_gamma``; default 100
  * ``step_lr_gamma`` (float): amount to adjust learning rate, every ``step_lr`` steps ``new_lr = curr_lr * step_lr_gamma``
  * ``custom_lr_multiplier`` (str): `cosine` or `cyclic` for cosine annealing learning rate scheduling or cyclic learning rate scheduling; default None
  * ``momentum`` (float): momentum; default 0.0
  * ``adam`` (bool): use adam adaptive learning rate optimizer; default False
  * ``eps`` (float): epsilon denominator for gradients (ie. to prevent divide by zero calculations); default 1e-5
  * ``r`` (float): initial projection set radius; default 1.0
  * ``rate`` (float): at the end of each trial, the projection set radius is increased at rate `rate`; default 1.5
  * ``batch_size`` (int): the number of samples to use for each gradient step; default 50
  * ``tol`` (float): if using early stopping, threshold for when to stop; default 1e-3
  * ``workers`` (int): number of workers to use for procedure; default 1
  * ``num_samples`` (int): number of samples to sample from the distribution in the gradient for each sample in the batch (ie. if batch size is 10, and num_samples is 100, then each gradient step will sample 100 * 10 samples); default 50
  * ``early_stopping`` (bool): whether to check loss for convergence; compares the best avg validation loss at the end of an epoch with the current avg epoch loss estimate; if :math:`best_loss - curr_loss < tol` for `n_iter_no_change` epochs, the procedure terminates; default False
  * ``n_iter_no_change`` (int): number of iterations to check for change before declaring convergence; default 5
  * ``verbose`` (bool): whether to print a verbose output with loss logs, etc.; default False

* ``store`` (cox.store.Store): logging object to keep track of the distribution's train and validation losses

Attributes:

* ``probs_`` (torch.Tensor): distribution's d-dimensional probability vector
* ``logits_`` (torch.Tensor): distribution's d-dimensional logits vector (log probabilities)

In the following code block, we show an example of how to use the truncated Bernoulli distribution module:
   
.. code-block:: python

  from torch import Tensor
  from delphi.distributions.truncated_boolean_product import TruncatedBernoulli
  from delphi.utils.helpers import Parameters
  from delphi import oracle
  from cox.store import Store

  OUT_DIR = 'PATH_TO_EXPERIMENT_LOGGING_DIRECTORY'
  store = Store(OUT_DIR)

  # sum floor truncate at 50 (ie. S = {x.sum() >= 50 for all x in S})
  phi = oracle.Sum_Floor(50)
  # pass algorithm parameters in through Parameters object
  train_kwargs = Parameters({'phi': phi, 
                              'alpha': alpha})
  # define truncated bernoulli distribution object
  trunc_bool = TruncatedBernoulli(train_kwargs, store=store)
  # fit to dataset
  trunc_bool.fit(S)
  # close store 
  store.close()
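
The example again assumes a truncated dataset ``S`` and a survival probability ``alpha``. A hedged sketch (synthetic values for illustration only) of generating Boolean product data truncated by the sum-floor oracle:

.. code-block:: python

  import torch as ch

  # ground-truth Boolean product distribution over 100 coordinates
  probs = 0.6 * ch.ones(100)
  samples = ch.distributions.Bernoulli(probs).sample([10000])
  # keep only the samples in S = {x : x.sum() >= 50}
  S = samples[samples.sum(dim=1) >= 50]
  # empirical survival probability of the truncation set
  alpha = S.size(0) / samples.size(0)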

stats
=====

TruncatedLinearRegression:
--------------------------
``TruncatedLinearRegression`` learns truncated linear regression models when the noise 
variance is known or unknown. In the known-variance setting we use the algorithm described in the following
paper: `Computationally and Statistically Efficient Truncated Regression <https://arxiv.org/abs/2010.12000>`_. When 
the variance of the ground-truth linear regression model is unknown, we use the algorithm described in 
the following paper: `Efficient Truncated Linear Regression with Unknown Noise Variance`.

When fitting truncated regression models, the user needs three things: an oracle, a ``Callable`` that 
indicates whether a sample falls within the truncation set; the model's ``alpha``, the survival probability; and the ``TruncatedLinearRegression`` module. The ``TruncatedLinearRegression`` module accepts 
a parameters object that the user can define for running the PSGD procedure.

Parameters:
-----------

* ``args`` (delphi.utils.Parameters): parameters object that holds hyperparameters for experiment. Possible hyperparameters include:

  * ``phi`` (Callable): required argument; callable class that receives a num_samples by 1 input ``torch.Tensor`` and returns a num_samples by 1 ``Tensor`` of ``0``/``1`` values indicating membership in ``S``.
  * ``alpha`` (float): required argument; survival probability for truncated regression
  * ``epochs`` (int): maximum number of times to iterate over dataset
  * ``noise_var`` (float): the noise variance, if the noise variance for the truncated regression model is known; otherwise the unknown variance procedure is run by default
  * ``fit_intercept`` (bool): whether to fit the intercept or not; default True
  * ``trials`` (int): maximum number of trials to perform PSGD; after trials, model with smallest loss on the dataset is returned
  * ``val`` (float): percentage of dataset to use for validation set; default .2
  * ``lr`` (float): initial learning rate to use for regression weights; default 1e-1
  * ``var_lr`` (float): initial learning rate to use for variance parameters when running the unknown variance procedure
  * ``step_lr`` (int): number of gradient steps to take before adjusting learning rate by value ``step_lr_gamma``; default 100
  * ``step_lr_gamma`` (float): amount to adjust learning rate, every ``step_lr`` steps ``new_lr = curr_lr * step_lr_gamma``
  * ``custom_lr_multiplier`` (str): `cosine` or `cyclic` for cosine annealing learning rate scheduling or cyclic learning rate scheduling; default None
  * ``momentum`` (float): momentum; default 0.0
  * ``adam`` (bool): use adam adaptive learning rate optimizer; default False
  * ``eps`` (float): epsilon denominator for gradients (ie. to prevent divide by zero calculations); default 1e-5
  * ``r`` (float): initial projection set radius; default 1.0
  * ``rate`` (float): at the end of each trial, the projection set radius is increased at rate `rate`; default 1.5
  * ``normalize`` (bool): our methods assume that :math:`\max_{i} \|x_{i}\|_{2} \leq 1`, so before running the procedure, you must divide the input features :math:`X = \{x_{(1)}, x_{(2)}, \ldots, x_{(n)}\}` by :math:`\max_{i} \|x_{i}\|_{2} \cdot \sqrt{k}`, where :math:`k` represents the number of dimensions the input features have; by default the procedure normalizes the features for the user (see the sketch after this list)
  * ``batch_size`` (int): the number of samples to use for each gradient step; default 50
  * ``tol`` (float): if using early stopping, threshold for when to stop; default 1e-3
  * ``workers`` (int): number of workers to use for procedure; default 1
  * ``num_samples`` (int): number of samples to sample from the distribution in the gradient for each sample in the batch (ie. if batch size is 10, and num_samples is 100, then each gradient step will sample 100 * 10 samples from a gaussian distribution); default 50
  * ``early_stopping`` (bool): whether to check loss for convergence; compares the best avg validation loss at the end of an epoch with the current avg epoch loss estimate; if :math:`best_loss - curr_loss < tol` for `n_iter_no_change` epochs, the procedure terminates; default False
  * ``n_iter_no_change`` (int): number of iterations to check for change before declaring convergence; default 5
  * ``verbose`` (bool): whether to print a verbose output with loss logs, etc.; default False

* ``store`` (cox.store.Store): logging object to keep track of the regression's train and validation losses
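
As a minimal sketch of what the ``normalize`` flag does for you (an illustration of the scaling above, not the library's internal code):

.. code-block:: python

  import torch as ch

  def normalize_features(X: ch.Tensor) -> ch.Tensor:
      # k is the number of feature dimensions
      k = X.size(1)
      # largest l2 norm over the rows x_i of X
      max_norm = X.norm(dim=1).max()
      # after this scaling, max_i ||x_i||_2 = 1 / sqrt(k) <= 1
      return X / (max_norm * k ** 0.5)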

Attributes:

* ``coef_`` (torch.Tensor): regression weight coefficients
* ``intercept_`` (torch.Tensor): regression intercept term
* ``variance_`` (torch.Tensor): if the noise variance is unknown, this property provides its estimate

In the following code block, we show an example of how to use the library with unknown noise variance:

.. code-block:: python

  from delphi.stats.truncated_linear_regression import TruncatedLinearRegression
  from delphi import oracle
  from delphi.utils.helpers import Parameters
  from cox.store import Store

  OUT_DIR = 'PATH_TO_EXPERIMENT_LOGGING_DIRECTORY'
  store = Store(OUT_DIR)

  # left truncate linear regression at 0 (ie. S = {y >= 0 for all (x, y) in S})
  phi = oracle.Left_Regression(0.0)
  # pass algorithm parameters in through Parameters object
  train_kwargs = Parameters({'phi': phi, 
                             'alpha': alpha})
  # define truncated linear regression object
  trunc_reg = TruncatedLinearRegression(train_kwargs, store=store)
  # fit to dataset
  trunc_reg.fit(X, y)
  # close store
  store.close()
  # make predictions with new regression
  print(trunc_reg.predict(X))

Methods:
~~~~~~~~

* ``predict(X)``: predict regression points for input feature matrix X (num_samples by features)
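
To see the effect of the debiasing, one can compare the truncated estimator against a naive least-squares fit on the same truncated data. This is a hedged sketch reusing ``X``, ``y``, and ``trunc_reg`` from the example above:

.. code-block:: python

  import torch as ch

  # naive ordinary least squares on the truncated data (biased)
  X_aug = ch.cat([X, ch.ones(X.size(0), 1)], dim=1)  # append intercept column
  ols = ch.linalg.lstsq(X_aug, y).solution
  print('naive OLS weights:      ', ols[:-1].flatten())
  # debiased estimate from the truncated regression procedure
  print('truncated reg. weights: ', trunc_reg.coef_.flatten())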

TruncatedLassoRegression:
--------------------------
``TruncatedLassoRegression`` learns truncated LASSO regression models when the noise 
variance is known. In the known setting we use the algorithm described in the following
paper `Truncated Linear Regression in High Dimensions <https://arxiv.org/abs/2007.14539>`_.

When fitting truncated LASSO regression models, the user needs three things: an oracle, a ``Callable`` that 
indicates whether a sample falls within the truncation set; the model's ``alpha``, the survival probability; and the ``TruncatedLassoRegression`` module. The ``TruncatedLassoRegression`` module accepts 
a parameters object that the user can define for running the PSGD procedure.

Parameters:
-----------

* ``args`` (delphi.utils.Parameters): parameters object that holds hyperparameters for experiment. Possible hyperparameters include:

  * ``phi`` (Callable): required argument; callable class that receives a num_samples by 1 input ``torch.Tensor`` and returns a num_samples by 1 ``Tensor`` of ``0``/``1`` values indicating membership in ``S``.
  * ``alpha`` (float): required argument; survival probability for truncated regression
  * ``l1`` (float): l1 regularization
  * ``epochs`` (int): maximum number of times to iterate over dataset
  * ``noise_var`` (float): the noise variance, if the noise variance for the truncated regression model is known; otherwise the unknown variance procedure is run by default
  * ``fit_intercept`` (bool): whether to fit the intercept or not; default True
  * ``trials`` (int): maximum number of trials to perform PSGD; after trials, model with smallest loss on the dataset is returned
  * ``val`` (float): percentage of dataset to use for validation set; default .2
  * ``lr`` (float): initial learning rate to use for regression weights; default 1e-1
  * ``var_lr`` (float): initial learning rate to use for variance parameters when running the unknown variance procedure
  * ``step_lr`` (int): number of gradient steps to take before adjusting learning rate by value ``step_lr_gamma``; default 100
  * ``step_lr_gamma`` (float): amount to adjust learning rate, every ``step_lr`` steps ``new_lr = curr_lr * step_lr_gamma``
  * ``custom_lr_multiplier`` (str): `cosine` or `cyclic` for cosine annealing learning rate scheduling or cyclic learning rate scheduling; default None
  * ``momentum`` (float): momentum; default 0.0
  * ``adam`` (bool): use adam adaptive learning rate optimizer; default False
  * ``eps`` (float): epsilon denominator for gradients (ie. to prevent divide by zero calculations); default 1e-5
  * ``r`` (float): initial projection set radius; default 1.0
  * ``rate`` (float): at the end of each trial, the projection set radius is increased at rate `rate`; default 1.5
  * ``normalize`` (bool): our methods assume that :math:`\max_{i} \|x_{i}\|_{2} \leq 1`, so before running the procedure, you must divide the input features :math:`X = \{x_{(1)}, x_{(2)}, \ldots, x_{(n)}\}` by :math:`\max_{i} \|x_{i}\|_{2} \cdot \sqrt{k}`, where :math:`k` represents the number of dimensions the input features have; by default the procedure normalizes the features for the user
  * ``batch_size`` (int): the number of samples to use for each gradient step; default 50
  * ``tol`` (float): if using early stopping, threshold for when to stop; default 1e-3
  * ``workers`` (int): number of workers to use for procedure; default 1
  * ``num_samples`` (int): number of samples to sample from the distribution in the gradient for each sample in the batch (ie. if batch size is 10, and num_samples is 100, then each gradient step will sample 100 * 10 samples from a gaussian distribution); default 50
  * ``early_stopping`` (bool): whether to check loss for convergence; compares the best avg validation loss at the end of an epoch with the current avg epoch loss estimate; if :math:`best_loss - curr_loss < tol` for `n_iter_no_change` epochs, the procedure terminates; default False
  * ``n_iter_no_change`` (int): number of iterations to check for change before declaring convergence; default 5
  * ``verbose`` (bool): whether to print a verbose output with loss logs, etc.; default False

* ``store`` (cox.store.Store): logging object to keep track of the lasso regression's train and validation losses

Attributes:

* ``coef_`` (torch.Tensor): regression weight coefficients
* ``intercept_`` (torch.Tensor): regression intercept term
* ``variance_`` (torch.Tensor): if the noise variance is unknown, this property provides its estimate

In the following code block, we show an example of how to use the truncated lasso regression module with known noise variance:
   
.. code-block:: python
  
  from delphi.stats.truncated_lasso_regression import TruncatedLassoRegression
  from delphi import oracle  
  from delphi.utils.helpers import Parameters
  from cox.store import Store

  OUT_DIR = 'PATH_TO_EXPERIMENT_LOGGING_DIRECTORY'
  store = Store(OUT_DIR)

  # left truncate lasso regression at 0 (ie. S = {y >= 0 for all (x, y) in S})
  phi = oracle.Left_Regression(0.0)
  # pass algorithm parameters in through Parameters object
  train_kwargs = Parameters({'phi': phi, 
                            'alpha': alpha, 
                            'noise_var': 1.0})
  # define truncated LASSO regression object
  trunc_lasso_reg = TruncatedLassoRegression(train_kwargs, store=store)
  # fit to dataset
  trunc_lasso_reg.fit(X, y)
  # close store 
  store.close()
  # make predictions with new regression
  print(trunc_lasso_reg.predict(X))

Methods: 
~~~~~~~~

* ``predict(X)``: predict regression points for input feature matrix X (num_samples by features)

TruncatedRidgeRegression:
--------------------------
``TruncatedRidgeRegression`` learns truncated ridge regression models when the noise 
variance is known or unknown. 

When fitting truncated ridge regression models, the user needs three things: an oracle, a ``Callable`` that 
indicates whether a sample falls within the truncation set; the model's ``alpha``, the survival probability; and the ``TruncatedRidgeRegression`` module. The ``TruncatedRidgeRegression`` module accepts 
a parameters object that the user can define for running the PSGD procedure.

Parameters:
-----------

* ``args`` (delphi.utils.Parameters): parameters object that holds hyperparameters for experiment. Possible hyperparameters include:

  * ``phi`` (Callable): required argument; callable class that receives a num_samples by 1 input ``torch.Tensor`` and returns a num_samples by 1 ``Tensor`` of ``0``/``1`` values indicating membership in ``S``.
  * ``alpha`` (float): required argument; survival probability for truncated regression
  * ``weight_decay`` (float): weight decay regularization
  * ``epochs`` (int): maximum number of times to iterate over dataset
  * ``noise_var`` (float): the noise variance, if the noise variance for the truncated regression model is known; otherwise the unknown variance procedure is run by default
  * ``fit_intercept`` (bool): whether to fit the intercept or not; default True
  * ``trials`` (int): maximum number of trials to perform PSGD; after trials, model with smallest loss on the dataset is returned
  * ``val`` (float): percentage of dataset to use for validation set; default .2
  * ``lr`` (float): initial learning rate to use for regression weights; default 1e-1
  * ``var_lr`` (float): initial learning rate to use for variance parameters when running the unknown variance procedure
  * ``step_lr`` (int): number of gradient steps to take before adjusting learning rate by value ``step_lr_gamma``; default 100
  * ``step_lr_gamma`` (float): amount to adjust learning rate, every ``step_lr`` steps ``new_lr = curr_lr * step_lr_gamma``
  * ``custom_lr_multiplier`` (str): `cosine` or `cyclic` for cosine annealing learning rate scheduling or cyclic learning rate scheduling; default None
  * ``momentum`` (float): momentum; default 0.0
  * ``adam`` (bool): use adam adaptive learning rate optimizer; default False
  * ``eps`` (float): epsilon denominator for gradients (ie. to prevent divide by zero calculations); default 1e-5
  * ``r`` (float): initial projection set radius; default 1.0
  * ``rate`` (float): at the end of each trial, the projection set radius is increased at rate `rate`; default 1.5
  * ``normalize`` (bool): our methods assume that :math:`\max_{i} \|x_{i}\|_{2} \leq 1`, so before running the procedure, you must divide the input features :math:`X = \{x_{(1)}, x_{(2)}, \ldots, x_{(n)}\}` by :math:`\max_{i} \|x_{i}\|_{2} \cdot \sqrt{k}`, where :math:`k` represents the number of dimensions the input features have; by default the procedure normalizes the features for the user
  * ``batch_size`` (int): the number of samples to use for each gradient step; default 50
  * ``tol`` (float): if using early stopping, threshold for when to stop; default 1e-3
  * ``workers`` (int): number of workers to use for procedure; default 1
  * ``num_samples`` (int): number of samples to sample from the distribution in the gradient for each sample in the batch (ie. if batch size is 10, and num_samples is 100, then each gradient step will sample 100 * 10 samples from a gaussian distribution); default 50
  * ``early_stopping`` (bool): whether to check loss for convergence; compares the best avg validation loss at the end of an epoch with the current avg epoch loss estimate; if :math:`best_loss - curr_loss < tol` for `n_iter_no_change` epochs, the procedure terminates; default False
  * ``n_iter_no_change`` (int): number of iterations to check for change before declaring convergence; default 5
  * ``verbose`` (bool): whether to print a verbose output with loss logs, etc.; default False

* ``store`` (cox.store.Store): logging object to keep track of the ridge regression's train and validation losses

Attributes:

* ``coef_`` (torch.Tensor): regression weight coefficients
* ``intercept_`` (torch.Tensor): regression intercept term
* ``variance_`` (torch.Tensor): if the noise variance is unknown, this property provides its estimate

In the following code block, we show an example of how to use the truncated ridge regression module with known noise variance:
   
.. code-block:: python
  
  from delphi.stats.truncated_ridge_regression import TruncatedRidgeRegression
  from delphi import oracle  
  from delphi.utils.helpers import Parameters
  from cox.store import Store

  OUT_DIR = 'PATH_TO_EXPERIMENT_LOGGING_DIRECTORY'
  store = Store(OUT_DIR)

  # left truncate ridge regression at 0 (ie. S = {y >= 0 for all (x, y) in S})
  phi = oracle.Left_Regression(0.0)
  # pass algorithm parameters in through Parameters object
  train_kwargs = Parameters({'phi': phi, 
                            'alpha': alpha, 
                            'weight_decay': .01,
                            'noise_var': 1.0})
  # define truncated ridge regression object
  trunc_ridge_reg = TruncatedRidgeRegression(train_kwargs, store=store)
  # fit to dataset
  trunc_ridge_reg.fit(X, y)
  # close store 
  store.close()
  # make predictions with new regression
  print(trunc_ridge_reg.predict(X))

Methods: 
~~~~~~~~

* ``predict(X)``: predict regression points for input feature matrix X (num_samples by features)

TruncatedElasticNetRegression:
------------------------------
``TruncatedElasticNetRegression`` learns truncated elastic net regression models when the noise 
variance is known or unknown. 

When fitting truncated elastic net regression models, the user needs three things: an oracle, a ``Callable`` that 
indicates whether a sample falls within the truncation set; the model's ``alpha``, the survival probability; and the ``TruncatedElasticNetRegression`` module. The ``TruncatedElasticNetRegression`` module accepts 
a parameters object that the user can define for running the PSGD procedure.

Parameters:
-----------

* ``args`` (delphi.utils.Parameters): parameters object that holds hyperparameters for experiment. Possible hyperparameters include:

  * ``phi`` (Callable): required argument; callable class that receives a num_samples by 1 input ``torch.Tensor`` and returns a num_samples by 1 ``Tensor`` of ``0``/``1`` values indicating membership in ``S``.
  * ``alpha`` (float): required argument; survival probability for truncated regression
  * ``weight_decay`` (float): weight decay regularization
  * ``l1`` (float): l1 regularization
  * ``epochs`` (int): maximum number of times to iterate over dataset
  * ``noise_var`` (float): the noise variance, if the noise variance for the truncated regression model is known; otherwise the unknown variance procedure is run by default
  * ``fit_intercept`` (bool): whether to fit the intercept or not; default True
  * ``trials`` (int): maximum number of trials to perform PSGD; after trials, model with smallest loss on the dataset is returned
  * ``val`` (float): percentage of dataset to use for validation set; default .2
  * ``lr`` (float): initial learning rate to use for regression weights; default 1e-1
  * ``var_lr`` (float): initial learning rate to use for variance parameters when running the unknown variance procedure
  * ``step_lr`` (int): number of gradient steps to take before adjusting learning rate by value ``step_lr_gamma``; default 100
  * ``step_lr_gamma`` (float): amount to adjust learning rate, every ``step_lr`` steps ``new_lr = curr_lr * step_lr_gamma``
  * ``custom_lr_multiplier`` (str): `cosine` or `cyclic` for cosine annealing learning rate scheduling or cyclic learning rate scheduling; default None
  * ``momentum`` (float): momentum; default 0.0
  * ``adam`` (bool): use adam adaptive learning rate optimizer; default False
  * ``eps`` (float): epsilon denominator for gradients (ie. to prevent divide by zero calculations); default 1e-5
  * ``r`` (float): initial projection set radius; default 1.0
  * ``rate`` (float): at the end of each trial, the projection set radius is increased at rate `rate`; default 1.5
  * ``normalize`` (bool): our methods assume that :math:`\max_{i} \|x_{i}\|_{2} \leq 1`, so before running the procedure, you must divide the input features :math:`X = \{x_{(1)}, x_{(2)}, \ldots, x_{(n)}\}` by :math:`\max_{i} \|x_{i}\|_{2} \cdot \sqrt{k}`, where :math:`k` represents the number of dimensions the input features have; by default the procedure normalizes the features for the user
  * ``batch_size`` (int): the number of samples to use for each gradient step; default 50
  * ``tol`` (float): if using early stopping, threshold for when to stop; default 1e-3
  * ``workers`` (int): number of workers to use for procedure; default 1
  * ``num_samples`` (int): number of samples to sample from the distribution in the gradient for each sample in the batch (ie. if batch size is 10, and num_samples is 100, then each gradient step will sample 100 * 10 samples from a gaussian distribution); default 50
  * ``early_stopping`` (bool): whether to check loss for convergence; compares the best avg validation loss at the end of an epoch with the current avg epoch loss estimate; if :math:`best_loss - curr_loss < tol` for `n_iter_no_change` epochs, the procedure terminates; default False
  * ``n_iter_no_change`` (int): number of iterations to check for change before declaring convergence; default 5
  * ``verbose`` (bool): whether to print a verbose output with loss logs, etc.; default False

* ``store`` (cox.store.Store): logging object to keep track of the elastic net regression's train and validation losses

Attributes:

* ``coef_`` (torch.Tensor): regression weight coefficients
* ``intercept_`` (torch.Tensor): regression intercept term
* ``variance_`` (torch.Tensor): if the noise variance is unknown, this property provides its estimate

In the following code block, we show an example of how to use the truncated elastic net regression module with known noise variance:
   
.. code-block:: python
  
  from delphi.stats.truncated_elastic_net_regression import TruncatedElasticNetRegression
  from delphi import oracle  
  from delphi.utils.helpers import Parameters
  from cox.store import Store

  OUT_DIR = 'PATH_TO_EXPERIMENT_LOGGING_DIRECTORY'
  store = Store(OUT_DIR)

  # left truncate elastic net regression at 0 (ie. S = {y >= 0 for all (x, y) in S})
  phi = oracle.Left_Regression(0.0)
  # pass algorithm parameters in through Parameters object
  train_kwargs = Parameters({'phi': phi, 
                            'alpha': alpha, 
                            'weight_decay': .01,
                            'noise_var': 1.0})
  # define truncated elastic net regression object
  trunc_elastic_reg = TruncatedElasticNetRegression(train_kwargs, store=store)
  # fit to dataset
  trunc_elastic_reg.fit(X, y)
  # close store 
  store.close()
  # make predictions with new regression
  print(trunc_elastic_reg.predict(X))

Methods: 
~~~~~~~~

* ``predict(X)``: predict regression points for input feature matrix X (num_samples by features)

TruncatedLogisticRegression:
----------------------------
``TruncatedLogisticRegression`` learns truncated logistic regression models by maximizing the truncated log likelihood.
The algorithm that we use for this procedure is described in the following
paper `A Theoretical and Practical Framework for Classification and Regression from Truncated Samples <https://proceedings.mlr.press/v108/ilyas20a.html>`_.

When fitting truncated logistic regression models, the user needs three things: an oracle, a ``Callable`` that 
indicates whether a sample falls within the truncation set; the model's ``alpha``, the survival probability; and the ``TruncatedLogisticRegression`` module. The ``TruncatedLogisticRegression`` module accepts 
a parameters object that the user can define for running the PSGD procedure. 

Parameters:
-----------

* ``args`` (delphi.utils.Parameters): parameters object that holds hyperparameters for experiment. Possible hyperparameters include:

  * ``phi`` (Callable): required argument; callable class that receives a num_samples by 1 input ``torch.Tensor`` and returns a num_samples by 1 ``Tensor`` of ``0``/``1`` values indicating membership in ``S``.
  * ``alpha`` (float): required argument; survival probability for truncated regression
  * ``epochs`` (int): maximum number of times to iterate over dataset
  * ``fit_intercept`` (bool): whether to fit the intercept or not; default to True
  * ``trials`` (int): maximum number of trials to perform PSGD; after trials, model with smallest loss on the dataset is returned
  * ``val`` (float): percentage of dataset to use for validation set; default .2
  * ``lr`` (float): initial learning rate to use for regression weights; default 1e-1
  * ``var_lr`` (float): initial learning rate to use for variance parameters when running the unknown variance procedure
  * ``step_lr`` (int): number of gradient steps to take before adjusting learning rate by value ``step_lr_gamma``; default 100
  * ``step_lr_gamma`` (float): amount to adjust learning rate, every ``step_lr`` steps ``new_lr = curr_lr * step_lr_gamma``
  * ``custom_lr_multiplier`` (str): `cosine` or `cyclic` for cosine annealing learning rate scheduling or cyclic learning rate scheduling; default None
  * ``momentum`` (float): momentum; default 0.0 
  * ``adam`` (bool): use adam adaptive learning rate optimizer; default False
  * ``eps`` (float): epsilon denominator for gradients (ie. to prevent divide by zero calculations); default 1e-5
  * ``r`` (float): initial projection set radius; default 1.0
  * ``rate`` (float): at the end of each trial, the projection set radius is increased at rate `rate`; default 1.5
  * ``normalize`` (bool): our methods assume that :math:`\max_{i} \|x_{i}\|_{2} \leq 1`, so before running the procedure, you must divide the input features :math:`X = \{x_{(1)}, x_{(2)}, \ldots, x_{(n)}\}` by :math:`\max_{i} \|x_{i}\|_{2} \cdot \sqrt{k}`, where :math:`k` represents the number of dimensions the input features have; by default the procedure normalizes the features for the user
  * ``batch_size`` (int): the number of samples to use for each gradient step; default 50
  * ``tol`` (float): if using early stopping, threshold for when to stop; default 1e-3
  * ``workers`` (int): number of workers to use for procedure; default 1
  * ``num_samples`` (int): number of samples to sample from the distribution in the gradient for each sample in the batch (ie. if batch size is 10, and num_samples is 100, then each gradient step will sample 100 * 10 samples from a gaussian distribution); default 50
  * ``early_stopping`` (bool): whether to check loss for convergence; compares the best avg validation loss at the end of an epoch with the current avg epoch loss estimate; if :math:`best_loss - curr_loss < tol` for `n_iter_no_change` epochs, the procedure terminates; default False
  * ``n_iter_no_change`` (int): number of iterations to check for change before declaring convergence; default 5
  * ``verbose`` (bool): whether to print a verbose output with loss logs, etc.; default False (otherwise just a tqdm progress bar is shown)

* ``store`` (cox.store.Store): logging object to keep track of the logistic regression's train and validation losses and accuracy

Attributes:

* ``coef_`` (torch.Tensor): regression weight coefficients
* ``intercept_`` (torch.Tensor): regression intercept term

In the following code block, we show an example of how to use the truncated logistic regression module:

.. code-block:: python

  from delphi.stats.truncated_logistic_regression import TruncatedLogisticRegression
  from delphi import oracle
  from delphi.utils.helpers import Parameters
  from cox.store import Store

  OUT_DIR = 'PATH_TO_EXPERIMENT_LOGGING_DIRECTORY'
  store = Store(OUT_DIR)

  # left truncate logistic regression at -.1 (ie. S = {z >= -.1 for all (x, y) in S})
  phi = oracle.Left_Regression(-0.1)
  # pass algorithm parameters in through Parameters object
  train_kwargs = Parameters({'phi': phi, 
                             'alpha': alpha})
  # define truncated logistic regression object
  trunc_log_reg = TruncatedLogisticRegression(train_kwargs, store=store)
  # fit to dataset
  trunc_log_reg.fit(X, y)
  # close store
  store.close()
  # make predictions with new regression
  print(trunc_log_reg.predict(X))

Methods:
~~~~~~~~

* ``predict(X)``: predict classification for input feature matrix X (num_samples by features)


TruncatedProbitRegression:
--------------------------
``TruncatedProbitRegression`` learns truncated probit regression models, by maximizing the truncated log likelihood.
The algorithm that we use for this procedure is described in the following
paper `A Theoretical and Practical Framework for Classification and Regression from Truncated Samples <https://proceedings.mlr.press/v108/ilyas20a.html>`_.

When fitting truncated probit regression models, the user needs three things: an oracle, a ``Callable`` that 
indicates whether a sample falls within the truncation set; the model's ``alpha``, the survival probability; and the ``TruncatedProbitRegression`` module. The ``TruncatedProbitRegression`` module accepts 
a parameters object that the user can define for running the PSGD procedure.

Parameters:
-----------

* ``args`` (delphi.utils.Parameters): parameters object that holds hyperparameters for experiment. Possible hyperparameters include:

  * ``phi`` (Callable): required argument; callable class that receives a num_samples by 1 input ``torch.Tensor`` and returns a num_samples by 1 ``Tensor`` of ``0``/``1`` values indicating membership in ``S``.
  * ``alpha`` (float): required argument; survival probability for truncated regression
  * ``epochs`` (int): maximum number of times to iterate over dataset
  * ``fit_intercept`` (bool): whether to fit the intercept or not; default to True
  * ``trials`` (int): maximum number of trials to perform PSGD; after trials, model with smallest loss on the dataset is returned
  * ``val`` (float): percentage of dataset to use for validation set; default .2
  * ``lr`` (float): initial learning rate to use for regression weights; default 1e-1
  * ``step_lr`` (int): number of gradient steps to take before adjusting learning rate by value ``step_lr_gamma``; default 100
  * ``step_lr_gamma`` (float): amount to adjust learning rate, every ``step_lr`` steps ``new_lr = curr_lr * step_lr_gamma``
  * ``custom_lr_multiplier`` (str): `cosine` or `cyclic` for cosine annealing learning rate scheduling or cyclic learning rate scheduling; default None
  * ``momentum`` (float): momentum; default 0.0 
  * ``adam`` (bool): use adam adaptive learning rate optimizer; default False
  * ``eps`` (float): epsilon denominator for gradients (ie. to prevent divide by zero calculations); default 1e-5
  * ``r`` (float): initial projection set radius; default 1.0
  * ``rate`` (float): at the end of each trial, the projection set radius is increased at rate `rate`; default 1.5
  * ``normalize`` (bool): our methods assume that :math:`\max_{i} \|x_{i}\|_{2} \leq 1`, so before running the procedure, you must divide the input features :math:`X = \{x_{(1)}, x_{(2)}, \ldots, x_{(n)}\}` by :math:`\max_{i} \|x_{i}\|_{2} \cdot \sqrt{k}`, where :math:`k` represents the number of dimensions the input features have; by default the procedure normalizes the features for the user
  * ``batch_size`` (int): the number of samples to use for each gradient step; default 50
  * ``tol`` (float): if using early stopping, threshold for when to stop; default 1e-3
  * ``workers`` (int): number of workers to use for procedure; default 1
  * ``num_samples`` (int): number of samples to sample from the distribution in the gradient for each sample in the batch (ie. if batch size is 10, and num_samples is 100, then each gradient step will sample 100 * 10 samples from a gaussian distribution); default 50
  * ``early_stopping`` (bool): whether to check loss for convergence; compares the best avg validation loss at the end of an epoch with the current avg epoch loss estimate; if :math:`best_loss - curr_loss < tol` for `n_iter_no_change` epochs, the procedure terminates; default False
  * ``n_iter_no_change`` (int): number of iterations to check for change before declaring convergence; default 5
  * ``verbose`` (bool): whether to print a verbose output with loss logs, etc.; default False 

* ``store`` (cox.store.Store): logging object to keep track of the probit regression's train and validation losses and accuracy

Attributes:

* ``coef_`` (torch.Tensor): regression weight coefficients
* ``intercept_`` (torch.Tensor): regression intercept term

In the following code block, we show an example of how to use the truncated probit regression module:

.. code-block:: python

  from delphi.stats.truncated_probit_regression import TruncatedProbitRegression
  from delphi import oracle
  from delphi.utils.helpers import Parameters
  from cox.store import Store

  OUT_DIR = 'PATH_TO_EXPERIMENT_LOGGING_DIRECTORY'
  store = Store(OUT_DIR)

  # left truncate probit regression at -0.1 (ie. S = {z >= -0.1 for all (x, y) in S})
  phi = oracle.Left_Regression(-0.1)
  # pass algorithm parameters in through Parameters object
  train_kwargs = Parameters({'phi': phi, 
                             'alpha': alpha})
  # define truncated probit regression object
  trunc_prob_reg = TruncatedProbitRegression(train_kwargs, store=store)
  # fit to dataset
  trunc_prob_reg.fit(X, y)
  # close store
  store.close()
  # make predictions with new regression
  print(trunc_prob_reg.predict(X))

Methods:
~~~~~~~~

* ``predict(X)``: predict classification for input feature matrix X (num_samples by features)

