Skip to main content

Python package for Mathematical Statistics with a View Toward Machine Learning, by John Myers

Project description

Utility functions for "Mathematical Statistics with a View Toward Machine Learning", by John Myers

A Python package for all utility functions used in the book and programming assignments. To install, do the usual:

pip install math_stats_ml

All other materials are contained here.

Table of contents:

  1. gd submodule: Gradient descent utilities

gd submodule: Gradient descent utilities

Contains all utilities for the gradient descent algorithms used in the book.

GD_output class: Container class for output of algorithms

class GD_output

A class holding the outputs of both the gradient descent (GD) and stochastic gradient descent (SGD) algorithms. All attributes below are optional and default to None.

Attributes

Name Type Description
parameters dict A dictionary containing the parameters of the objective function passed to either GD or SGD. Each value in the dictionary is a tensor whose zero-th dimension indexes the number of gradient steps.
per_step_objectives torch.Tensor A tensor containing the running objective values, per gradient step.
per_epoch_objectives torch.Tensor A tensor containing the running mean objective values, per epoch.
epoch_step_nums torch.Tensor A tensor containing the number of each gradient step on which an epoch begins/ends.
grad_steps iter An iterable ranging from $0$ to one less than the total number of gradient steps. (This is convenient for plotting purposes.)
lr float Learning rate.
num_steps int Number of gradient steps to run the gradient descent (GD) algorithm.
decay_rate float Learning rate decay.
beta1 float Hyperparameter for ADAM optimization algorithm.
beta2 float Hyperparameter for ADAM optimization algorithm.
batch_size int Mini-batch size for the stochastic gradient descent (SGD) algorithm.
num_epochs int Number of epochs for the stochastic gradient descent (SGD) algorithm.
max_steps int Maximum number of gradient steps after which we terminate the stochastic gradient descent (SGD) algorithm.
type_flag str Either gd, sgd, or adam indicating the optimization algorithm.

GD function: Gradient descent

GD(J, init_parameters, lr, num_steps, decay_rate=0)

Implementation of gradient descent. The notation below is intended to match the notation in the description in the book.

Output

The output type is an object of the GD_output class.

Parameters

Name Type Description
J function Objective function to be minimized. The parameters of the function are either a single tensor or a dictionary of tensors (in the case that the parameters fall into natural groups, e.g., weights and biases).
init_parameters torch.Tensor or dict Initial parameters.
lr float Learning rate, corresponding to $\alpha$ in the book.
num_steps int The number of gradient steps after which the algorithm should halt, corresponding to $N$ in the book.
decay_rate float Learning rate decay, corresponding to $\beta$ in the book. Defaults to 0.

SGD function: Stochastic gradient descent

SGD(L, init_parameters, X, lr, batch_size, num_epochs, y=None, kind='sgd', beta1=0.9, beta2=0.999, epsilon=1e-8, decay_rate=0, max_steps=-1, shuffle=True, random_state=None)

Implementation of both the vanilla stochastic gradient descent algorithm, and the ADAM optimization algorithm. The notation and terminology below is intended to match the book.

Output

The output type is an object of the GD_output class.

Parameters

Name Type Description
L function Loss function for the algorithm. The call signature of the function is of the form L(parameters, x) or L(parameters, x, y), where x is a feature vector and y is an (optional) ground truth label of a single instance in the dataset, and parameters is either a single parameter tensor or a dictionary of parameter tensors (in the case that the parameters fall into natural groups, e.g., weights and biases). We assume that L is "vectorized," so that it may accept a design matrix X in place of x and an entire vector of ground truth labels for y.
init_parameters torch.Tensor or dict Initial parameters.
X torch.Tensor Design matrix. The rows are the feature vectors that are fed into the loss function L.
lr float Learning rate, corresponding to $\alpha$ in the book.
batch_size int Mini-batch size, corresponding to $k$ in the book.
num_epochs int The number of epochs after which the algorithm should halt, corresponding to $N$ in the book.
y torch.Tensor Vector of ground truth labels for the data in the design matrix X. Optional, defaults to None.
kind str Type of optimization algorithm. Either sgd (default) for vanilla stochastic gradient descent, or adam for the ADAM optimization algorithm.
beta1 float Hyperparameter for the ADAM optimization algorithm. Defaults to 0.9.
beta2 float Hyperparameter for the ADAM optimization algorithm. Defaults to 0.999.
epsilon float Hyperparameter for the ADAM optimization algorithm. Defaults to 1e-8.
decay_rate float Learning rate decay, corresponding to $\beta$ in the book. Defaults to 0.
max_steps int Maximum number of gradient steps after which the algorithm should halt. Defaults to -1, in which case the algorithm will complete all num_epochs many epochs.
shuffle bool Determines whether to shuffle the dataset before looping through an epoch. Defaults to True.
random_state int If not None and shuffle=True, random seed to be passed to torch.manual_seed. Defaults to None.

plot_gd function: plot the output of gradient descent

plot_gd( gd_output, log=False, w=5, h=4, plot_title=True, plot_title_string="gradient descent", parameter_title=True, show_step=True, show_epoch=True, show_xlabel=True, xlabel="gradient steps", show_ylabel=True, ylabel="objective", legend=False, per_step_alpha=0.25, per_step_color=None, per_step_label=None, per_epoch_color=None, per_epoch_label=None, ax=None)

Descriptions coming later...

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

math_stats_ml-0.0.18.tar.gz (10.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

math_stats_ml-0.0.18-py3-none-any.whl (10.7 kB view details)

Uploaded Python 3

File details

Details for the file math_stats_ml-0.0.18.tar.gz.

File metadata

  • Download URL: math_stats_ml-0.0.18.tar.gz
  • Upload date:
  • Size: 10.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.11.5

File hashes

Hashes for math_stats_ml-0.0.18.tar.gz
Algorithm Hash digest
SHA256 3529733069949d211f74cde156d7b660b6b79294a4b7ac4244c16fe4fc8b8334
MD5 6ee961d7cd047dcfc55539900e82e731
BLAKE2b-256 4c73b89a985a0f35825468b8f62af322e3f73c6d268e26b164dd9346cd73cb00

See more details on using hashes here.

File details

Details for the file math_stats_ml-0.0.18-py3-none-any.whl.

File metadata

  • Download URL: math_stats_ml-0.0.18-py3-none-any.whl
  • Upload date:
  • Size: 10.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.11.5

File hashes

Hashes for math_stats_ml-0.0.18-py3-none-any.whl
Algorithm Hash digest
SHA256 90f00c0e29c0a2c4c302068cb47d15659bf25ebfd9bceab87198dfd9b0e024ae
MD5 fa9a0620cb627272197250f7b82d0fdb
BLAKE2b-256 fe5cd65358efbd1090bcf2497ebe5381f8503501af6fe2ef6989a332a88ad952

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page