Skip to main content

Perform automatic differentiation

Project description

Adifpy

Table of Contents

Introduction

This software is aimed at allowing users to evaluate and differentiate their function. This is a powerful tool that will allow users to find solutions to complex derivations and visualize their results. It serves as an efficient way to take derivatives when compared to other solutions such as symbolic differentiation.

Applications are widespread, ranging from graphing simple functions to taking the derivative of complex, high dimension functions that can have widespread uses such as in optimization problems, machine learning, and data analysis.

Background

Traditional methods for differentiation include symbolic differentiation and numerical differentiation. Each of these techniques brings its own challenges when used for computational science - symbolic differentiation requires converting complex computer programs into simple components and often results in complex and cryptic expressions, while numerical differentiation is susceptible to floating point and rounding errors.

Automatic differentiation (AD) solves these problems: any mathematical function (for which a derivative is needed) can be broken down into a series of constituent elementary (binary and unary) operations, executed in a specific order on a predetermined set of inputs. A technique for visualizing the sequence of operations corresponding to the function is the computational graph, with nodes representing intermediate variables and lines leaving from nodes representing operations used on intermediate variables. AD combines the known derivatives of the constituent elementary operations (e.g. arithmetic and transcendental functions) via the chain rule to find the derivative of the overall composition.

For example, for the hypothetical function equation, where equation all represent elementary operations, we can pose equation. The desired output is equation, and by the chain rule and simple derivatives, we obtain:

Our implementation of AD uses dual numbers to calculate derivatives of individual components. Dual numbers have real and dual components, taking the form equation and equation and where a and b are real. By the Taylor series expansion of a function around a point, notice that evaluating a function at equation yields:

Hence, by evaluating the function at the desired point equation, the outputted real and dual components are the function evaluated at a and derivative of the function evaluated at a respectively. This is an efficient way of calculating requisite derivatives.

How to Use

First, ensure that you are using Python 3.10 or newer. All future steps can/should be completed in a virtual environment so as not to pollute your base Python installation. To create and activate a new virtual environment, use the following:

python3 -m venv [desired/path/to/venv]
source [desired/path/to/venv]/bin/activate

Next, clone the package from this GitHub repository and install the needed dependencies and the package:

git clone https://code.harvard.edu/CS107/team33.git
python3 -m pip install -r requirements.txt
python3 -m pip install .

Now, you're ready to use the package. Continue to the Example to test our the package!

Example

First, import the package in your python code:

import Adifpy

and create an Evaluator object, which takes a callable function as an argument:

evaluator = Adifpy.Evaluator(lambda x : x**2)

Next, we want to find the value and derivative of the function at a point (currently, only scalar functions with 1 input and 1 output are supported). We can use the Evaluator's eval function, passing in the point at which you want to evaluate (and optionally, a scalar seed vector):

output = evaluator.eval(3)

This function returns a tuple, in the form (value, derivative), where the value is the evaluation of the function at that point (in this case, 9) and the derivative is the derivative of the function at that point (in this case, 6).

Additionally a seed vector (for now, only scalars such as type int or float are supported) can be passed to take the derivative with respect to a different seed vector. For example, if you want to take the derivative with respect to a seed vector of 2 you could call the following:

output2 = evaluator.eval(3, seed_vector=2)

which would return (9,12) (since the directional derivative is in the same direction, with twice the magnitude).

Software Organization

The following section outlines our plans for organizing the package directory, sub-packages, modules, classes, and deployment.

Directory Structure

adifpy/
├── docs
│   └── milestone1
│   └── milestone2
│   └── milestone2_progress
│   └── documentation
├── LICENSE
├── README.md
├── requirements.txt
├── pyproject.toml
├── Adifpy
│    ├── differentiate
│    │   └── dual_number.py
│    │   └── evaluator.py  
│    │   └── forward_mode.py
│    │   └── function_tree.py  
│    │   └── reverse_mode.py
│    ├── visualize
│    │   └── graph_function.py  
│    ├── test 
│    │   └── README.md
│    │   └── run_tests.sh
│    │   └── test_dual_number.py
│    │   └── ... (unit and integration tests)
│    ├── __init__.py
│    └── config.py
└── .github
    └── workflows
      └── coverage.yaml
      └── test.yaml
 

Subpackages

The Adifpy directory contains the source code of our package, which contains 3 subpackages: differentiate, visualize, and test, described below.

Differentiate

The differentiate subpackage currently contains modules required to perform forward mode AD on functions from R to R. Contained in this subpackage are the modules dual_number.py, elementary_functions.py, evaluator.py, forward_mode.py, function_tree.py, and reverse_mode.py. For more information on each module, see Modules and Classes.

Visualize

This subpackage has not been implemented yet. Check out our implementation plan below.

Test

The test suite is contained in the test sub-package, as shown above in the Directory Structure. The test directory contains a run_tests.sh, which installs the package and runs the relevant pytest commands to display data on the testing suite (similar to the CI workflows).

The individual test files, each of which are named in the test_*.py format, test a different aspect of the package. Within each file, each function (also named test_*) tests a smaller detail of that aspect. For example, the test_dual_number.py test module tests the implementation of the DualNumber class. Each function in that module tests one of the overloaded operators. Thus, error messaging will be comprehensive, should one of these operators be changed and fail to work.

The easiest way to run the test suite is to go to the test directory and run ./run_tests.sh.

Implementation

Major data structures, including descriptions on how dual numbers are implemented, are described in the Modules and Classes section below.

Libraries

The differentiate sub-package requires the NumPy library. Additionally, the visualization sub-package will require MatPlotLib for displaying graphs. Additional libraries may be required later for additional ease of computation or visualization.

These requirements are specified in the requirements.txt for easy installation.

Modules and Classes

dual_number.py

the DualNumber class, stored in this module, contains the functionality for dual numbers for automatic differentiation. When a forward pass (in forward mode) is performed on a user function, a DualNumber object is passed to mimic the function's numeric or vector input. All of DualNumber's major mathematical dunder methods are overloaded so that the DualNumber is updated for each of the function's elementary operations.

Each of the binary dunder methods (addition, division, etc.) work with both other numeric types (integers and floats) as well as other DualNumbers.

evaluator.py

The Evaluator class, stored in this module, is the user's main communication with the package. An Evaluator object is defined by its function, which is provided by the user on creation. A derivative can be calculated at any point, with any seed vector, by calling an Evaluator's eval function. The Evaluator class ensures that a user's function is valid, decides whether to use forward or reverse mode (based on performance), and returns the derivative on eval calls.

When reverse mode is implemented, the Evaluator class may also contain optimizations for making future eval calls faster by storing a computational graph.

forward_mode.py

This module contains only the forward_mode method, which takes a user function, evaluation point, and seed vector. Its implementation is incredibly simple: a DualNumber is created with the real part as the evaluation point and the dual part as the seed vector. This DualNumber is then passed through the user's function, and the resulting real and dual components of the output DualNumber are the function output and derivative.

function_tree.py

The FunctionTree class, stored in this module, is a representation of a computational graph in the form of a tree, where intermediate variables are stored as nodes. The parent-child relationship between these nodes represents the elementary operations for these intermediate variables. This class contains optimizations like ensuring duplicate nodes are avoided.

This module is currently unused (and un-implemented). When reverse mode is implemented, a given Evaluator object will build up and store a FunctionTree for optimization.

reverse_mode.py

This module contains only the reverse_mode method, which takes the same arguments as forward_pass. This function is not yet implemented.

graph_tree.py

This module will contain functionality for displaying a presentable representation of a computation graph in an image. Using a FunctionTree object, the resulting image will be a tree-like structure with nodes and connections representing intermediate variables and elementary operations. This functionality is not yet implemented.

graph_function.py

This module will contain functionality for graphing a function and its derivative. It will create an Evaluator object and make the necessary eval calls to fill a graph for display. This functionality is not yet implemented.

Elementary Functions

Many elementary functions like trigonometric, inverse trigonometric and exponential cannot be overloaded by Python's dunder methods (like addition and subtraction can). However, a user must still be able to use these operators in their functions, but cannot use the standard math or np versions, since a DualNumber object is passed to the function for forward passes.

Thus, we define a module elementary_functions.py that contains methods which take a DualNumber, and return a DualNumber, with the real part equal to the elementary operation applied to the real part, and the derivative of the operation applied to the dual part. Thus, these functions are essentially our package's storage for the common derivatives (cosine is the derivative of sine, etc.), where the storage of the derivative is the assignment of the dual part of the output of these elementary operations.

These operations will be automatically imported in the package's __init__.py so that users can simply call Adifpy.sin() or Adifpy.cos() (for this milestone our implementation requires users to call ef.sin() and ef.cos(), not Adifpy.sin() or Adifpy.cos()), as they would with np.sin() and np.cos().

Extension

Now that our forward mode implementation is complete, we will move on to implement additional features and conveniences for the user.

Reverse Mode

We will implement reverse mode AD in the differentiate subpackage. Given that we have already been quizzed on the background math, encoding this process should not be too onerous. One of the biggest challenges that we foresee is determining when it is best to use Reverse Mode and when it is best to use Forward Mode. Obviously, it is better to use forward mode when there are far more outputs than inputs and vice-versa for reverse mode, but in the cases where number of inputs and outputs are similar it is not so simple. To address this we will do a series of practical tests on functions of different dimensions, and manually encode the most efficient results into evaluator.py.

Visualization

We are planning on creating a visualization tool with MatPlotLib that can plot the computational graph (calculated in Reverse Mode) of simple functions that are being differentiated. Obviously, the computational graph of very complex functions with many different inputs and outputs can be impractical to represent on a screen, so one of the biggest challenges that we will face is to have our program able determine when it can produce a visual tool that can be easily rendered, and when it cannot.

Impact

Future

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

adifpy-0.0.3.tar.gz (2.0 MB view details)

Uploaded Source

Built Distribution

adifpy-0.0.3-py3-none-any.whl (105.4 kB view details)

Uploaded Python 3

File details

Details for the file adifpy-0.0.3.tar.gz.

File metadata

  • Download URL: adifpy-0.0.3.tar.gz
  • Upload date:
  • Size: 2.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.4

File hashes

Hashes for adifpy-0.0.3.tar.gz
Algorithm Hash digest
SHA256 66852c7d4902b826ac46b1a76b342360d2f8e8195de0920c61f2056f082ae666
MD5 34c31b808653f2f2956592c9f203e3c5
BLAKE2b-256 faa14cdcada4cb200dc3d2f57926801180477667c24f33311b8b246a2f74df4f

See more details on using hashes here.

File details

Details for the file adifpy-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: adifpy-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 105.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.4

File hashes

Hashes for adifpy-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 b49bba56acbeadbdb5210164a6824889428167178417b29e26af0cd62df3f2d2
MD5 673782dffa593932ee697bdd3e76e6df
BLAKE2b-256 195af4bc18f17dc52a4d3c4ad14b2d5f366332e6054d42e8d5a0853e21980529

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page