Quantify the difference between two arbitrary curves in space

## similaritymeasures

Curves in this case are:

• discretized by individual data points

• ordered from a beginning to an ending
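To make the two properties concrete, here is a minimal pure-Python sketch that stores a curve as an ordered list of points and accumulates the arc length from its beginning. The helper name `cumulative_arc_length` is hypothetical and is not part of this library.

```python
import math


def cumulative_arc_length(curve):
    """Return the arc length from the first point to each point of a curve.

    `curve` is an ordered sequence of points; each point is a sequence of
    coordinates. (Hypothetical helper, for illustration only.)
    """
    lengths = [0.0]
    for prev, curr in zip(curve, curve[1:]):
        # Add the straight-line distance between consecutive points.
        lengths.append(lengths[-1] + math.dist(prev, curr))
    return lengths


# The first point is the beginning of the curve, the last point is the ending.
curve = [(0.0, 0.0), (3.0, 4.0), (3.0, 6.0)]
print(cumulative_arc_length(curve))        # [0.0, 5.0, 7.0]

# Reversing the order describes the same points but a different curve.
print(cumulative_arc_length(curve[::-1]))  # [0.0, 2.0, 7.0]
```

The ordering matters: the same set of points traversed in the opposite direction gives a different arc-length profile.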

Consider two curves, one Experimental and one Numerical. We want to quantify how different the Numerical curve is from the Experimental curve. Note that the two curves share no concurrent Stress or Strain values. Additionally, one curve has more data points than the other.

In the ideal case the Numerical curve would match the Experimental curve exactly. This means that the two curves would appear directly on top of each other. Our measures of similarity would return a zero distance between two curves that were on top of each other.
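As a rough illustration of the zero-distance property, the sketch below uses a simple pointwise mean absolute error written in pure Python. It assumes the error is averaged over every coordinate of every point, which may differ from how the library computes it, and the function name `mean_absolute_error` is hypothetical.

```python
def mean_absolute_error(curve_a, curve_b):
    """Pointwise mean absolute error between two equally sized curves.

    Assumption: the error is averaged over all coordinates of all points.
    """
    if len(curve_a) != len(curve_b):
        raise ValueError("curves must have the same number of data points")
    diffs = [
        abs(a - b)
        for p, q in zip(curve_a, curve_b)
        for a, b in zip(p, q)
    ]
    return sum(diffs) / len(diffs)


experimental = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
numerical_perfect = list(experimental)                       # lies on top
numerical_offset = [(x, y + 0.5) for x, y in experimental]   # shifted up

print(mean_absolute_error(experimental, numerical_perfect))  # 0.0
print(mean_absolute_error(experimental, numerical_offset))   # 0.25
```

A pointwise error like this one cannot handle curves with different numbers of data points, which is the situation most of the other measures are designed for.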

## Methods covered

This library includes the following methods to quantify the difference (or similarity) between two curves:

• Partial Curve Mapping^x (PCM) method: Matches the area of a subset between the two curves

• Area method^x: An algorithm for calculating the area between two curves in 2D space

• Discrete Frechet distance^y: The shortest distance between two curves, where you are allowed to vary the speed at which you travel along each curve independently (walking dog problem) [3, 4, 5, 6, 7, 8]

• Curve Length^x method: Assumes that the only true independent variable of the curves is the arc-length distance along the curve from the origin [9, 10]

• Dynamic Time Warping^y (DTW): A non-metric distance between two time-series curves that has proven useful for a variety of applications [11, 12, 13, 14, 15, 16]

• Mean absolute error^y,z (MAE): An L1 error that requires curves to have the same number of data points and dimensions. See the Wikipedia article on mean absolute error.

• Mean squared error^y,z (MSE): An L2 error that requires curves to have the same number of data points and dimensions. See the Wikipedia article on mean squared error.

^x denotes methods created specifically for material parameter identification

^y denotes that the method implemented in this library supports N-D data!

^z denotes that the method requires each curve to have the same number of data points
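To give a feel for how one of these measures works, the following is a minimal pure-Python sketch of the discrete Frechet distance using the usual dynamic-programming recurrence. It is an illustration only and is not the implementation behind `similaritymeasures.frechet_dist`; the name `discrete_frechet` is hypothetical.

```python
import math


def discrete_frechet(p, q):
    """Discrete Frechet distance between two ordered point sequences.

    ca[i][j] holds the smallest achievable "leash length" for walking the
    curves up to p[i] and q[j] without ever moving backwards.
    """
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_previous = min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1])
                ca[i][j] = max(best_previous, d)
    return ca[-1][-1]


# Two curves with different numbers of data points.
curve_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
curve_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 1.0)]
print(discrete_frechet(curve_a, curve_b))  # about 1.414: the end points are sqrt(2) apart

# The recurrence only uses point-to-point distances, so N-D points also work.
curve_3d = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
print(discrete_frechet(curve_3d, curve_3d))  # 0.0
```

Because both walkers must finish at the last point of their curve, the result here is set by the distance between the two end points.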

## Installation

Install with pip:

python -m pip install similaritymeasures

Or clone and install from source:

git clone https://github.com/cjekel/similarity_measures
python -m pip install ./similarity_measures

## Example usage

The following example shows how to compute each of the similarity measures:

import numpy as np
import similaritymeasures
import matplotlib.pyplot as plt

# Generate random experimental data
x = np.random.random(100)
y = np.random.random(100)
exp_data = np.zeros((100, 2))
exp_data[:, 0] = x
exp_data[:, 1] = y

# Generate random numerical data
x = np.random.random(100)
y = np.random.random(100)
num_data = np.zeros((100, 2))
num_data[:, 0] = x
num_data[:, 1] = y

# quantify the difference between the two curves using PCM
pcm = similaritymeasures.pcm(exp_data, num_data)

# quantify the difference between the two curves using
# Discrete Frechet distance
df = similaritymeasures.frechet_dist(exp_data, num_data)

# quantify the difference between the two curves using
# area between two curves
area = similaritymeasures.area_between_two_curves(exp_data, num_data)

# quantify the difference between the two curves using
# Curve Length based similarity measure
cl = similaritymeasures.curve_length_measure(exp_data, num_data)

# quantify the difference between the two curves using
# Dynamic Time Warping distance
dtw, d = similaritymeasures.dtw(exp_data, num_data)

# mean absolute error
mae = similaritymeasures.mae(exp_data, num_data)

# mean squared error
mse = similaritymeasures.mse(exp_data, num_data)

# print the results
print(pcm, df, area, cl, dtw, mae, mse)

# plot the data
plt.figure()
plt.plot(exp_data[:, 0], exp_data[:, 1])
plt.plot(num_data[:, 0], num_data[:, 1])
plt.show()

If you are interested in setting up an optimization problem using these measures, check out this Jupyter Notebook which replicates Section 3.2 from .
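As a toy illustration of the idea (not a reproduction of the notebook), the sketch below picks a model parameter by minimizing a similarity measure with a coarse grid search. Everything in it is an assumption made for the example: the straight-line `model`, the candidate slopes, and the compact pure-Python `dtw_distance`, which returns only a cumulative Dynamic Time Warping cost and is not the library's `dtw` function.

```python
import math


def dtw_distance(p, q):
    """Cumulative DTW cost between two ordered point sequences (sketch)."""
    n, m = len(p), len(q)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(p[i - 1], q[j - 1])
            # Extend the cheapest of the three allowed warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def model(slope, n_points):
    """Hypothetical numerical model: the line y = slope * x on [0, 1]."""
    xs = [k / (n_points - 1) for k in range(n_points)]
    return [(x, slope * x) for x in xs]


# Stand-in "experimental" curve: slope 2.0, sampled with 15 points.
experimental = model(2.0, 15)

# Candidate parameters; the numerical curves use only 10 points each.
candidates = [1.0 + 0.5 * k for k in range(5)]  # 1.0, 1.5, 2.0, 2.5, 3.0

best = min(candidates, key=lambda s: dtw_distance(experimental, model(s, 10)))
print(best)  # 2.0
```

In a real identification problem the grid search would typically be replaced by a proper optimizer, and the objective could be any of the measures listed above.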

## Changelog

Version 0.3.0: Frechet distance now supports N-D data! See CHANGELOG.md for full details.

## Documentation

Each function includes a descriptive docstring, which you can view online here.

## Contributions welcome!

This is by no means a complete list of all possible similarity measures. For instance, the SciPy Hausdorff distance is an alternative similarity measure that is useful if you don’t know the beginning and ending of each curve. There are many more possible functions out there. Feel free to send PRs for other functions from the literature!
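To show why a Hausdorff-type distance does not need a known beginning and ending, here is a small pure-Python sketch of the symmetric Hausdorff distance. The helper names are hypothetical; in practice SciPy provides `scipy.spatial.distance.directed_hausdorff` for the directed version.

```python
import math


def directed_distance(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets (sketch)."""
    return max(directed_distance(a, b), directed_distance(b, a))


curve_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
curve_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 1.0)]

forward = hausdorff(curve_a, curve_b)
reverse = hausdorff(curve_a, curve_b[::-1])  # same points, opposite ordering

# The points are treated as an unordered set, so the ordering has no effect.
print(forward == reverse)  # True
```

The ordered measures in this library, by contrast, generally change when one curve is traversed in the opposite direction.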

Requirements for adding a new method to this library:

• all methods should be able to quantify the difference between two curves

• the method must support the case where each curve may have a different number of data points

• follow the style of existing functions

• include a reference to the method details, or a descriptive docstring of the method

• include test(s) for your new method

• keep Python dependencies to a minimum (try to stick to SciPy/numpy functions if possible)

If you’ve found this information or library helpful please cite the following paper. You should also cite the papers of any methods that you have used.

Jekel, C. F., Venter, G., Venter, M. P., Stander, N., & Haftka, R. T. (2018). Similarity measures for identifying material parameters from hysteresis loops using inverse analysis. International Journal of Material Forming. https://doi.org/10.1007/s12289-018-1421-8

@article{Jekel2019,
author = {Jekel, Charles F and Venter, Gerhard and Venter, Martin P and Stander, Nielen and Haftka, Raphael T},
doi = {10.1007/s12289-018-1421-8},
issn = {1960-6214},
journal = {International Journal of Material Forming},
month = {may},
title = {{Similarity measures for identifying material parameters from hysteresis loops using inverse analysis}},
url = {https://doi.org/10.1007/s12289-018-1421-8},
year = {2019}
}
