Skip to main content

A Python library for doing curve matching with Fréchet distance and Procrustes analysis

Project description

Curve Matcher Python

ci PyPI license

A Python library for doing curve matching with Fréchet distance and Procrustes analysis, ported from the Javascript curve-matcher library.

Installation

Curve matcher can be installed via pip:

pip install curve-matcher

Getting started

The core of curve-matcher is a function called shape_similarity which estimates how similar the shapes of 2 curves are to each other, returning a value between 0 and 1.

shape_similarity example curves

Curves are defined as lists of points of x and y like below:

from curve_matcher import Point
curve = [Point(x=2, y=1.5), Point(x=4, y=3), ... ]

calculating similarity between 2 curves is as simple as calling:

from curve_matcher import shape_similarity

# 1 means identical shape, 0 means very different shapes
similarity = shape_similarity(curve1, curve2)

shape_similarity automatically adjusts for rotation, scale, and translation differences between so it doesn't matter if the curves are different sizes or in different locations on the screen - as long as they have the same shape the similarity score will be close to 1.

You can further customize the accuracy of the shape_similarity function by changing estimation_points (default 50) and rotations (default 10). Increasing these will improve accuracy, but the function will take longer to run.

# higher accuracy, but slower
shape_similarity(curve1, curve2, estimation_points=200, rotations=30)

# lower accuracy, but faster
shape_similarity(curve1, curve2, estimation_points=10, rotations=0)

You can also restrict the range of rotations that are checked using the restrict_rotation_angle option. This option means the shape_similarity function will only check rotations within +- restrict_rotation_angle radians. If you'd like to disable rotation correction entirely, you can set check_rotations=False. These are shown below:

# Only check rotations between -0.1 π to 0.1 π
shape_similarity(curve1, curve2, restrict_rotation_angle=0.1 * math.pi)

# disable rotation correction entirely
shape_similarity(curve1, curve2, check_rotations=False)

How it works

Internally, shape_similarity works by first normalizing the curves using Procrustes analysis and then calculating Fréchet distance between the curves.

Procrustes analysis attempts to translate both the curves to the origin and adjust their scale so they're the same size. Then, it rotates the curves so their rotations are as close as possible.

In practice, Procrustes analysis has 2 issues which curve-matcher works to address. First, it's very dependent on how the points of the curve are spaced apart from each other. To account for this, shape_similarity first redraws each curve using 50 (by default) points equally spaced out along the length of the curve. In addition, Procrustes analysis sometimes doesn't choose the best rotation if curves are not that similar to each other, so shape_similarity also tries 10 (by default) equally spaced rotations to make sure it picks the best possible rotation normalization. You can adjust these parameters via the estimation_points and rotations options to shape_similarity.

If you'd like to implement your own version of shape_similarity there's a number of helper methods that are exported by curve-matcher which you can use as well, discussed below:

Fréchet distance

Curve matcher includes an implemention of a discreet Fréchet distance algorithm from the paper Computing Discrete Fréchet Distance. You can use this function by passing in 2 curves, as below:

from curve_matcher import frechet_distance

dist = frechet_distance(curve1, curve2)

As with shape_similarity, curves are in the format [Point(x=2, y=1.5), Point(x=4, y=3), ... ].

A caveat of discreet Fréchet distance is that the calculation is only as accurate as the length of the line segments of the curves. That means, if curves have long distances between each of the points in the curve, or if there's not many points in the curve, the calculation may be inaccurate. To help alleviate this, Curve matcher provides a helper method called subdivide_curve which takes a curve and splits up line segments in the curve to improve the accuracy of the Fréchet distance calculation. This can be used as below:

from curve_matcher import frechet_distance, subdivide_curve

# subdivide the curves so each segment is at most length 0.5
dividedCurve1 = subdivide_curve(curve1, maxLen=0.5)
dividedCurve2 = subdivide_curve(curve2, maxLen=0.5)

# now, the frechet distance is guaranteed to be at most off by 0.5
dist = frechet_distance(dividedCurve1, dividedCurve2)

Procrustes analysis

Curve matcher also exports a few methods to help with Procrustes analysis. However, before running these it's recommended that curves be rebalanced so that the points of the curve are all equally spaced along its length. This can be done with a function called rebalance_curve as below:

from curve_matcher import rebalance_curve

# redraw the curve using 50 equally spaced points
balanced_curve = rebalance_curve(curve, numPoints=50)

Then, to normalize scale and translation, pass the curve into procrustes_normalize_curve as below:

from curve_matcher import procrustes_normalize_curve, rebalance_curve

balanced_curve = rebalance_curve(curve)
scaledAndTranslatedCurve = procrustes_normalize_curve(balanced_curve)

There's also a function provided called procrustes_normalize_rotation to help normalize rotation using Procrustes analysis. It should be noted that this may give odd results if the 2 curves don't have a relatively similar shape to each other. Make sure that the curves are already rebalanced and have scale and translation normalized before using this function. This function can be used as below:

from curve_matcher import (
  procrustes_normalize_curve,
  procrustes_normalize_rotation,
  rebalance_curve,
)

# first rebalance and normalize scale and translation of the curves
normalizedCurve1 = procrustes_normalize_curve(rebalance_curve(curve1))
normalizedCurve2 = procrustes_normalize_curve(rebalance_curve(curve2))

# rotate normalizedCurve1 to match normalizedCurve2
rotatedCurve1 = procrustes_normalize_rotation(
  normalizedCurve1,
  normalizedCurve2
)

You can read more about these algorithms here: https://en.wikipedia.org/wiki/Procrustes_analysis

License

Curve matcher is released under a MIT License.

Contributing

Contributions are welcome! These steps will guide you through contributing to this project:

  • Fork the repo
  • Clone it and install dependencies
    git clone https://github.com/90HH/curve-matcher-py
    pdm install
    

Make and commit your changes. This project uses PDM for dependency management, Ruff for linting / formatting, and Pyright for type checking. Make sure any changes pass linting and type-checking by running pdm run pyright and pdm run ruff . before committing.

Finally send a GitHub Pull Request with a clear list of what you've done. Make sure all of your commits are atomic (one feature per commit). Please add tests for any features that you add or change.

Sources

This library is a port of the Javascript curve-matcher library, which was extracted from stroke matching code in Hanzi Writer.

Procrustes analysis algorithms are from https://en.wikipedia.org/wiki/Procrustes_analysis

Discrete Fréchet distance algorithm is from Computing Discrete Fréchet Distance

Happy curve matching!

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

curve_matcher-0.2.0.tar.gz (13.8 kB view details)

Uploaded Source

Built Distribution

curve_matcher-0.2.0-py3-none-any.whl (9.0 kB view details)

Uploaded Python 3

File details

Details for the file curve_matcher-0.2.0.tar.gz.

File metadata

  • Download URL: curve_matcher-0.2.0.tar.gz
  • Upload date:
  • Size: 13.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.0.0 CPython/3.12.2

File hashes

Hashes for curve_matcher-0.2.0.tar.gz
Algorithm Hash digest
SHA256 8756f236ad1b66f2a5acfc6b1b3f67adc07ea280ecdb02d64caccc4c4019f308
MD5 a9e89b00cbda5ec3e09c0c006feee633
BLAKE2b-256 300c052424be4f0d7a534406cbb33ba3bc7b766792f07d3fe5c3c503288888ef

See more details on using hashes here.

File details

Details for the file curve_matcher-0.2.0-py3-none-any.whl.

File metadata

File hashes

Hashes for curve_matcher-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f33faf30d57993c0a866022cabbe7b1e1d0f634b6a42048d8a5bbfed093d1a44
MD5 97258d3dabf8ba8074c5ad4fea18e84a
BLAKE2b-256 4963564469dbb9dafdbf9c4555c370f22fe11b3b3e10314d7f8f9e178c7b6d30

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page