TDAvec.py

TDAvec.py is a Python interface to the TDAvec R package, which is available on CRAN.

First of all, it gives access to all vectorization functions implemented in the original R package:

  • computeAlgebraicFunctions: Compute Algebraic Functions from a Persistence Diagram
  • computeBettiCurve: A Vector Summary of the Betti Curve
  • computeComplexPolynomial: Compute Complex Polynomial Coefficients from a Persistence Diagram
  • computeEulerCharacteristic: A Vector Summary of the Euler Characteristic Curve
  • computeNormalizedLife: A Vector Summary of the Normalized Life Curve
  • computePersistenceBlock: A Vector Summary of the Persistence Block
  • computePersistenceImage: A Vector Summary of the Persistence Surface
  • computePersistenceLandscape: Vector Summaries of the Persistence Landscape Functions
  • computePersistenceSilhouette: A Vector Summary of the Persistence Silhouette Function
  • computePersistentEntropy: A Vector Summary of the Persistent Entropy Summary Function
  • computeStats: Compute Descriptive Statistics for Births, Deaths, Midpoints, and Lifespans in a Persistence Diagram
  • computeTemplateFunction: Compute a Vectorization of a Persistence Diagram based on Tent Template Functions
  • computeTropicalCoordinates: Compute Tropical Coordinates from a Persistence Diagram

All of these functions can be called directly via tdavec.tdavec_core.
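To give a feel for what a "vector summary" of a persistence diagram is, here is a minimal, dependency-free sketch of a Betti curve summary. It is not the package's implementation: the name betti_curve_summary is hypothetical, and the sketch assumes that the summary is the average of the Betti curve over each interval of the scale grid, so a grid of m+1 points yields a vector of length m.

```python
def betti_curve_summary(diagram, scale):
    """Average the Betti curve over each interval of an increasing grid.

    diagram: iterable of (birth, death) pairs for one homological dimension.
    scale:   increasing sequence of grid points t_0 < t_1 < ... < t_m.
    Returns a list of m averages, one per interval [t_k, t_{k+1}].
    """
    summary = []
    for left, right in zip(scale[:-1], scale[1:]):
        # The Betti curve counts the features alive at time t
        # (birth <= t < death), so its integral over [left, right] is the
        # total overlap between the interval and each feature's lifetime.
        overlap = sum(
            max(0.0, min(death, right) - max(birth, left))
            for birth, death in diagram
        )
        summary.append(overlap / (right - left))
    return summary


# Hypothetical toy diagram with two features.
toy_diagram = [(0.0, 1.0), (0.5, 2.0)]
print(betti_curve_summary(toy_diagram, [0.0, 1.0, 2.0]))  # [1.5, 1.0]
```

On the first interval both features are alive (one for the whole interval, one for half of it), which gives 1.5; on the second interval only the longer-lived feature remains, which gives 1.0.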

In addition, we provide a scikit-learn-style interface to the same functionality, which may be more familiar to Python programmers.

Note that the package has been tested only on Python 3.12 and requires Microsoft Build Tools for installation.

Setup

TDAvec.py is available on PyPI. To install it, simply type

pip install tdavec

into your environment.

If a wheel is not available for your system, pip will build from source. This may require a C/C++ compiler:

  • Windows: Microsoft Build Tools
  • macOS: Xcode Command Line Tools (xcode-select --install)
  • Linux: gcc/clang and Python development headers (python3-dev)

To check that the installation completed successfully, start Python and run the following lines:

> from tdavec import test_package
> X, D, PS = test_package()

This function creates a simple point cloud, builds a persistence diagram from it, calculates the Persistence Silhouette vectorization of that diagram, and returns these three objects.
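If the import fails, it may help to first confirm that Python can locate the package in the current environment at all. The following is a minimal sketch using only the standard library; is_installed is a hypothetical helper and not part of tdavec.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by the import system."""
    return importlib.util.find_spec(module_name) is not None


# Reports whether `tdavec` is visible to the interpreter that runs this code.
print("tdavec importable:", is_installed("tdavec"))
```

A result of False usually means that pip installed the package into a different environment than the one running the interpreter.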

Usage

This section demonstrates some simple examples of package usage.

We start by loading the TDAvec library and some other packages:

from tdavec import createEllipse, TDAvectorizer, tdavec_core
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import pandas as pd
import numpy as np

As sample data, we will work with a set of point clouds that represent deformed ellipses with randomly selected squeeze ratios:

np.random.seed(42)
epsList = np.random.uniform(low = 0, high = 1, size = 500)
clouds = [createEllipse(a=1, b=eps, n=100) for eps in epsList]
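For readers who want to see what such a point cloud looks like without installing the package, here is a rough stand-in for createEllipse. It is only a sketch under stated assumptions: make_ellipse is a hypothetical name, the points are sampled at uniformly random angles with no noise, and the result is a plain list of (x, y) tuples rather than the array returned by the package function.

```python
import math
import random


def make_ellipse(a=1.0, b=1.0, n=100, seed=None):
    """Sample n points on the ellipse (x/a)^2 + (y/b)^2 = 1."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # Draw an angle uniformly and map it onto the ellipse.
        theta = rng.uniform(0.0, 2.0 * math.pi)
        points.append((a * math.cos(theta), b * math.sin(theta)))
    return points


cloud = make_ellipse(a=1.0, b=0.5, n=100, seed=42)
print(len(cloud))  # 100
```

Here b plays the role of the squeeze ratio: b=1 gives a circle, while values of b close to 0 flatten the ellipse towards a line segment.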

Here are some examples:

for i, cl in enumerate(clouds[:4]):
    plt.subplot(2, 2, i+1)
    plt.plot(cl[:,0], cl[:,1], ".")
    plt.xlim(-1.5, 1.5); plt.ylim(-1.5, 1.5)
    plt.title(f"eps={np.round(epsList[i], 2)}")
    plt.grid()
plt.tight_layout()

Point clouds

To generate persistence diagrams, create a TDAvectorizer object and fit it:

v = TDAvectorizer()
v.setParams({"scale":np.linspace(0, 2, 10)})
v.fit(clouds)

Here are some examples of the generated persistence diagrams:

for i in range(4):
    plt.subplot(2,2,i+1)
    PD = v.diags[i]
    for dim in range(2):
        plt.plot(PD[dim][:,0], PD[dim][:,1], ".")
        plt.xlim(0, 2); plt.ylim(0, 2)
        plt.axline( (0,0), slope = 1, linestyle = "--", linewidth = 0.5)
        plt.title(f"eps={np.round(epsList[i], 2)}")
plt.tight_layout()

PDs
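To illustrate where the points of a persistence diagram come from, the sketch below computes the death times of the 0-dimensional features (connected components) of a point cloud. It assumes a Vietoris-Rips filtration in which an edge appears at its Euclidean length; whether the package builds its diagrams this way is an assumption, and h0_deaths is a hypothetical helper. Under that assumption every component is born at 0 and dies when it merges with another one, so the death times are the edge lengths of a minimum spanning tree, found here with Kruskal's algorithm.

```python
import math
from itertools import combinations


def h0_deaths(points):
    """Death times of 0-dimensional features in a Vietoris-Rips filtration."""
    n = len(points)
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        # Find the root of i, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by length (Kruskal's algorithm).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge: one of them dies at this scale.
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


print(h0_deaths([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]))  # [1.0, 2.0]
```

A cloud of n points yields n-1 finite death times; the last remaining component never dies.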

Once the TDAvectorizer object is fitted, one can calculate a vectorization by calling its transform() method:

X = v.transform(output="PS", homDim=1)
for i, e in enumerate(epsList[:4]):
    plt.plot(v.getParams()["scale"][1:],X[i,:], label=np.round(e, 3))
    plt.xlim(0, 2)
plt.legend()
plt.show()

Vectorizations
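The curves above are summaries of the persistence silhouette function. As a rough illustration, the sketch below evaluates the standard silhouette at a single point t: a weighted average of the tent functions of the diagram points, with weights equal to the lifespans raised to a power. It is a pointwise evaluation only and does not reproduce the package's grid-based summary; the function name and the power parameter are hypothetical.

```python
def silhouette(diagram, t, power=1.0):
    """Evaluate the persistence silhouette of a diagram at a single point t.

    diagram: iterable of finite (birth, death) pairs.
    power:   exponent applied to each lifespan to obtain its weight.
    """
    pairs = list(diagram)
    weights = [(death - birth) ** power for birth, death in pairs]
    # Tent function of (birth, death): rises from birth, peaks at the
    # midpoint, falls back to zero at death, and is zero elsewhere.
    tents = [max(0.0, min(t - birth, death - t)) for birth, death in pairs]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * tent for w, tent in zip(weights, tents)) / total


# Hypothetical toy diagram with two features.
toy_diagram = [(0.0, 2.0), (1.0, 2.0)]
print(silhouette(toy_diagram, 1.5))  # 0.5
```

With power=1, long-lived features dominate the average, which is why the silhouette tends to emphasise the most persistent loop of each ellipse.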

These vectorizations can be used as predictors in an ML problem whose goal is to predict the original deformation parameter. We will use a simple sklearn.linear_model.LinearRegression model to solve it.

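Here is a simple function that, for any given set of predictors, creates the model, fits it, and returns the results:

def makeSim(X, y=epsList):
    # NOTE: `method` and `homDim` are not parameters; they are read from the
    # enclosing (global) scope and must be set before makeSim is called.
    # Hold out 20% of the samples for testing.
    Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, train_size=0.8, random_state=42)
    model = LinearRegression().fit(Xtrain, ytrain)
    test_preds = model.predict(Xtest)
    # For a regressor, score() returns the coefficient of determination R^2.
    score = model.score(Xtest, ytest)
    res = {"method":method, "homDim":homDim, "test_preds":test_preds, "y_test":ytest, "score":score}
    return res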
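The score stored in the result is the value returned by LinearRegression.score, which for regressors is the coefficient of determination R^2. To make explicit what that number measures, here is a dependency-free sketch of the same formula; r2_score_manual is a hypothetical helper written for illustration.

```python
def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((yt - mean_true) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


print(r2_score_manual([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0
print(r2_score_manual([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # 0.0
```

A value of 1 means perfect predictions, 0 means the model does no better than predicting the mean of the test targets, and negative values are possible for models that do worse than that.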

In the loop below a systematic scan over different vectorization methods and homological dimensions is performed:

v.setParams({"scale": np.linspace(0, 2, 30)})
methodList = v.vectorization_names
results = []
df = pd.DataFrame()
for homDim in [0, 1]:
    print(f" Dimension {homDim}: ", end=" ")
    for method in methodList[:-2]:
        print(method, end=" ")
        X = v.transform(output=method, homDim=homDim)
        res = makeSim(X)
        results.append(res)
        df = pd.concat([df, pd.DataFrame(res)])
    print()
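The per-run records collected in the loop can be condensed into a method-by-dimension table of scores. The sketch below does this in plain Python; score_table is a hypothetical helper, and example_results is a hand-written stand-in for the results list that carries only the keys the helper reads, with values copied from two rows of the table in this document.

```python
def score_table(results):
    """Map each method to a {homDim: rounded score} dictionary."""
    table = {}
    for res in results:
        table.setdefault(res["method"], {})[res["homDim"]] = round(res["score"], 3)
    return table


# Stand-in records; the real ones also hold "test_preds" and "y_test".
example_results = [
    {"method": "ecc", "homDim": 0, "score": 0.976},
    {"method": "pi", "homDim": 0, "score": 0.986},
    {"method": "ecc", "homDim": 1, "score": 0.996},
    {"method": "pi", "homDim": 1, "score": 0.547},
]

for method, scores in score_table(example_results).items():
    print(method, scores[0], scores[1])
```

Since the helper only reads the "method", "homDim", and "score" keys, it should work on the full records as well.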

Here is the table of test-set R² scores (the values returned by model.score) for each method and homological dimension:

method     dim 0    dim 1
ecc        0.976    0.996
vab        0.976    0.986
fda        0.983    0.985
nl         0.960    0.981
poly       0.967    0.975
algebra    0.971    0.955
ps         0.946    0.914
stats      0.987    0.887
pes        0.989    0.717
pi         0.986    0.547

As you can see, most of the scores are very close to 1, which means that the models are quite accurate. The truth/prediction scatter plots below confirm this conclusion:

Comparison
