Simple implementation of the paper Adaptive Feature-Space Conformal Transformation for Imbalanced-Data Learning

These details have not been verified by PyPI

Project links

Homepage

Project description

Adaptive Feature-Space Conformal Transformation for Imbalanced-Data Learning for Kernel SVM

A simple implementation of the Adaptive Feature-Space Conformal Transformation for any kernel SVM, as proposed in [2].

Core concept

In [1], the separability (margin) between classes for a kernel SVM is increased by magnifying the spatial resolution in the region around the boundary in the hyperspace, i.e. on the non-linear surface onto which $\textbf{x}$ has already been mapped as $\phi(\textbf{x})$ through the kernel trick.

Let $K(\textbf{x}, \textbf{x}') = \langle \phi(\textbf{x}), \phi(\textbf{x}')\rangle$ be the kernel function corresponding to the non-linear mapping function $\phi(\textbf{x})$ of a kernel SVM. We can apply a conformal transformation to the kernel:

$$ \begin{aligned} \tilde{K}(\textbf{x}, \textbf{x}') = D(\textbf{x})D(\textbf{x}')K(\textbf{x}, \textbf{x}') \end{aligned} $$

to enlarge distances near the boundary on the non-linear surface in the higher-dimensional space. This is done by choosing a $D(\textbf{x})$ that is large when $\textbf{x}$ is close to the boundary and small when it is far from the boundary.
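As a minimal sketch of the transformation above, the snippet below rescales a precomputed Gram matrix entry by entry. The function name `conformal_kernel` and the toy values of $D(\textbf{x})$ are hypothetical and only serve the illustration; they are not part of this package.

```python
import numpy as np

def conformal_kernel(K, d_rows, d_cols):
    """Return K~ with K~[i, j] = d_rows[i] * d_cols[j] * K[i, j]."""
    K = np.asarray(K, dtype=float)
    return np.outer(d_rows, d_cols) * K

# Toy Gram matrix of a linear kernel on three 1-D points.
X = np.array([[0.0], [1.0], [2.0]])
K = X @ X.T

# Hypothetical D(x) values, one per point (assumption for illustration).
D = np.array([1.0, 2.0, 0.5])

K_tilde = conformal_kernel(K, D, D)
```

Because $\tilde{K}$ equals $\mathrm{diag}(D)\,K\,\mathrm{diag}(D)$, it stays symmetric and positive semi-definite whenever $K$ is, so it remains a valid kernel matrix.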

In [1], it has been suggested that $D(\textbf{x})$ can be chosen as:

$$ \begin{aligned} D(\textbf{x}) = \sum_{\textbf{x}_k \in SV} e^{ -\frac{1}{\tau_k^{2}}||\textbf{x} - \textbf{x}_k||^{2}} \end{aligned} $$

where

$$ \begin{aligned} \tau_k^{2} = \frac{1}{M} \sum_{\textbf{x}_s \in SV_k} || \textbf{x}_s - \textbf{x}_k||^{2} \end{aligned} $$

where $SV_{k}$ denotes the set of $M$ support vectors $\textbf{x}_{s}$ that are nearest to support vector $\textbf{x}_k$.
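The two formulas from [1] can be sketched in NumPy as follows. The function names are hypothetical, and the sketch assumes that $\textbf{x}_k$ itself is excluded from its own neighbour set $SV_k$, which the text above does not state explicitly.

```python
import numpy as np

def tau_squared_nearest(sv, M):
    """tau_k^2: mean squared distance from each support vector to its
    M nearest other support vectors (x_k itself excluded, an assumption)."""
    diff = sv[:, None, :] - sv[None, :, :]
    sq = np.sum(diff ** 2, axis=-1)        # (n_sv, n_sv) squared distances
    np.fill_diagonal(sq, np.inf)           # push self-distance to the end
    nearest = np.sort(sq, axis=1)[:, :M]   # M smallest per row
    return nearest.mean(axis=1)

def d_factor(X, sv, tau2):
    """D(x) = sum_k exp(-||x - x_k||^2 / tau_k^2) for every row of X."""
    diff = X[:, None, :] - sv[None, :, :]
    sq = np.sum(diff ** 2, axis=-1)        # (n_samples, n_sv)
    return np.exp(-sq / tau2[None, :]).sum(axis=1)

# Three toy support vectors in 2-D.
sv = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
tau2 = tau_squared_nearest(sv, M=1)

near = d_factor(np.array([[0.0, 0.0]]), sv, tau2)[0]
far = d_factor(np.array([[10.0, 10.0]]), sv, tau2)[0]
```

Here `near` is larger than `far`, matching the requirement that $D(\textbf{x})$ is large close to the support vectors and decays away from them.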

In this implementation, we use the $\tau_k^{2}$ proposed in [2]:

$$ \begin{aligned} \tau_k^{2} = \mathrm{AVG}_{\textbf{x}_s \in \{\textbf{x}_s \in SV \ \mid \ || \phi{(\textbf{x}_s)} - \phi{(\textbf{x}_k)} ||^2 < M, \ y_s \ne y_k \}} \left( || \phi{(\textbf{x}_s)} - \phi{(\textbf{x}_k)} ||^2 \right) \end{aligned} $$

where $M$ is the mean of the squared distances (in hyperspace) from $\phi(\textbf{x}_k)$ to its nearest and its farthest support vector. According to [2], setting $\tau_k$ this way takes into account the spatial distribution of the support vectors in hyperspace.
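One possible reading of this definition is sketched below, given a matrix of squared hyperspace distances between support vectors. The function name is hypothetical. Two assumptions are made that the text does not settle: the nearest and farthest support vectors are searched among all other support vectors regardless of class, and when no opposite-class support vector lies within $M$, the sketch falls back to $M$ itself.

```python
import numpy as np

def tau_squared_afc(d2, y):
    """d2: (n_sv, n_sv) squared hyperspace distances between support vectors.
    y:  (n_sv,) class labels of the support vectors."""
    n = len(y)
    tau2 = np.empty(n)
    for k in range(n):
        others = np.arange(n) != k
        # M: mean of the nearest and the farthest squared distance from x_k.
        M = 0.5 * (d2[k, others].min() + d2[k, others].max())
        # Opposite-class support vectors closer than M.
        mask = others & (d2[k] < M) & (y != y[k])
        # Fallback to M is an assumption, not taken from the source.
        tau2[k] = d2[k, mask].mean() if mask.any() else M
    return tau2

# Toy example: four support vectors on a line at 0, 1, 3, 7.
pos = np.array([0.0, 1.0, 3.0, 7.0])
d2 = (pos[:, None] - pos[None, :]) ** 2
y = np.array([1, -1, 1, -1])
tau2 = tau_squared_afc(d2, y)
```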

Note that although the kernel function $K(\textbf{x}, \textbf{x}')$ is known while the mapping function $\phi(\textbf{x})$ is not, we can still calculate distances in hyperspace using the kernel trick:

$$ \begin{aligned} || \phi{(\textbf{x}_s)} - \phi{(\textbf{x}_k)} ||^2 = K(\textbf{x}_s, \textbf{x}_s) + K(\textbf{x}_k, \textbf{x}_k) - 2K(\textbf{x}_s, \textbf{x}_k) \end{aligned} $$
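The identity above only needs entries of the Gram matrix, which a short NumPy sketch can illustrate. The function name `feature_space_sq_dists` is hypothetical. With a linear kernel the result reduces to the ordinary squared Euclidean distance, which makes the formula easy to check.

```python
import numpy as np

def feature_space_sq_dists(K):
    """Pairwise ||phi(x_i) - phi(x_j)||^2 from a square Gram matrix K."""
    diag = np.diag(K)
    d2 = diag[:, None] + diag[None, :] - 2.0 * K
    return np.maximum(d2, 0.0)   # clip tiny negatives caused by round-off

X = np.array([[0.0, 0.0], [3.0, 4.0]])

# Linear kernel: hyperspace distance equals input-space distance.
d2_linear = feature_space_sq_dists(X @ X.T)

# Laplacian kernel exp(-gamma * |x - x'|_1), built by hand for the toy data.
gamma = 0.5
l1 = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=-1)
K_lap = np.exp(-gamma * l1)
d2_lap = feature_space_sq_dists(K_lap)
```

For the Laplacian kernel $K(\textbf{x}, \textbf{x}) = 1$, so the squared distance simplifies to $2 - 2K(\textbf{x}_s, \textbf{x}_k)$ and is bounded above by 2.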

Dealing with Imbalance
According to [2], $\tau_k^{2}$ is scaled by a larger factor $\eta_p$ if $\textbf{x}_k$ belongs to the minority class and by a smaller factor $\eta_n$ if it belongs to the majority class, in order to address the imbalance issue. The paper suggests choosing $\eta_p$ and $\eta_n$ in proportion to the skew of the support vectors. In this version of the implementation, I set $\eta_p = 1$ and $\eta_n = \frac{|SV^{+}|}{|SV^{-}|}$ for now.
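The scaling with $\eta_p = 1$ and $\eta_n = |SV^{+}| / |SV^{-}|$ could look like the sketch below. The function name and the `minority_label` argument are hypothetical, and the sketch assumes that $SV^{+}$ denotes the minority-class support vectors.

```python
import numpy as np

def scale_tau_squared(tau2, y_sv, minority_label=1):
    """Scale tau_k^2 by eta_p (minority class) or eta_n (majority class)."""
    is_minority = y_sv == minority_label
    n_pos = np.sum(is_minority)      # |SV+|, assumed to be the minority class
    n_neg = np.sum(~is_minority)     # |SV-|
    eta_p = 1.0
    eta_n = n_pos / n_neg
    return np.where(is_minority, eta_p * tau2, eta_n * tau2)

# One minority and three majority support vectors.
tau2 = np.array([2.0, 2.0, 2.0, 2.0])
y_sv = np.array([1, -1, -1, -1])
scaled = scale_tau_squared(tau2, y_sv)
```

When the majority class contributes more support vectors, $\eta_n < 1$, so the Gaussian bumps around majority-class support vectors become narrower than those around minority-class ones.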

Example Usage

Installation

pip install afc-svm-imbalanced-learning

Usage

from afc_imbalanced_learning import AFSCTSvm

afc_svm = AFSCTSvm()
afc_svm.fit(X_train, y_train)
y_pred = afc_svm.predict(X_test)

Calling .fit(X_train, y_train) first trains a kernel SVM with the Laplacian kernel $K(\textbf{x}, \textbf{x}') = e^{-\gamma|\textbf{x} - \textbf{x}'|}$, as used in [2]. It then estimates the location of the boundary by extracting the support vectors, calculates $\tau_k$ for every support vector, and calculates $D(\textbf{x})$ for the conformal transformation, which yields $\tilde{K}(\textbf{x}, \textbf{x}')$. A new SVM is trained with this transformed kernel, and this improved kernel SVM is used for prediction when .predict is called.
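The steps described above can be approximated with plain scikit-learn, as in the following sketch. This is not the package's actual code: the helper names, the synthetic data, `gamma = 0.5`, the use of `class_weight="balanced"` as the cost-sensitive setting, and the fallback inside `tau_squared` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import euclidean_distances, laplacian_kernel
from sklearn.svm import SVC

def tau_squared(K_sv, y_sv):
    """tau_k^2 from hyperspace distances between support vectors."""
    diag = np.diag(K_sv)
    d2 = np.maximum(diag[:, None] + diag[None, :] - 2.0 * K_sv, 0.0)
    n = len(y_sv)
    tau2 = np.empty(n)
    for k in range(n):
        others = np.arange(n) != k
        M = 0.5 * (d2[k, others].min() + d2[k, others].max())
        mask = others & (d2[k] < M) & (y_sv != y_sv[k])
        tau2[k] = d2[k, mask].mean() if mask.any() else M  # fallback: assumption
    return tau2

def d_factor(X, sv, tau2):
    """D(x) = sum_k exp(-||x - x_k||^2 / tau_k^2)."""
    sq = euclidean_distances(X, sv, squared=True)
    return np.exp(-sq / tau2[None, :]).sum(axis=1)

# Synthetic imbalanced data; label 1 is the minority class.
X, y = make_classification(n_samples=200, n_features=4,
                           weights=[0.9, 0.1], random_state=0)
gamma = 0.5

# Step 1: initial SVM with the Laplacian kernel.
K = laplacian_kernel(X, X, gamma=gamma)
first = SVC(kernel="precomputed", class_weight="balanced").fit(K, y)

# Step 2: support vectors approximate the location of the boundary.
sv_idx = first.support_
sv, y_sv = X[sv_idx], y[sv_idx]

# Step 3: tau_k^2 per support vector, scaled with eta_p = 1, eta_n = |SV+|/|SV-|.
tau2 = tau_squared(K[np.ix_(sv_idx, sv_idx)], y_sv)
eta_n = np.sum(y_sv == 1) / np.sum(y_sv == 0)
tau2 = np.where(y_sv == 1, tau2, eta_n * tau2)

# Step 4: conformal transformation and second SVM.
D = d_factor(X, sv, tau2)
K_tilde = D[:, None] * D[None, :] * K
second = SVC(kernel="precomputed", class_weight="balanced").fit(K_tilde, y)

# Step 5: prediction needs the transformed kernel between new and training points.
X_new = X[:5]
D_new = d_factor(X_new, sv, tau2)
K_new = D_new[:, None] * D[None, :] * laplacian_kernel(X_new, X, gamma=gamma)
y_pred = second.predict(K_new)
```

Using `kernel="precomputed"` keeps the sketch short, at the cost of having to rebuild the transformed kernel against the training set at prediction time.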

Custom initial kernel
You can use your own custom kernel function by passing it to the kernel parameter:

from sklearn.metrics.pairwise import rbf_kernel
afc_svm = AFSCTSvm(kernel=rbf_kernel)
afc_svm.fit(X_train, y_train)

Note that this implementation uses a cost-sensitive SVM by default.

References

[1] Wu, Si & Amari, Shun-ichi. (2002). Conformal Transformation of Kernel Functions: A Data-Dependent Way to Improve Support Vector Machine Classifiers. Neural Processing Letters. 15. 59-67.
[2] Wu, Gang & Chang, Edward. (2003). Adaptive Feature-Space Conformal Transformation for Imbalanced-Data Learning. Proceedings, Twentieth International Conference on Machine Learning. 2. 816-823.

Project details

Release history

0.10.0

Aug 6, 2024

0.9.0

Aug 5, 2024

0.8.1

Aug 2, 2024

0.8.0

Jul 13, 2024

0.7.0

Jul 7, 2024

This version

0.6.0

Jul 6, 2024

0.5.0

Jul 6, 2024

0.4.0

Jul 4, 2024

0.3.0

Jun 17, 2024

0.2.0

Jun 16, 2024

0.1.3

Jun 15, 2024

0.1.2

Jun 15, 2024

0.1.1

Jun 15, 2024

0.1.0

Jun 14, 2024

Download files

Source Distribution

afc_svm_imbalanced_learning-0.6.0.tar.gz (6.3 kB)

Uploaded Jul 6, 2024 Source

Built Distribution

afc_svm_imbalanced_learning-0.6.0-py3-none-any.whl (7.5 kB)

Uploaded Jul 6, 2024 Python 3

File details

Details for the file afc_svm_imbalanced_learning-0.6.0.tar.gz.

File metadata

Download URL: afc_svm_imbalanced_learning-0.6.0.tar.gz
Upload date: Jul 6, 2024
Size: 6.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.0 CPython/3.11.5

File hashes

Hashes for afc_svm_imbalanced_learning-0.6.0.tar.gz
Algorithm	Hash digest
SHA256	`2dfe8483d2e1bc7817ab0032540e8b8ae30f9c305f05064c98e6ba2095a15bea`
MD5	`7a46e5d80f7fad69e3d12bd112b0120f`
BLAKE2b-256	`85c0fd331faa3423c305cfff241870782bae5e6229c8a8c62cd5d615630d0910`

File details

Details for the file afc_svm_imbalanced_learning-0.6.0-py3-none-any.whl.

File metadata

Download URL: afc_svm_imbalanced_learning-0.6.0-py3-none-any.whl
Upload date: Jul 6, 2024
Size: 7.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.0 CPython/3.11.5

File hashes

Hashes for afc_svm_imbalanced_learning-0.6.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`43c68bf33a4e4ad305c07ce29168a555fdcf244b7be76dca0f8f494c6fbc6346`
MD5	`5b5cb15613df95fc8a74b8b6481b2be1`
BLAKE2b-256	`d561ba6d7779c036c776e7c475ec95fdba8092f871eb2329f6971aa10fb3145c`