Python implementation of Gower's distance, computed pairwise between records in two data sets
Introduction
Gower's distance calculation in Python. Gower distance is a measure of the distance between two records whose attributes are a mix of categorical and numerical values.
Reference: Gower, J. C. (1971). A general coefficient of similarity and some of its properties. Biometrics 27, 857–874.
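To make the definition concrete, here is a minimal sketch of the textbook form of Gower distance between two records: each numerical attribute contributes its absolute difference divided by that attribute's range, each categorical attribute contributes 0 on a match and 1 on a mismatch, and the contributions are averaged. The function name `gower_pair` and its arguments are hypothetical and chosen for illustration only. The gower package may differ in detail, for example in how it scales numerical columns or treats missing values.

```python
def gower_pair(a, b, ranges, is_categorical):
    """Unweighted Gower distance between two records (illustrative sketch).

    a, b            -- sequences of attribute values, same length
    ranges          -- per-attribute range (max - min) over the data set;
                       ignored for categorical attributes
    is_categorical  -- per-attribute flags, True for categorical attributes
    """
    total = 0.0
    for x, y, r, cat in zip(a, b, ranges, is_categorical):
        if cat:
            # Categorical: simple mismatch indicator.
            total += 0.0 if x == y else 1.0
        elif r > 0:
            # Numerical: absolute difference scaled into [0, 1] by the range.
            total += abs(x - y) / r
        # A numerical attribute with zero range contributes nothing.
    return total / len(a)


# Two records described by (age, gender, salary).
# The ranges below are assumed values for this toy example.
rec_a = [21, 'M', 3000.0]
rec_b = [30, 'F', 3000.0]
print(gower_pair(rec_a, rec_b,
                 ranges=[9.0, 0.0, 100.0],
                 is_categorical=[False, True, False]))
```

Here age differs by the full assumed range, gender mismatches, and salary is identical, so two of the three attributes contribute the maximum of 1 and the third contributes 0.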
Examples
Installation
pip install gower
Generate some data
import numpy as np
import pandas as pd
import gower
Xd = pd.DataFrame({'age': [21, 21, 19, 30, 21, 21, 19, 30, None],
                   'gender': ['M', 'M', 'N', 'M', 'F', 'F', 'F', 'F', None],
                   'civil_status': ['MARRIED', 'SINGLE', 'SINGLE', 'SINGLE', 'MARRIED', 'SINGLE', 'WIDOW', 'DIVORCED', None],
                   'salary': [3000.0, 1200.0, 32000.0, 1800.0, 2900.0, 1100.0, 10000.0, 1500.0, None],
                   'has_children': [1, 0, 1, 1, 1, 0, 0, 1, None],
                   'available_credit': [2200, 100, 22000, 1100, 2000, 100, 6000, 2200, None]})
Yd = Xd.iloc[1:3,:]
X = np.asarray(Xd)
Y = np.asarray(Yd)
Find the distance matrix
gower.gower_matrix(X)
array([[0. , 0.3590238 , 0.6707398 , 0.31787416, 0.16872811,
0.52622986, 0.59697855, 0.47778758, nan],
[0.3590238 , 0. , 0.6964303 , 0.3138769 , 0.523629 ,
0.16720603, 0.45600235, 0.6539635 , nan],
[0.6707398 , 0.6964303 , 0. , 0.6552807 , 0.6728013 ,
0.6969697 , 0.740428 , 0.8151941 , nan],
[0.31787416, 0.3138769 , 0.6552807 , 0. , 0.4824794 ,
0.48108295, 0.74818605, 0.34332284, nan],
[0.16872811, 0.523629 , 0.6728013 , 0.4824794 , 0. ,
0.35750175, 0.43237334, 0.3121036 , nan],
[0.52622986, 0.16720603, 0.6969697 , 0.48108295, 0.35750175,
0. , 0.2898751 , 0.4878362 , nan],
[0.59697855, 0.45600235, 0.740428 , 0.74818605, 0.43237334,
0.2898751 , 0. , 0.57476616, nan],
[0.47778758, 0.6539635 , 0.8151941 , 0.34332284, 0.3121036 ,
0.4878362 , 0.57476616, 0. , nan],
[ nan, nan, nan, nan, nan,
nan, nan, nan, nan]], dtype=float32)
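The last row and column of the matrix above are all nan, which lines up with the final record of Xd, where every field is None. If such entries get in the way downstream, one option is to strip them from the result. The sketch below uses only NumPy; the helper name `drop_all_nan` is hypothetical and not part of the gower package.

```python
import numpy as np


def drop_all_nan(dist):
    """Remove rows/columns that are entirely NaN from a square distance matrix.

    Returns the reduced matrix and the original indices that were kept.
    """
    dist = np.asarray(dist, dtype=float)
    keep = ~np.isnan(dist).all(axis=1)   # True for rows with at least one number
    kept = np.flatnonzero(keep)
    return dist[np.ix_(kept, kept)], kept


# Toy 3x3 matrix with the same pattern as above: the last record is fully missing.
D = np.array([[0.0,    0.4,    np.nan],
              [0.4,    0.0,    np.nan],
              [np.nan, np.nan, np.nan]])
clean, kept = drop_all_nan(D)
print(clean)
print(kept)
```

An alternative is to drop fully empty records before computing distances, for example with pandas `Xd.dropna(how='all')`.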
Find the top n results
gower.gower_topn(Xd.iloc[0:2,:], Xd.iloc[:,], n = 5)
{'index': array([4, 3, 1, 7, 5]),
'values': array([0.16872811, 0.31787416, 0.3590238 , 0.47778758, 0.52622986],
dtype=float32)}
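The indices and values listed above match the five smallest distances in the first row of the distance matrix shown earlier, once the query record itself (index 0) is left out. The sketch below shows one way to get such a ranking from a single row of distances with NumPy. It is an illustration only, not the package's implementation; the function `top_n` and its `exclude` argument are hypothetical.

```python
import numpy as np


def top_n(distances, n, exclude=None):
    """Return the indices and values of the n smallest distances.

    NaN entries, and the optional `exclude` index, are never selected
    as long as at least n other finite entries exist.
    """
    d = np.asarray(distances, dtype=float).copy()
    d[np.isnan(d)] = np.inf          # push missing distances to the end
    if exclude is not None:
        d[exclude] = np.inf          # e.g. skip the query record itself
    order = np.argsort(d, kind="stable")[:n]
    return {"index": order, "values": d[order]}


# First row of the distance matrix shown earlier.
row0 = [0.0, 0.3590238, 0.6707398, 0.31787416, 0.16872811,
        0.52622986, 0.59697855, 0.47778758, np.nan]
result = top_n(row0, n=5, exclude=0)
print(result["index"])   # [4 3 1 7 5]
```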