A toolkit for estimating the correlation between variables
Project description
Correlation Kit
A toolkit for estimating the correlation values between variables
Installation
pip install correlation-kit
Correlation between two continuous variables
import pandas as pd
from correlation_kit.ck_wrapper import CorrelationKit
# set a dataframe or read from a csv file
d = {'x': [1, 2, 3.5, 4], 'y': [3, 4, 4.5, 6]}
df = pd.DataFrame(data=d)
# set x label and y label for correlation
x = "x"
y = "y"
# calc
def get_correlation(x, y, corr_type):
    # defaults returned when corr_type is not recognised
    stat = 0
    p = 0
    if corr_type == "pearson":
        stat, p = CorrelationKit(df).get_pearson(x, y)
    elif corr_type == "spearman":
        stat, p = CorrelationKit(df).get_spearman(x, y)
    elif corr_type == "kendalltau":
        stat, p = CorrelationKit(df).get_kendalltau(x, y)
    return stat, p
# print results
print("pearson = ", get_correlation(x, y, "pearson"))
print("spearman = ", get_correlation(x, y, "spearman"))
print("kendalltau = ", get_correlation(x, y, "kendalltau"))
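To cross-check the values printed above, the same three coefficients can be computed directly with SciPy. This is only a sketch: whether CorrelationKit uses SciPy internally is an assumption, and the helper name `scipy_correlations` is hypothetical and not part of the package.

```python
import pandas as pd
from scipy import stats

# Same sample data as in the example above.
df = pd.DataFrame({"x": [1, 2, 3.5, 4], "y": [3, 4, 4.5, 6]})


def scipy_correlations(frame, x, y):
    """Hypothetical helper: return {name: (statistic, p_value)} using SciPy."""
    funcs = {
        "pearson": stats.pearsonr,
        "spearman": stats.spearmanr,
        "kendalltau": stats.kendalltau,
    }
    out = {}
    for name, func in funcs.items():
        stat, p = func(frame[x], frame[y])
        out[name] = (float(stat), float(p))
    return out


results = scipy_correlations(df, "x", "y")
for name, (stat, p) in results.items():
    print(f"{name}: statistic={stat:.4f}, p-value={p:.4f}")
```

Because `y` increases whenever `x` increases in this sample, the two rank-based coefficients (Spearman and Kendall) equal 1, while Pearson is slightly lower because the relationship is not exactly linear.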
Estimate correlation between binary and continuous variables
import pandas as pd
from correlation_kit.ck_wrapper import CorrelationKit
# set a dataframe or read from a csv file
d = {'x': ['large', 'large', 'small', 'small'], 'y': ['hot', 'hot', 'cold', 'cold'], 'z': [0, 1, 2.5, 3]}
df = pd.DataFrame(data=d)
# set the category column, the category value coded as 1, and the continuous column; suitable for binary variables
r_p, r_s, r_k = CorrelationKit(df).get_corr_between_category_and_continual('x', 'large', 'z')  # large=1; otherwise 0
# results
print('pearson: ',r_p)
print('spearman: ',r_s)
print('kendalltau: ',r_k)
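The `large=1; otherwise 0` coding noted in the comment above can be reproduced by hand with pandas and SciPy. This is a minimal sketch of that idea, not the package's actual implementation; the variable names are illustrative, and the exact shape of the values CorrelationKit returns is not assumed here.

```python
import pandas as pd
from scipy import stats

# Same sample data as in the example above (the unused 'y' column is omitted).
df = pd.DataFrame({
    "x": ["large", "large", "small", "small"],
    "z": [0, 1, 2.5, 3],
})

# Encode the binary category: 1 where x == "large", otherwise 0.
encoded = (df["x"] == "large").astype(int)

# Correlate the 0/1 indicator with the continuous column.
r_p, p_p = stats.pearsonr(encoded, df["z"])
r_s, p_s = stats.spearmanr(encoded, df["z"])
r_k, p_k = stats.kendalltau(encoded, df["z"])

print(f"pearson:    r={r_p:.4f}, p={p_p:.4f}")
print(f"spearman:   r={r_s:.4f}, p={p_s:.4f}")
print(f"kendalltau: r={r_k:.4f}, p={p_k:.4f}")
```

All three coefficients are negative for this sample, since the rows coded 1 (`large`) have the smaller `z` values. Pearson's coefficient on a 0/1 indicator is the point-biserial correlation.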
Estimate the F value between a multiple-category variable and a continuous variable
import pandas as pd
from correlation_kit.ck_wrapper import CorrelationKit
# set a dataframe or read from a csv file
d = {'x': ['large', 'large', 'middle', 'small', 'small'], 'y': ['hot', 'hot', 'warm', 'cold', 'cold'], 'z': [0, 1, 2, 2.5, 3]}
df = pd.DataFrame(data=d)
# set the category column, the list of categories, and the continuous column; suitable for multiple-category variables
F, p = CorrelationKit(df).get_f_oneway('x', ['large', 'middle', 'small'], 'z')
# results
print('F: ',F)
print('p: ',p)
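The F value here is presumably a one-way ANOVA statistic. Under that assumption, it can be sketched with `scipy.stats.f_oneway` by splitting the continuous column into one group per category. That `get_f_oneway` works this way internally is an assumption, not something the package documents.

```python
import pandas as pd
from scipy import stats

# Same sample data as in the example above (the unused 'y' column is omitted).
df = pd.DataFrame({
    "x": ["large", "large", "middle", "small", "small"],
    "z": [0, 1, 2, 2.5, 3],
})

# Build one array of z values per category of x.
categories = ["large", "middle", "small"]
groups = [df.loc[df["x"] == c, "z"].to_numpy() for c in categories]

# One-way ANOVA across the groups.
F, p = stats.f_oneway(*groups)

print(f"F: {F:.4f}")
print(f"p: {p:.4f}")
```

With only five observations and one group of size 1, the test has very little power; the sample is useful for checking the mechanics rather than for drawing conclusions.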
License
The Correlation-Kit project is provided by Donghua Chen.
Project details
Download files
Source Distribution
File details
Details for the file correlation-kit-1.0.0.dev2.tar.gz.
File metadata
- Download URL: correlation-kit-1.0.0.dev2.tar.gz
- Upload date:
- Size: 6.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.21.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | fd4a160d1a66562f4aad8cead59af64fc60662f43a817079b883406d7346e013
MD5 | 111eff214d2e95eb37da221a956c71c2
BLAKE2b-256 | 22e80cdf4358c79c9cd2f1ad51838cb0534dda517689c350042d7503eb05870b