Data Science Library

Project description

polar

polar is a Python module that provides simple-to-use data science functions. It is built on top of SciPy, scikit-learn, seaborn, and pandas.

Installation

If you already have a working installation of NumPy and SciPy, the easiest way to install polar is with pip:

pip install polar seaborn pandas scikit-learn scipy matplotlib numpy nltk -U

Dependencies

polar requires:

Python (>= 3.5)
NumPy (>= 1.11.0)
SciPy (>= 0.17.0)
Seaborn (>= 0.9.0)
scikit-learn (>= 0.21.3)
nltk (>= 3.4.5)
python-pptx (>= 0.6.18)
cryptography (> 2.8)
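
The minimum versions listed above can be checked against the current environment before installing. The following is a minimal sketch using only the standard library; it is not part of polar. The helper names `parse_version` and `check_minimums` are hypothetical, `importlib.metadata` requires Python 3.8 or later (newer than the stated Python minimum), and every entry is treated as an inclusive lower bound, so cryptography 2.8 exactly would pass here even though the list asks for a version greater than 2.8.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "numpy": "1.11.0",
    "scipy": "0.17.0",
    "seaborn": "0.9.0",
    "scikit-learn": "0.21.3",
    "nltk": "3.4.5",
    "python-pptx": "0.6.18",
    "cryptography": "2.8",  # listed as "> 2.8"; checked as ">=" in this sketch
}


def parse_version(text):
    """Turn '1.11.0' into (1, 11, 0); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad so that '2.8' and '2.8.0' compare as equal.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_minimums(minimums=MINIMUMS):
    """Return {package: status}; status is 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = parse_version(installed) >= parse_version(minimum)
        report[name] = "ok" if ok else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_minimums().items():
        print(f"{name}: {status}")
```

Pre-release suffixes such as `rc1` are ignored by this simple parser; a stricter check could use the `packaging` library instead.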

Jupyter Notebook Examples

The Jupyter notebook Polar-Examples contains all the examples described below.

ACA (Automated Cohort Analysis) Example

The ACA creates three heatmaps for each feature in the data set.

Conversion heatmap - conversion per feature value
Distribution heatmap - distribution per feature value
Size heatmap - total samples per feature value
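
To make the three quantities concrete, here is a minimal sketch of how such tables could be computed with plain pandas. It is not polar's implementation: the toy data, the `plan` feature, and the `cohort_tables` helper are hypothetical, and the sketch assumes that cohorts are defined by a date column, that the label is a 0/1 outcome, and that "distribution" means each feature value's share of the samples within a cohort.

```python
import pandas as pd

# Hypothetical toy data: 'date' is the cohort column, 'label' is a 0/1 outcome,
# and 'plan' stands in for any feature column.
toy_df = pd.DataFrame({
    "date": ["2020-01", "2020-01", "2020-01", "2020-02", "2020-02", "2020-02"],
    "plan": ["free", "free", "paid", "free", "paid", "paid"],
    "label": [0, 1, 1, 0, 1, 0],
})


def cohort_tables(df, feature, date_col, label_col):
    """Return (conversion, distribution, size) tables.

    Rows are feature values and columns are cohorts (dates).
    """
    grouped = df.groupby([feature, date_col])[label_col]
    # Size: number of samples per feature value and cohort.
    size = grouped.count().unstack(date_col).fillna(0)
    # Conversion: mean of the 0/1 label per feature value and cohort.
    conversion = grouped.mean().unstack(date_col)
    # Distribution: share of each cohort's samples that has this feature value.
    distribution = size / size.sum(axis=0)
    return conversion, distribution, size


conversion, distribution, size = cohort_tables(toy_df, "plan", "date", "label")
print(conversion)
print(distribution)
print(size)
```

Each of the three tables has the shape a heatmap function such as `seaborn.heatmap` expects, with one cell per feature value and cohort.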

Data File: ACA_date.csv

Final Result PowerPoint: ACA.pptx

import pandas as pd
import polar as pl
from pptx import Presentation

# Jupyter magic that renders plots inline; omit when running as a plain script.
%matplotlib inline

# Load the sample cohort data.
url = "https://raw.githubusercontent.com/pparkitn/imagehost/master/ACA_date.csv"
data_df = pd.read_csv(url)

# Build a presentation: a title slide, then one slide per generated chart.
prs = Presentation()
pl.create_title(prs, 'ACA')
for chart in pl.ACA_create_graphs(data_df, 'date', 'label'):
    pl.add_chart_slide(prs, chart[0], chart[1])
pl.save_presentation(prs, filename='ACA')

Output heatmaps (figures not reproduced here): Conversion, Distribution, Samples.

EDA Example

import pandas as pd
import openml
import polar as pl

# Fetch OpenML dataset 31 as a DataFrame, with the default target split out.
dataset = openml.datasets.get_dataset(31)
X, y, categorical_indicator, attribute_names = dataset.get_data(
    target=dataset.default_target_attribute, dataset_format='dataframe')

openml_df = pd.DataFrame(X)
openml_df['target'] = y

# Correlation analysis and heatmap.
data_df = pl.analyze_correlation(openml_df, 'target')
pl.get_heatmap(data_df, 'correlation_heat_map.png', 1.1, 14, '0.1f', 0, 100, 5, 5)

# Association analysis and heatmap.
data_df = pl.analyze_association(openml_df, 'target', verbose=0)
pl.get_heatmap(data_df, 'association_heat_map.png', 1.1, 12, '0.1f', 0, 100, 10, 10)

# Summary of the DataFrame with respect to the target.
print(pl.analyze_df(openml_df, 'target', 10))

# Feature importance bar chart.
data_df = pl.get_important_features(openml_df, 'target')
pl.get_bar(data_df, 'bar.png', 'Importance', 'Feature_Name')
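
The return format of `pl.analyze_correlation` is not documented in this description. For readers who want to see what a correlation-with-target step can look like, the following is an independent sketch using plain pandas Pearson correlation. The toy data and the `correlation_with_target` helper are hypothetical and are not polar's method.

```python
import pandas as pd

# Hypothetical toy data with a 0/1 target column.
toy_df = pd.DataFrame({
    "age": [22, 35, 47, 51, 63, 29],
    "income": [20, 42, 58, 61, 80, 30],
    "noise": [5, 1, 4, 2, 3, 6],
    "target": [0, 0, 1, 1, 1, 0],
})


def correlation_with_target(df, target):
    """Pearson correlation of each numeric feature with the target,
    ordered from strongest to weakest by absolute value."""
    numeric = df.select_dtypes(include="number")
    corr = numeric.corr()[target].drop(target)
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


ranked = correlation_with_target(toy_df, "target")
print(ranked)
```

Sorting by absolute value keeps strongly negative correlations near the top, which a plain descending sort would push to the bottom.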

NLP Example

import nltk
import pandas as pd
import polar as pl
from cryptography.fernet import Fernet

# Download the WordNet corpus used by NLTK's lemmatizer.
nltk.download('wordnet')

url = "https://raw.githubusercontent.com/pparkitn/imagehost/master/test_real_or_not_from_kaggle.csv"
data_df = pd.read_csv(url)

# Keep only the text column.
data_df.drop(columns=['id', 'keyword', 'location'], inplace=True)
data_df.head(3)

# Encrypt and then decrypt the text column with a freshly generated Fernet key.
key = Fernet.generate_key()
data_df['text_encrypted'] = data_df['text'].apply(pl.encrypt_df, args=(key,))
data_df['text_decrypted'] = data_df['text_encrypted'].apply(pl.decrypt_df, args=(key,))

# Stem the decrypted text, then lemmatize the stemmed text.
data_df['text_stem'] = data_df['text_decrypted'].apply(pl.nlp_text_process, args=('stem',))
data_df['text_stem_lem'] = data_df['text_stem'].apply(pl.nlp_text_process, args=('lem',))

data_df.head(3)

# Cluster the processed text; element 0 of the result is the DataFrame
# with the cluster assignment in 'text_cluster'.
cluster_df = pl.nlp_cluster(data_df, 'text_stem_lem', 10, 'text_cluster', 1.0, 1, 100, 1, 'KMeans', (1, 2))[0]
cluster_df.groupby(['text_cluster']).count()

# Inspect the texts assigned to cluster 9.
cluster_df[cluster_df['text_cluster'] == 9]['text_stem_lem']
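
The positional arguments of `pl.nlp_cluster` are not documented in this description. As a rough illustration of text clustering with K-means, the following sketch uses scikit-learn directly. It is not polar's implementation: the sample sentences and the `cluster_texts` helper are hypothetical, the parameter names are scikit-learn's own, and any correspondence between them and the arguments passed to `nlp_cluster` above is an assumption.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample sentences standing in for the processed text column.
docs = [
    "forest fire spreading near the town",
    "wildfire forces evacuation of the town",
    "fire crews battle the forest blaze",
    "flood waters rising after heavy rain",
    "heavy rain causes river flood",
    "flood warning issued as rain continues",
]


def cluster_texts(texts, n_clusters, ngram_range=(1, 2), max_features=100, seed=0):
    """Vectorize texts with TF-IDF and assign each one a K-means cluster id."""
    vectorizer = TfidfVectorizer(ngram_range=ngram_range, max_features=max_features)
    matrix = vectorizer.fit_transform(texts)
    # A fixed random_state makes the assignment repeatable between runs.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(matrix)


labels = cluster_texts(docs, n_clusters=2)
for label, doc in zip(labels, docs):
    print(label, doc)
```

Cluster ids are arbitrary integers, so the same grouping can come back with different numbers under a different seed.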

Release history

Version	Date
0.0.127	Feb 10, 2024
0.0.126	Feb 10, 2024
0.0.125	Feb 9, 2024
0.0.124	Dec 6, 2023
0.0.123	Dec 6, 2023
0.0.122	Dec 6, 2023
0.0.121	Dec 6, 2023
0.0.120	Dec 6, 2023
0.0.119	Dec 5, 2023
0.0.118	Feb 10, 2021
0.0.117	Sep 6, 2020
0.0.115	Jul 8, 2020
0.0.113	Jul 8, 2020
0.0.112	Jul 8, 2020
0.0.110	Jul 8, 2020
0.0.109	Jun 4, 2020 (this version)
0.0.106	May 12, 2020
0.0.105	May 12, 2020
0.0.104	May 12, 2020
0.0.103	May 6, 2020
0.0.101	May 5, 2020
0.0.99	May 5, 2020
0.0.98	Apr 23, 2020

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

polar-0.0.109.tar.gz (11.7 kB)

Uploaded Jun 4, 2020 Source

Built Distribution

polar-0.0.109-py3-none-any.whl (10.0 kB)

Uploaded Jun 4, 2020 Python 3

Hashes for polar-0.0.109.tar.gz
Algorithm	Hash digest
SHA256	`d7ea6730243d0bb67cf51716cef0a4bdfc95e8947a49ba736425d347bf6a93b7`
MD5	`9a28246046ec381fe10de0aaadeb92d8`
BLAKE2b-256	`941a66560a96217388f5c9b1924159a7ef26f0b5434481d9ca7b591c16ff03a0`

Hashes for polar-0.0.109-py3-none-any.whl
Algorithm	Hash digest
SHA256	`237a49a789cbc2061945c9a4bd7ec164e12886e6105cc999f137fbf9f431c711`
MD5	`1479fefd0ed5a948e1f7e2af85cd4cf1`
BLAKE2b-256	`2a1611070f542ca424e04e54d8ce3983b1e673e6dfb254f89b4406caa8e62838`