Data Science Library

polar

polar is a Python module that contains simple-to-use data science functions. It is built on top of SciPy, scikit-learn, seaborn, and pandas.

Installation

If you already have a working installation of NumPy and SciPy, the easiest way to install polar is with pip:

pip install polar seaborn pandas scikit-learn scipy matplotlib numpy nltk -U

Dependencies

polar requires:

  • Python (>= 3.5)
  • NumPy (>= 1.11.0)
  • SciPy (>= 0.17.0)
  • Seaborn (>= 0.9.0)
  • scikit-learn (>= 0.21.3)
  • nltk (>= 3.4.5)
  • python-pptx (>= 0.6.18)
  • cryptography (> 2.8)
  • imblearn
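
To check an existing environment against the minimum versions listed above, a small standard-library sketch along these lines may help. The `REQUIREMENTS` table and the helpers `parse_version`, `meets_minimum`, and `check_environment` are illustrative names and are not part of polar. The sketch needs Python 3.8+ for `importlib.metadata`, only understands plain dotted numeric versions, and covers only the `>=` entries, so pip remains the authoritative check.

```python
import re
from importlib import metadata  # available from Python 3.8

# Minimum versions copied from the ">=" entries in the list above.
REQUIREMENTS = {
    "numpy": "1.11.0",
    "scipy": "0.17.0",
    "seaborn": "0.9.0",
    "scikit-learn": "0.21.3",
    "nltk": "3.4.5",
    "python-pptx": "0.6.18",
}


def parse_version(text):
    """Turn '0.21.3' into (0, 21, 3); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare versions numerically, so '1.2.0' counts as older than '1.11.0'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so '1.2' equals '1.2.0'
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(requirements=REQUIREMENTS):
    """Return {package: status string} for each requirement."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = f"{installed} ({'ok' if ok else 'below ' + minimum})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

Comparing parsed tuples rather than raw strings matters here: as strings, "1.2.0" would sort after "1.11.0".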

Jupyter Notebook Examples

The Jupyter notebook containing all the examples described below: Polar-Examples

ACA (Automated Cohort Analysis) Example

The ACA creates three heatmaps for each feature in the data set.

  • Conversion heatmap - conversion per feature value
  • Distribution heatmap - distribution per feature value
  • Size heatmap - total samples per feature value
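
To make the three quantities concrete, here is a rough pandas sketch of the tables that such heatmaps could be drawn from. It is not polar's implementation. It assumes that "conversion" means the mean of a binary label, that "distribution" means each feature value's share of the samples within a date cohort, and that "size" is the raw sample count. The `date` and `label` column names follow the example call in this section; the `plan` feature, the toy data, and the `cohort_tables` helper are made up for illustration.

```python
import pandas as pd


def cohort_tables(df, date_col, label_col, feature_col):
    """Return (conversion, distribution, size) tables: feature values x dates."""
    # Size: number of samples per feature value and date cohort.
    size = pd.crosstab(df[feature_col], df[date_col])
    # Distribution: share of each feature value within its date cohort.
    distribution = size.div(size.sum(axis=0), axis=1)
    # Conversion: mean of the binary label per feature value and date cohort.
    conversion = df.pivot_table(
        index=feature_col, columns=date_col, values=label_col, aggfunc="mean"
    )
    return conversion, distribution, size


toy = pd.DataFrame({
    "date": ["2020-01", "2020-01", "2020-01", "2020-02", "2020-02", "2020-02"],
    "label": [1, 0, 1, 0, 0, 1],
    "plan": ["a", "a", "b", "a", "b", "b"],
})
conversion, distribution, size = cohort_tables(toy, "date", "label", "plan")
print(conversion)
print(distribution)
print(size)
```

Each returned table has one row per feature value and one column per date, which is the shape a heatmap function such as seaborn's `heatmap` accepts directly.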

Data file: ACA_date.csv

Final result (PowerPoint): ACA.pptx

import pandas as pd
import polar as pl
from pptx import Presentation
%matplotlib inline

# Load the sample cohort data
url = "https://raw.githubusercontent.com/pparkitn/imagehost/master/ACA_date.csv"
data_df = pd.read_csv(url)

# Build a presentation with a title slide and one slide per generated chart
prs = Presentation()
pl.create_title(prs, 'ACA')
for chart in pl.ACA_create_graphs(data_df, 'date', 'label'):
    pl.add_chart_slide(prs, chart[0], chart[1])
pl.save_presentation(prs, filename='ACA')

Conversion: Image

Distribution: Image

Samples: Image

EDA Example

import pandas as pd
import openml
import polar as pl

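# Download OpenML dataset 31 as a DataFrame of features (X) and a target (y)
dataset = openml.datasets.get_dataset(31)
X, y, categorical_indicator, attribute_names = dataset.get_data(
    target=dataset.default_target_attribute, dataset_format='dataframe')

# Combine features and target into a single DataFrame
openml_df = pd.DataFrame(X)
openml_df['target'] = y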

data_df = pl.analyze_correlation(openml_df,'target')
pl.get_heatmap(data_df,'correlation_heat_map.png',1.1,14,'0.1f',0,100,5,5)

data_df = pl.analyze_association(openml_df,'target',verbose=0)
pl.get_heatmap(data_df,'association_heat_map.png',1.1,12,'0.1f',0,100,10,10)
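
The measure that `pl.analyze_association` computes is not documented here. As background, one common way to quantify association between two categorical columns is Cramér's V, sketched below with pandas and NumPy. This is a swapped-in illustration (the uncorrected form of the statistic), not a description of polar's internals; the `cramers_v` helper and the toy columns are hypothetical.

```python
import numpy as np
import pandas as pd


def cramers_v(x, y):
    """Uncorrected Cramér's V between two categorical Series (0 to 1)."""
    observed = pd.crosstab(x, y).to_numpy(dtype=float)
    n = observed.sum()
    # Expected counts under independence: outer product of the marginals / n.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / n
    chi2 = ((observed - expected) ** 2 / expected).sum()
    min_dim = min(observed.shape) - 1
    if min_dim == 0:
        return 0.0  # a constant column carries no association
    return float(np.sqrt(chi2 / (n * min_dim)))


toy = pd.DataFrame({
    "housing": ["own", "own", "rent", "rent", "own", "rent"],
    "target": ["good", "good", "bad", "bad", "good", "bad"],
    "job": ["a", "b", "a", "b", "a", "b"],
})
print(cramers_v(toy["housing"], toy["target"]))  # perfectly associated columns
print(cramers_v(toy["job"], toy["target"]))      # weakly associated columns
```

Unlike a Pearson correlation, this statistic has no sign: it says how strongly two columns are related, not in which direction.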

print(pl.analyze_df(openml_df, 'target',10))

data_df = pl.get_important_features(openml_df,'target')
pl.get_bar(data_df,'bar.png','Importance','Feature_Name')

NLP Example

import nltk
nltk.download('wordnet')
import pandas as pd
import polar as pl
from cryptography.fernet import Fernet

url = "https://raw.githubusercontent.com/pparkitn/imagehost/master/test_real_or_not_from_kaggle.csv"
data_df=pd.read_csv(url)

data_df.drop(columns=['id','keyword','location'], inplace=True)
data_df.head(3)

key = Fernet.generate_key()
data_df['text_encrypted'] =  data_df['text'].apply(pl.encrypt_df,args=(key,))
data_df['text_decrypted'] =  data_df['text_encrypted'].apply(pl.decrypt_df,args=(key,))
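
The internals of `pl.encrypt_df` and `pl.decrypt_df` are not shown here. The following is a minimal sketch of how per-cell helpers of this kind could be written directly on top of `cryptography.fernet.Fernet`, applied with the same `apply(..., args=(key,))` pattern. The names `encrypt_text` and `decrypt_text` and the toy DataFrame are hypothetical.

```python
import pandas as pd
from cryptography.fernet import Fernet


def encrypt_text(text, key):
    """Encrypt one string; returns the Fernet token as a str."""
    return Fernet(key).encrypt(text.encode("utf-8")).decode("ascii")


def decrypt_text(token, key):
    """Reverse encrypt_text; raises InvalidToken if the key does not match."""
    return Fernet(key).decrypt(token.encode("ascii")).decode("utf-8")


key = Fernet.generate_key()
toy = pd.DataFrame({"text": ["forest fire near the town", "all clear today"]})
toy["text_encrypted"] = toy["text"].apply(encrypt_text, args=(key,))
toy["text_decrypted"] = toy["text_encrypted"].apply(decrypt_text, args=(key,))
print(toy[["text", "text_decrypted"]])
```

Fernet is symmetric, so the same key encrypts and decrypts. A key created with `Fernet.generate_key()` has to be stored somewhere safe, because tokens cannot be recovered without it.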

data_df['text_stem'] = data_df['text_decrypted'].apply(pl.nlp_text_process,args=('stem',))
data_df['text_stem_lem'] = data_df['text_stem'].apply(pl.nlp_text_process,args=('lem',))

data_df.head(3)

cluster_df = pl.nlp_cluster(data_df, 'text_stem_lem',  10, 'text_cluster',1.0,1,100,1,'KMeans',(1,2))[0]
cluster_df.groupby(['text_cluster']).count()
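
The positional arguments of `pl.nlp_cluster` are not documented here, so the sketch below does not try to reproduce them. It only illustrates the general technique that the `'KMeans'` and `(1,2)` arguments suggest: vectorize the text with TF-IDF over unigrams and bigrams, then cluster with scikit-learn's `KMeans`. That reading is an assumption, and `cluster_texts` and its parameters are hypothetical names.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_texts(df, text_col, n_clusters, out_col, ngram_range=(1, 2)):
    """Vectorize text with TF-IDF and assign each row a KMeans cluster id."""
    vectorizer = TfidfVectorizer(ngram_range=ngram_range)
    features = vectorizer.fit_transform(df[text_col])
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    result = df.copy()  # leave the caller's DataFrame untouched
    result[out_col] = model.fit_predict(features)
    return result


toy = pd.DataFrame({"text": [
    "cat dog pet",
    "dog cat animal pet",
    "stock market trade",
    "market stock price trade",
]})
clustered = cluster_texts(toy, "text", 2, "text_cluster")
print(clustered.groupby("text_cluster").count())
```

Cluster ids are arbitrary labels: which group receives 0 and which receives 1 can differ between runs unless `random_state` is fixed.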

cluster_df[cluster_df['text_cluster']==9]['text_stem_lem']
