Data Science Library

Project description

polar

polar is a Python module that contains simple-to-use data science functions. It is built on top of SciPy, scikit-learn, seaborn, and pandas.

Installation

If you already have a working installation of NumPy and SciPy, the easiest way to install polar is with pip:

pip install polar seaborn pandas scikit-learn scipy matplotlib numpy nltk -U

Dependencies

polar requires:

  • Python (>= 3.5)
  • NumPy (>= 1.11.0)
  • SciPy (>= 0.17.0)
  • Seaborn (>= 0.9.0)
  • scikit-learn (>= 0.21.3)
  • nltk (>= 3.4.5)
  • python-pptx (>= 0.6.18)
  • cryptography (> 2.8)
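
The minimum versions above can be checked against the current environment before importing polar. The following is a minimal sketch, not part of polar: the helper names (`parse_version`, `satisfies`, `check_requirements`) are made up for illustration, the version parsing is deliberately simple (numeric components only), and `importlib.metadata` assumes Python 3.8 or newer even though polar itself lists Python 3.5 as its minimum.

```python
from importlib.metadata import PackageNotFoundError, version
import operator
import re

# Minimums copied from the dependency list above; cryptography uses a strict ">".
REQUIREMENTS = {
    "numpy": (">=", "1.11.0"),
    "scipy": (">=", "0.17.0"),
    "seaborn": (">=", "0.9.0"),
    "scikit-learn": (">=", "0.21.3"),
    "nltk": (">=", "3.4.5"),
    "python-pptx": (">=", "0.6.18"),
    "cryptography": (">", "2.8"),
}
_OPS = {">=": operator.ge, ">": operator.gt}


def parse_version(text):
    """Turn '1.11.0' into (1, 11, 0); trailing non-numeric parts such as 'rc1' are ignored."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def satisfies(installed, op, required):
    """Compare two version strings, padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return _OPS[op](a, b)


def check_requirements(requirements=REQUIREMENTS):
    """Return {package: problem description} for every unmet requirement."""
    problems = {}
    for name, (op, required) in requirements.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not satisfies(installed, op, required):
            problems[name] = f"found {installed}, need {op} {required}"
    return problems
```

Calling `check_requirements()` returns an empty dictionary when every listed package is installed at an acceptable version; otherwise each key names a package that needs attention.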

Jupyter Notebook Examples

The Jupyter notebook with all the examples described below: Polar-Examples

ACA (Automated Cohort Analysis) Example

The ACA creates three heatmaps for each feature in the data set.

  • Conversion heatmap - conversion per feature value
  • Distribution heatmap - distribution per feature value
  • Size heatmap - total samples per feature value
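
To make the three quantities concrete, here is a small sketch of how they could be computed per cohort and feature value. It is an interpretation, not polar's implementation: it assumes the label is binary (0 or 1), that conversion is the share of positive labels in a cell, and that distribution is the cell's share of its cohort. The function name `cohort_metrics` and the sample rows are invented for illustration, and the code uses only the standard library so it runs without extra dependencies.

```python
from collections import defaultdict


def cohort_metrics(rows, cohort_key, feature_key, label_key):
    """Return {(cohort, feature_value): {"size", "conversion", "distribution"}}.

    Assumptions (not taken from polar's source):
      size         = number of samples in the cell
      conversion   = share of samples in the cell whose label is 1
      distribution = cell size divided by the total size of its cohort
    """
    size = defaultdict(int)
    positives = defaultdict(int)
    cohort_total = defaultdict(int)
    for row in rows:
        cell = (row[cohort_key], row[feature_key])
        size[cell] += 1
        positives[cell] += int(row[label_key])
        cohort_total[row[cohort_key]] += 1

    return {
        cell: {
            "size": n,
            "conversion": positives[cell] / n,
            "distribution": n / cohort_total[cell[0]],
        }
        for cell, n in size.items()
    }


# Tiny made-up data set, for illustration only.
rows = [
    {"date": "2020-01", "plan": "free", "label": 0},
    {"date": "2020-01", "plan": "free", "label": 1},
    {"date": "2020-01", "plan": "paid", "label": 1},
    {"date": "2020-02", "plan": "free", "label": 0},
]
metrics = cohort_metrics(rows, "date", "plan", "label")
```

Each heatmap would then plot one of the three values, with cohorts on one axis and feature values on the other.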

Data File: ACA_date.csv

Final Result Power Point: ACA.pptx

import pandas as pd
import polar as pl
from pptx import Presentation
# Jupyter magic: render plots inline (omit when running as a plain script)
%matplotlib inline

url = "https://raw.githubusercontent.com/pparkitn/imagehost/master/ACA_date.csv"
data_df = pd.read_csv(url)

# Build a slide deck with a title slide and one slide per generated chart
prs = Presentation()
pl.create_title(prs, 'ACA')
for chart in pl.ACA_create_graphs(data_df, 'date', 'label'):
    pl.add_chart_slide(prs, chart[0], chart[1])
pl.save_presentation(prs, filename='ACA')

Output heatmaps: Conversion, Distribution, Samples.

EDA Example

import pandas as pd
import openml  # not in the dependency list above; install it separately
import polar as pl

# Fetch OpenML data set 31 as a DataFrame with its default target column
dataset = openml.datasets.get_dataset(31)
X, y, categorical_indicator, attribute_names = dataset.get_data(
    target=dataset.default_target_attribute, dataset_format='dataframe')

openml_df = pd.DataFrame(X)
openml_df['target'] = y

data_df = pl.analyze_correlation(openml_df,'target')
pl.get_heatmap(data_df,'correlation_heat_map.png',1.1,14,'0.1f',0,100,5,5)

data_df = pl.analyze_association(openml_df,'target',verbose=0)
pl.get_heatmap(data_df,'association_heat_map.png',1.1,12,'0.1f',0,100,10,10)

print(pl.analyze_df(openml_df, 'target',10))

data_df = pl.get_important_features(openml_df,'target')
pl.get_bar(data_df,'bar.png','Importance','Feature_Name')

NLP Example

import nltk
nltk.download('wordnet')
import pandas as pd
import polar as pl
from cryptography.fernet import Fernet

url = "https://raw.githubusercontent.com/pparkitn/imagehost/master/test_real_or_not_from_kaggle.csv"
data_df=pd.read_csv(url)

data_df.drop(columns=['id','keyword','location'], inplace=True)
data_df.head(3)

# Generate a symmetric key, then encrypt and decrypt the text column as a round trip
key = Fernet.generate_key()
data_df['text_encrypted'] = data_df['text'].apply(pl.encrypt_df, args=(key,))
data_df['text_decrypted'] = data_df['text_encrypted'].apply(pl.decrypt_df, args=(key,))

# Run the 'stem' step, then the 'lem' step on the stemmed text
data_df['text_stem'] = data_df['text_decrypted'].apply(pl.nlp_text_process, args=('stem',))
data_df['text_stem_lem'] = data_df['text_stem'].apply(pl.nlp_text_process, args=('lem',))

data_df.head(3)

cluster_df = pl.nlp_cluster(data_df, 'text_stem_lem',  10, 'text_cluster',1.0,1,100,1,'KMeans')
cluster_df.groupby(['text_cluster']).count()

cluster_df[cluster_df['text_cluster']==9]['text_stem_lem']

Download files

Source Distribution

polar-0.0.101.tar.gz (11.6 kB)

Uploaded Source

Built Distribution

polar-0.0.101-py3-none-any.whl (10.0 kB)

Uploaded Python 3

File details

Details for the file polar-0.0.101.tar.gz.

File metadata

  • Download URL: polar-0.0.101.tar.gz
  • Upload date:
  • Size: 11.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.4.2 requests/2.22.0 setuptools/39.1.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.5

File hashes

Hashes for polar-0.0.101.tar.gz
Algorithm    Hash digest
SHA256       3e5cd63996a140567bb566f186353524e90f6633d55ee14a9b7575751ec9a7da
MD5          a0b42db88691ff1b39d5c01729672693
BLAKE2b-256  d506794f7594641ec9d9c5fca67246669170dcd824105d4efbf6c4c5a1c63716
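
A downloaded file can be compared against a published digest before it is installed. The following is a minimal sketch using only the standard library; the function names `sha256_of_file` and `matches_published_hash` are illustrative and are not part of polar or pip.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at end of file
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """True when the file's SHA256 digest equals the published hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("polar-0.0.101.tar.gz", ...)` with the SHA256 value listed above should return True for an unmodified download, assuming the file sits in the current directory.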

File details

Details for the file polar-0.0.101-py3-none-any.whl.

File metadata

  • Download URL: polar-0.0.101-py3-none-any.whl
  • Upload date:
  • Size: 10.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.4.2 requests/2.22.0 setuptools/39.1.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.6.5

File hashes

Hashes for polar-0.0.101-py3-none-any.whl
Algorithm    Hash digest
SHA256       a50e22a9d3df51d30303c483277faf11bd01b7bf139ce099eace3ee3f9ff71f0
MD5          067399cde7154a8fb4f5c60a52aef4fc
BLAKE2b-256  1a386409b71af801192afe9a3a18c2ae4d66d6145696f5d0a3c7757b76515778
