Data Science Library
polar
polar is a Python module that provides simple-to-use data science functions. It is built on top of SciPy, scikit-learn, seaborn, and pandas.
Installation
If you already have a working installation of NumPy and SciPy,
the easiest way to install polar is with pip:
pip install polar seaborn pandas scikit-learn scipy matplotlib numpy nltk -U
Dependencies
polar requires:
- Python (>= 3.5)
- NumPy (>= 1.11.0)
- SciPy (>= 0.17.0)
- Seaborn (>= 0.9.0)
- scikit-learn (>= 0.21.3)
- nltk (>= 3.4.5)
- python-pptx (>= 0.6.18)
- cryptography (> 2.8)
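The minimum versions listed above can be checked against the current environment before running the examples. The following is a minimal sketch, not part of polar: the helper names `parse_version`, `is_older`, and `check_minimums` are hypothetical, the version parsing is deliberately simple (it only looks at leading digits of each dot-separated part), and it relies on `importlib.metadata`, which requires Python 3.8 or newer. The strict `cryptography (> 2.8)` bound is left out for brevity.

```python
from importlib import metadata

# Minimum versions copied from the dependency list above.
MINIMUMS = {
    "numpy": "1.11.0",
    "scipy": "0.17.0",
    "seaborn": "0.9.0",
    "scikit-learn": "0.21.3",
    "nltk": "3.4.5",
    "python-pptx": "0.6.18",
}


def parse_version(text):
    """Turn '1.11.0' into (1, 11, 0).

    Keeps the leading digits of each dot-separated part and stops at the
    first part that has none (for example 'dev0').
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_older(installed, minimum):
    """Return True if `installed` is strictly older than `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.11' and '1.11.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def check_minimums(minimums, get_version=metadata.version):
    """Return {package: problem} for packages that are missing or too old."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if is_older(installed, minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems


if __name__ == "__main__":
    # Prints one line per problem; prints nothing if every minimum is met.
    for name, problem in check_minimums(MINIMUMS).items():
        print(f"{name}: {problem}")
```

The `get_version` parameter exists only so the check can be exercised with a stand-in lookup; in normal use the default `metadata.version` is enough.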
Jupyter Notebook Examples
The Jupyter notebook Polar-Examples contains all the examples described below.
ACA (Automated Cohort Analysis) Example
The ACA creates three heatmaps for each feature in the data set.
- Conversion heatmap - conversion per feature value
- Distribution heatmap - distribution per feature value
- Size heatmap - total samples per feature value
Data File: ACA_date.csv
Final result (PowerPoint): ACA.pptx
import pandas as pd
import polar as pl
from pptx import Presentation

# Jupyter magic; remove this line when running outside a notebook.
%matplotlib inline

url = "https://raw.githubusercontent.com/pparkitn/imagehost/master/ACA_date.csv"
data_df = pd.read_csv(url)

prs = Presentation()
pl.create_title(prs, 'ACA')
# Add one slide per generated chart.
for chart in pl.ACA_create_graphs(data_df, 'date', 'label'):
    pl.add_chart_slide(prs, chart[0], chart[1])
pl.save_presentation(prs, filename='ACA')
Resulting heatmaps (images omitted): Conversion, Distribution, Samples.
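To make the three heatmap quantities concrete, here is a plain-Python sketch of how they could be computed for a single feature. It is not polar's implementation, which the page does not show. It assumes that the cohort column is `date`, that `label` is a binary 0/1 outcome, that "conversion" means the share of samples with `label == 1`, and that "distribution" means a feature value's share of its cohort. The function name `cohort_tables` and the `plan` feature are hypothetical.

```python
from collections import defaultdict


def cohort_tables(rows, cohort_key, feature_key, label_key):
    """Compute size, distribution, and conversion per (cohort, feature value).

    `rows` is an iterable of dicts; the label is assumed to be 0 or 1.
    """
    size = defaultdict(int)          # (cohort, value) -> number of samples
    positives = defaultdict(int)     # (cohort, value) -> samples with label 1
    cohort_total = defaultdict(int)  # cohort -> number of samples

    for row in rows:
        cell = (row[cohort_key], row[feature_key])
        size[cell] += 1
        positives[cell] += int(row[label_key])
        cohort_total[row[cohort_key]] += 1

    # Conversion: share of positive labels within each cell.
    conversion = {cell: positives[cell] / n for cell, n in size.items()}
    # Distribution: share of the cohort's samples that fall in each cell.
    distribution = {cell: n / cohort_total[cell[0]] for cell, n in size.items()}
    return dict(size), distribution, conversion


# Tiny made-up data set, for illustration only.
rows = [
    {"date": "2020-01", "plan": "free", "label": 0},
    {"date": "2020-01", "plan": "free", "label": 1},
    {"date": "2020-01", "plan": "paid", "label": 1},
    {"date": "2020-02", "plan": "free", "label": 0},
]
size, distribution, conversion = cohort_tables(rows, "date", "plan", "label")
print(size[("2020-01", "free")])        # 2
print(conversion[("2020-01", "free")])  # 0.5
```

Each of the three dictionaries maps a (cohort, feature value) pair to one number, which is the shape a heatmap needs: cohorts on one axis, feature values on the other.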
EDA Example
import pandas as pd
import openml
import polar as pl
dataset = openml.datasets.get_dataset(31)
X, y, categorical_indicator, attribute_names = dataset.get_data(
    target=dataset.default_target_attribute, dataset_format='dataframe')
openml_df = pd.DataFrame(X)
openml_df['target'] = y
data_df = pl.analyze_correlation(openml_df,'target')
pl.get_heatmap(data_df,'correlation_heat_map.png',1.1,14,'0.1f',0,100,5,5)
data_df = pl.analyze_association(openml_df,'target',verbose=0)
pl.get_heatmap(data_df,'association_heat_map.png',1.1,12,'0.1f',0,100,10,10)
print(pl.analyze_df(openml_df, 'target',10))
data_df = pl.get_important_features(openml_df,'target')
pl.get_bar(data_df,'bar.png','Importance','Feature_Name')
NLP Example
import nltk
nltk.download('wordnet')
import pandas as pd
import polar as pl
from cryptography.fernet import Fernet
url = "https://raw.githubusercontent.com/pparkitn/imagehost/master/test_real_or_not_from_kaggle.csv"
data_df=pd.read_csv(url)
data_df.drop(columns=['id','keyword','location'], inplace=True)
data_df.head(3)
key = Fernet.generate_key()
data_df['text_encrypted'] = data_df['text'].apply(pl.encrypt_df,args=(key,))
data_df['text_decrypted'] = data_df['text_encrypted'].apply(pl.decrypt_df,args=(key,))
data_df['text_stem'] = data_df['text_decrypted'].apply(pl.nlp_text_process,args=('stem',))
data_df['text_stem_lem'] = data_df['text_stem'].apply(pl.nlp_text_process,args=('lem',))
data_df.head(3)
cluster_df = pl.nlp_cluster(data_df, 'text_stem_lem', 10, 'text_cluster',1.0,1,100,1,'KMeans',(1,2))[0]
cluster_df.groupby(['text_cluster']).count()
cluster_df[cluster_df['text_cluster']==9]['text_stem_lem']