Data Science Utilities
Data science utilities in Python.
Free software: MIT license
Documentation: https://data-science-utilities.readthedocs.io.
Features
Missing data statistics
>>> from data_science_utilities import data_science_utilities
>>> # compute the missing-data statistics
>>> missing_data = data_science_utilities.missing_data_stats(df)
>>> # display the statistics
>>> missing_data
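The README does not show what `missing_data_stats` returns. As a rough sketch of the kind of summary such a helper could produce, assuming the input is a pandas DataFrame, the hypothetical function `missing_data_summary` below counts the missing values in each column and their share of all rows. It is illustrative only and is not the package's implementation.

```python
import numpy as np
import pandas as pd


def missing_data_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: per-column count and percentage of missing values."""
    total = df.isnull().sum()
    # Guard against division by zero on an empty frame.
    percent = total / max(len(df), 1) * 100
    summary = pd.DataFrame({"total": total, "percent": percent})
    # List the most incomplete columns first.
    return summary.sort_values("total", ascending=False)


# Small made-up frame, used only to exercise the sketch.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, 3.0, np.nan],
        "b": ["x", "y", None, "z"],
        "c": [1, 2, 3, 4],
    }
)
summary = missing_data_summary(df)
print(summary)
```

Sorting by the missing count puts the columns that most need imputation or removal at the top of the table.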
Read CSV files from path
>>> from data_science_utilities import data_science_utilities
>>> train_path = '../data/raw/train.csv'
>>> test_path = '../data/raw/test.csv'
>>> X_train, X_test = data_science_utilities.read_csv_files(train_path, test_path)
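For readers who want to see the equivalent steps in plain pandas, here is a minimal sketch. The helper name `read_train_test` is hypothetical, and the temporary files exist only so the example runs on its own; whether `read_csv_files` does anything beyond calling `pandas.read_csv` on each path is not stated in the README.

```python
import os
import tempfile

import pandas as pd


def read_train_test(train_path, test_path):
    """Hypothetical helper: load two CSV files and return them as DataFrames."""
    return pd.read_csv(train_path), pd.read_csv(test_path)


# Write two tiny CSV files to a temporary directory, then read them back.
with tempfile.TemporaryDirectory() as tmp:
    train_path = os.path.join(tmp, "train.csv")
    test_path = os.path.join(tmp, "test.csv")
    pd.DataFrame({"x": [1, 2, 3], "y": [0, 1, 0]}).to_csv(train_path, index=False)
    pd.DataFrame({"x": [4, 5]}).to_csv(test_path, index=False)
    X_train, X_test = read_train_test(train_path, test_path)

print(X_train.shape, X_test.shape)
```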
Plotting a normal distribution
>>> from data_science_utilities import data_science_utilities
>>> data_science_utilities.plot_dist_norm(dist, 'distribution normal')
Plotting the correlation matrix
>>> from data_science_utilities import data_science_utilities
>>> data_science_utilities.plot_corelation_matrix(data)
Plotting the correlation matrix of the top attributes
>>> from data_science_utilities import data_science_utilities
>>> data_science_utilities.plot_top_corelation_matrix(data, target, k=10, cmap='YlGnBu')
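The call above suggests that the `k` attributes most correlated with `target` are selected before plotting. How the package makes that selection is not documented here, so the following is only a sketch under stated assumptions: the hypothetical helper `top_correlated_columns` ranks numeric columns by the absolute value of their Pearson correlation with the target and returns the reduced correlation matrix, leaving the heatmap drawing out.

```python
import numpy as np
import pandas as pd


def top_correlated_columns(data: pd.DataFrame, target: str, k: int = 10) -> pd.DataFrame:
    """Hypothetical helper: correlation matrix restricted to the k columns
    most strongly correlated (in absolute value) with `target`.
    The target column itself is always among them."""
    corr = data.select_dtypes(include="number").corr()
    cols = corr[target].abs().nlargest(k).index
    return corr.loc[cols, cols]


# Made-up data: `x` and `neg` are perfectly (anti-)correlated with the target,
# while `noise` is only weakly related to it.
x = np.arange(10.0)
data = pd.DataFrame(
    {
        "target": 2 * x + 1,
        "x": x,
        "neg": -x,
        "noise": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3],
    }
)
top = top_correlated_columns(data, "target", k=3)
print(top.round(2))
```

Ranking by absolute value keeps strongly negative correlations in the selection; ranking by the signed value instead would drop them.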
Plotting attributes as a scatter chart
>>> from data_science_utilities import data_science_utilities
>>> data_science_utilities.plot_scatter(data, column_name, target)
Plotting attributes as a box plot
>>> from data_science_utilities import data_science_utilities
>>> data_science_utilities.plot_box(data, column_name, target)
Plotting category by box bar
>>> from data_science_utilities import data_science_utilities
>>> data_science_utilities.plot_category_columns(data, limit_bars=10)
Generating a simple plot of the test and training learning curves
>>> import numpy as np
>>> from data_science_utilities import data_science_utilities
>>> data_science_utilities.plot_learning_curve(estimator, title, X, y, ylim=None, cv=None, train_sizes=np.linspace(.1, 1.0, 5))
Credits
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
History
0.2.0 (2018-05-14)
Adds visualization utilities.
0.1.0 (2018-05-11)
First release on PyPI.
Download files
Hashes for data_science_utilities-0.2.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 1fb0eb6803600e94126ad93958b4cf269a175304b5e228e5700e619a2ae844c7
MD5 | 14b9a22a5c9944b73dae871b23ebfddd
BLAKE2b-256 | 68aa0bbbaf32d00d2b984e2a6aaa7a4c7ac1eab6fa8b70c133f53094c80f56dc
Hashes for data_science_utilities-0.2.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | e194ec0f49852419f0cc9e70ccc1e43d01b485b23cd1fa8ee2e214d799fcf242
MD5 | d9c2df41c73fe24c31f482a8d2c6bd92
BLAKE2b-256 | dbe99904d808f3e19d597f374d09d37ae6c7c5ff6637da95cf1581edd6419e6b
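A downloaded file can be checked against the digests listed above with Python's standard `hashlib` module. The sketch below uses a hypothetical helper, `file_digests`, that computes all three digests in one pass; the throwaway file it hashes is there only so the example runs without the real archive.

```python
import hashlib
import os
import tempfile


def file_digests(path, chunk_size=65536):
    """Hypothetical helper: SHA256, MD5 and BLAKE2b-256 hex digests of a file."""
    hashers = {
        "SHA256": hashlib.sha256(),
        "MD5": hashlib.md5(),
        # BLAKE2b-256 is BLAKE2b with a 32-byte (256-bit) digest.
        "BLAKE2b-256": hashlib.blake2b(digest_size=32),
    }
    with open(path, "rb") as fh:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# Demonstration on a throwaway file; for a real check, pass the path of the
# downloaded archive and compare each value with the digests listed above.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")
    with open(sample, "wb") as fh:
        fh.write(b"hello")
    digests = file_digests(sample)

for name, value in digests.items():
    print(name, value)
```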