A collection of code assembled to help streamline data analysis.
===================================================================
----------------------- larkinlab 0.0.20 ------------------------
===================================================================
This library contains the functions I have created or come across that I find myself using often.
I will be adding things as I see fit, so be sure to update to the latest version.
Check the CHANGELOG for release info.
======== In The Future ========
- v0.1 in the works
========================================================================================
------------------------- Code Descriptions ------------------------------------------
========================================================================================
----- to install/update ------
pip3 install larkinlab
pip3 install --upgrade larkinlab
--------- Subpackages --------
larkinlab.explore
larkinlab.machinelearning
--------------------------------
========================= ll.explore =============================
This subpackage is built for exploring data. It contains functions that help you quickly get an understanding of the data at hand.
Import
> from larkinlab import explore as llex
> import larkinlab.explore as llex
Dependencies
> pandas
> numpy
> matplotlib.pyplot
> seaborn
--------------------------------
-- functions --
--------------------------------
-------------------------------------
* llex.df_ex(df, head_val) *
The df_ex (dataframe explore) function takes a dataframe and returns a few basic things:
- The number of rows, columns, and total data points
- The names of the columns, limited to the first 60 if more than 60 exist
- A preview of up to the first n rows of the dataframe via the df.head method, where n is set by the head_val parameter
Parameter Default Values
> df :: pandas DataFrame
> head_val =5 :: sets the number of rows to display in the dataframe preview. Works via the pandas .head method. Set to 'all' for all rows
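
A minimal sketch of the kind of summary described above, written in plain pandas. This is not larkinlab's source code: summarize_frame and max_cols are hypothetical names chosen for illustration, and the return format is an assumption (the README does not say whether df_ex prints or returns its output).

```python
import pandas as pd


def summarize_frame(df, head_val=5, max_cols=60):
    """Hypothetical stand-in for the summary that df_ex is described as giving."""
    rows, cols = df.shape
    summary = {
        "rows": rows,
        "columns": cols,
        "total_data_points": df.size,  # rows * columns
        # Keep only the first max_cols column names.
        "column_names": list(df.columns[:max_cols]),
    }
    # 'all' previews every row; otherwise defer to DataFrame.head.
    preview = df if head_val == "all" else df.head(head_val)
    return summary, preview


df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
summary, preview = summarize_frame(df, head_val=2)
print(summary)
print(preview)
```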
-------------------------------------
* llex.vcount_ex(df, print_count) *
The vcount_ex function returns the value counts and normalized value counts for all of the columns in the dataframe passed through it.
Parameter Default Values
> df :: pandas DataFrame
> print_count =5 :: sets the number of value counts to print for each column. Set to 'all' to print all of them, for example llex.vcount_ex(df, print_count='all')
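
As a rough illustration of what value counts and normalized value counts look like per column, here is a sketch in plain pandas. It is not the library's implementation: value_count_tables and the "count"/"share" column labels are hypothetical, and only the print_count default of 5 and the 'all' option are taken from the description above.

```python
import pandas as pd


def value_count_tables(df, print_count=5):
    """Hypothetical helper: one small table of counts and shares per column."""
    tables = {}
    for col in df.columns:
        counts = df[col].value_counts()                 # occurrences of each value
        shares = df[col].value_counts(normalize=True)   # same, as a fraction of the total
        table = counts.to_frame(name="count")
        table["share"] = shares                         # aligned on the value index
        tables[col] = table if print_count == "all" else table.head(print_count)
    return tables


df = pd.DataFrame({
    "color": ["red", "red", "blue", "red"],
    "size": ["S", "M", "M", "M"],
})
tables = value_count_tables(df)
for col, table in tables.items():
    print(f"--- {col} ---")
    print(table)
```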
-------------------------------------
* llex.missing_ex(df) *
The missing_ex function prints the number of missing values in each column of the dataframe passed through it.
Parameter Default Values
> df :: pandas DataFrame
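
For reference, counting missing values per column can be done in plain pandas as sketched below. This only shows the underlying idea; whether missing_ex uses isna internally, and how it formats its printout, are assumptions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", None, None],
    "c": [1, 2, 3],
})

# isna() marks missing cells as True; summing the booleans counts them per column.
missing = df.isna().sum()
for col, n_missing in missing.items():
    print(f"{col}: {n_missing}")
```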
-------------------------------------
* llex.scat_ex(df) *
The scat_ex function returns a scatterplot representing the value counts and their respective occurrences for each column in the dataframe passed through it.
Parameter Default Values
> df :: pandas DataFrame
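
One plausible reading of the description above is a plot with each distinct value on the x axis and its number of occurrences on the y axis. The sketch below only builds those points in plain pandas and leaves the plotting out. value_count_points is a hypothetical name, and the exact axes scat_ex uses are an assumption.

```python
import pandas as pd


def value_count_points(df):
    """Hypothetical helper: per column, the (x, y) points a value-count scatter would use."""
    points = {}
    for col in df.columns:
        counts = df[col].value_counts()
        x = counts.index.astype(str).tolist()  # distinct values, as text labels
        y = counts.tolist()                    # how often each value occurs
        points[col] = (x, y)
    return points


df = pd.DataFrame({
    "grade": ["A", "B", "A", "C", "A"],
    "passed": [True, True, True, False, True],
})
points = value_count_points(df)
for col, (x, y) in points.items():
    print(col, list(zip(x, y)))
```

Each (x, y) pair could then be handed to matplotlib.pyplot.scatter to draw one plot per column.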
-------------------------------------
* llex.corr_ex(df, min_corr, min_count, fig_size, colors) *
The corr_ex function returns either a table of Pearson correlation values together with a heatmap of those values, or only the heatmap, for all of the columns in the dataframe passed through it.
Parameter Default Values
> df :: pandas DataFrame
> min_corr =0.2 :: minimum correlation value to appear on heatmap
> min_count =1 :: minimum number of observations required per pair of columns to have a valid result (the min_periods argument of pandas DataFrame.corr)
> fig_size =(8, 10) :: heatmap size, 2 numbers
> colors ='Reds' :: color of the heatmap. The heatmap comes from seaborn, so it uses their color codes
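
The correlation table behind such a heatmap can be sketched in plain pandas as follows. This is not the library's code: filtered_corr is a hypothetical name, and applying min_corr to the absolute value of the correlation is an assumption, since the description does not say how negative correlations are treated. The plotting step is left out.

```python
import pandas as pd


def filtered_corr(df, min_corr=0.2, min_count=1):
    """Hypothetical helper: Pearson correlations with weak ones blanked out."""
    numeric = df.select_dtypes(include="number")
    # min_periods is the pandas argument the min_count parameter is said to map to.
    corr = numeric.corr(method="pearson", min_periods=min_count)
    # Assumption: compare the absolute value, so strong negative correlations are kept.
    return corr.where(corr.abs() >= min_corr)


df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],   # perfectly correlated with x
    "w": [1, -1, 0, -1, 1],  # uncorrelated with x
})
result = filtered_corr(df, min_corr=0.2)
print(result)
```

Cells below the threshold come back as NaN. A table like this is the kind of input seaborn.heatmap accepts, with the colors value passed as its cmap argument.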
-------------------------------------
* llex.help(desc=False) *
A function to list all of the functions in the subpackage, with their descriptions available through an optional argument
Parameter Default Values
> desc =False :: Description. A True value will list each function along with its description and parameters
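
A listing function of this kind can be sketched with a dictionary of names and descriptions, as below. FUNCTION_INFO and list_functions are hypothetical names, the description strings are paraphrased from this README, and the real help function may store and print its information differently.

```python
# Hypothetical registry: function name -> short description.
FUNCTION_INFO = {
    "df_ex": "Shape, column names, and a preview of a dataframe.",
    "vcount_ex": "Value counts and normalized value counts per column.",
    "missing_ex": "Number of missing values per column.",
}


def list_functions(desc=False):
    """Return function names, with descriptions appended when desc is True."""
    lines = []
    for name, description in FUNCTION_INFO.items():
        lines.append(f"{name}: {description}" if desc else name)
    return lines


print("\n".join(list_functions()))
print("\n".join(list_functions(desc=True)))
```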
-------------------------------------
* *
-------------------------------------
========================= ll.machinelearning =============================
This package contains streamlined machine learning models and evaluation tools
Import
> from larkinlab import machinelearning as llml
> import larkinlab.machinelearning as llml
Dependencies
> pandas
> numpy
> matplotlib.pyplot
--------------------------------
-- functions --
--------------------------------
-------------------------------------
* *
-------------------------------------
* *
-------------------------------------
* *
-------------------------------------
=========================================================================================================================
-------------------------------------------------------------------------------------------------------------------------
=========================================================================================================================
Created By: Conor E. Larkin
email: conor.larkin16@gmail.com
GitHub: github.com/clarkin16
LinkedIn: linkedin.com/in/clarkin16
Thanks for checking this out!
_______________________________________________________________
====================================
----------- CHANGE LOG -----------
====================================
------ Latest Release: 0.0.20 -----
====================================
( Current Version )
0.0.20 (9/30/2021)
----------------------
- fixed explore module issues
====================================
OLD RELEASES
====================================
0.0.19 (9/30/2021)
----------------------
- another quick fix to a file name causing errors
0.0.18 (9/30/2021)
----------------------
- quick fix to a file name causing errors
0.0.17 (9/30/2021)
----------------------
- changelog formatting
- function name changes: dframe_ex to df_ex, func_list to help
- readme changes
- added an "_help" variable to quickly print an individual function's readme section in a pinch, removed previous desc and params approach
- v0.1 coming
0.0.16 (11/2/2020)
----------------------
- fixed typo in explore's explore_info_list
0.0.15 (11/2/2020)
----------------------
- added func_list() to explore subpackage, with desc arg (defaulted to False)
- added function_desc and function_params lists to code, as well as a dictionary with all functions and descriptions
0.0.14 (11/2/2020)
----------------------
- changed long description content type to text/plain instead of text/markdown.
- fixed code issue in llex.vcount_ex() function
- readme style changes
0.0.13 (11/2/2020)
----------------------
- readme updates
- added print_count param to llex.vcount_ex() function
- added head_val and max_col param to .explore's dframe_ex() function. Default max columns printed is now 50
0.0.12 (10/29/2020)
----------------------
- fixed error in .explore's missing_ex() function's code
- updated .explore corr_ex() function to include min_count arg
- changed .explore.corr_ex() arg hm_only to map_only, which takes True or False
0.0.11 (10/29/2020)
----------------------
- changed "install_requires" values in setup.py
0.0.10 (10/29/2020)
----------------------
- fixed an error in corr_ex() function's code
0.0.9 (10/29/2020)
----------------------
- readme improvements
- added function missing_ex() to .explore
- added function corr_ex() to .explore
- .explore added seaborn dependency
- description change
0.0.8 (10/29/2020)
----------------------
- readme improved
- changed description
0.0.7 (10/29/2020)
----------------------
- updated name to larkinlab from clarklib
- added 2 subpackages: explore, machinelearning
- changed explore.frame_ex to explore.dframe_ex
- deleted clarklib (v0.0.0 - v0.0.6) from pypi, v0.0.7 and onward will be known as larkinlab
0.0.6 (10/29/2020)
----------------------
- Changed README to larkinlab format, with subpackages.
- In The Future section
- commented out long_description in setup.py
- changed check_df() to frame_ex()
- changed vcount_examine() to vcount_ex()
- changed scat_examine() to scat_ex()
0.0.5 (10/28/2020)
----------------------
- Changed the changelog to be in descending chronological order
- Changed description in setup.py
- updated the readme to contain details on using the functions and contact info
0.0.4 (10/28/2020)
----------------------
- Changed check_df() function to only display up to 60 column names.
- Changed check_df() to print "Rows:", "Columns:", and "Total Data Points:" instead of just print(df.shape, df.size)
0.0.3 (10/28/2020)
----------------------
- Added the 'import' section to code in clarklib init file. Works now!
0.0.2 (10/27/2020)
----------------------
- Moved init file into folder
0.0.1 (10/27/2020)
----------------------
- First release
- Added 3 functions: check_df(), vcount_examine(), scat_examine()