skestimate

fit estimate utility

Project description

skestimate package

This package fits a multiclass classifier and reports various metrics for it. It is built on the scikit-learn library and uses that library's methods.

The fit_est module in the package defines a class named Xest, constructed as Xest(estimator, data, target_label, ts).

estimator: A pipelined classifier that encodes all categorical features.
data: The raw dataset, including the target label.
target_label: A string naming the target label.
ts: A number between 0 and 1 that specifies the portion of the data held out for testing.
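
To make the roles of data, target_label, and ts concrete, the sketch below splits a small DataFrame into training and test portions. It is an assumption that Xest relies on scikit-learn's train_test_split internally; the toy frame and the variable names are illustrative only.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for `data`: feature columns plus the target column.
data = pd.DataFrame({
    "elevation": [2596, 2590, 2804, 2785, 2595, 2579, 2606, 2605, 2617, 2612],
    "slope": [3, 2, 9, 18, 2, 6, 7, 4, 9, 10],
    "Cover_Type": [5, 5, 2, 2, 5, 2, 5, 5, 5, 5],
})
target_label = "Cover_Type"
ts = 0.2  # fraction of rows held out for testing

# Separate the features from the target, then hold out `ts` of the rows.
X = data.drop(columns=[target_label])
y = data[target_label]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=ts, random_state=42
)

print(len(X_train), len(X_test))
```

With ten rows and ts = 0.2, eight rows go to training and two to testing.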

Below is an example of how to use the class on the CoverType dataset from the UCI repository.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

import skestimate

data = pd.read_csv("https://github.com/skhabiri/PredictiveModeling-CoverType-u2build/blob/master/data/train.csv?raw=true")
# Parameters not listed here keep their scikit-learn defaults.
rfc = make_pipeline(
    RandomForestClassifier(criterion='entropy', max_depth=14, max_features=20,
                           min_samples_leaf=2, min_samples_split=10,
                           n_estimators=10, n_jobs=-1, random_state=42)
)
# Hold out 20% of the rows for testing.
xest = skestimate.Xest(rfc, data, "Cover_Type", 0.2)
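
The pipeline above contains only the classifier. For data with string-valued columns, a pipelined classifier that encodes categorical features might look like the following sketch. The column names, the toy data, and the choice of OrdinalEncoder are assumptions for illustration; the package does not prescribe a particular encoder.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical data with one string-valued (categorical) column.
X = pd.DataFrame({
    "soil": ["clay", "sand", "clay", "loam", "sand", "loam"],
    "elevation": [2596, 2590, 2804, 2785, 2595, 2579],
})
y = [1, 2, 1, 3, 2, 3]

# Encode the categorical column as integers; pass numeric columns through.
# Categories unseen during fitting are mapped to -1 at prediction time.
encoder = make_column_transformer(
    (OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1),
     ["soil"]),
    remainder="passthrough",
)
clf = make_pipeline(
    encoder,
    RandomForestClassifier(n_estimators=10, random_state=42),
)
clf.fit(X, y)
print(clf.predict(X.head(2)))
```

A pipeline built this way can be passed as the estimator argument, since encoding and classification then happen in a single fit call.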

For local testing, use the example() function in fit_est.py.

>>> import skestimate
>>> myest = skestimate.example()
>>> myest.xskew(0.9)
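
The call above asks for columns that are heavily skewed. The source does not define how skewness is measured; one reading consistent with imb lying between 0 and 1 is the share of rows taken by a column's most frequent value. The sketch below computes that share, along with per-column unique counts, using plain pandas. Both helper names are hypothetical and are not part of skestimate.

```python
import pandas as pd


def unique_counts(df: pd.DataFrame) -> pd.Series:
    """Number of distinct values in each column (hypothetical helper)."""
    return df.nunique()


def dominant_share(df: pd.DataFrame, imb: float = 0.99) -> pd.Series:
    """Columns whose most frequent value covers more than `imb` of the rows.

    Assumption: "skewness" is read as the top value's share of rows.
    """
    # value_counts sorts by frequency, so the first entry is the largest share.
    share = df.apply(lambda col: col.value_counts(normalize=True).iloc[0])
    return share[share > imb].sort_values(ascending=False)


df = pd.DataFrame({
    "soil_type7": [0] * 19 + [1],   # 95% zeros
    "soil_type15": [0] * 20,        # constant column
    "slope": list(range(20)),       # every value distinct
})

print(unique_counts(df))
print(dominant_share(df, imb=0.9))
```

Under this reading, a report like this flags near-constant columns that carry little information for the classifier.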

Methods available on the Xest class:

xunique(): Reports the counts of unique values in each column of data.
xskew(imb=0.99): Returns a sorted pandas Series of the columns whose skewness exceeds imb, where imb is between 0 and 1.
xfit(): Fits the pipeline estimator and returns the fitted estimator, the training score, and the test score.
xscore(fit=True): Calculates the accuracy, recall, and precision of the classifier and plots the confusion matrix.
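
For readers who want to see what the fit and score steps amount to in plain scikit-learn, here is a minimal sketch on synthetic data. It does not call skestimate; the per-class averaging (average=None) and the synthetic dataset are assumptions, and the confusion matrix is printed rather than plotted.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic three-class data stands in for a real dataset.
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           n_classes=3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fit step: fitted estimator plus training and test scores.
clf = make_pipeline(RandomForestClassifier(n_estimators=10, random_state=42))
clf.fit(X_train, y_train)
train_score = clf.score(X_train, y_train)
test_score = clf.score(X_test, y_test)

# Score step: accuracy, per-class recall and precision, confusion matrix.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred, average=None)
precision = precision_score(y_test, y_pred, average=None, zero_division=0)
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted

print(train_score, test_score, accuracy)
print(recall, precision)
print(cm)
```

For a classifier, score() returns mean accuracy, so test_score and accuracy coincide here; the per-class recall and precision show which classes drive the errors.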

Release history

0.0.3: Jan 22, 2021
0.0.2: Jan 18, 2021
0.0.1: Jan 18, 2021 (this version)

Download files

Source Distribution

skestimate-0.0.1.tar.gz (5.4 kB)

Uploaded Jan 18, 2021 Source

Built Distribution

skestimate-0.0.1-py3-none-any.whl (5.8 kB)

Uploaded Jan 18, 2021 Python 3

Hashes for skestimate-0.0.1.tar.gz

Algorithm	Hash digest
SHA256	`ad919fcecc5c23e0314c945dc43e4f6112891316cec9bda6d625cd109f2c16a4`
MD5	`4a541c6f439d2578ed197555b2a3a631`
BLAKE2b-256	`0cd92c0b27e2e6fa396e636565f72617ff6f85daa4e491859d4c054969c311d1`

Hashes for skestimate-0.0.1-py3-none-any.whl

Algorithm	Hash digest
SHA256	`8648a08a50ea708ec3f33a0477674c315b308e85e9d440137b2b504eb791c703`
MD5	`2e3afb85c621a51d7053a5a5dc9a864e`
BLAKE2b-256	`340a275f50a05721462c88ea7e6a47ea360dfe8585610da1a099264e4578cefd`