Data Mining Utils

dm_utils is a utility library for data mining.

Installation

pip install dm_utils

Usage

  • dm_utils.hom : hold-out method
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from dm_utils.hom import HOM

x, y = load_iris(return_X_y=True, as_frame=True)
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2, random_state=42)
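# classification task with one xgboost model and one lightgbm model
hom = HOM(task='cls', model=['xgb', 'lgb'])
hom.fit(xtrain, ytrain, record_time=True)
# iris has three classes, so take the most probable class per row
# (thresholding at 0.5 first would fall back to class 0 whenever no class exceeds 0.5)
ypred = hom.predict(xtest).argmax(axis=1)
print(accuracy_score(ytest, ypred))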
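
For readers unfamiliar with the hold-out method, the idea can be sketched with plain scikit-learn. This is only an illustration of the general technique, not the internals of `HOM`: the helper name `holdout_fit_predict`, the `valid_size` parameter, and the choice of estimators are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def holdout_fit_predict(models, xtrain, ytrain, xtest, valid_size=0.2, seed=42):
    """Fit each model on a training split, score it on the held-out split,
    and return (validation scores, test probabilities averaged over models)."""
    xtr, xval, ytr, yval = train_test_split(
        xtrain, ytrain, test_size=valid_size, random_state=seed, stratify=ytrain
    )
    scores, test_probas = [], []
    for model in models:
        model.fit(xtr, ytr)
        # The held-out split is never used for fitting, only for scoring.
        scores.append(accuracy_score(yval, model.predict(xval)))
        test_probas.append(model.predict_proba(xtest))
    return scores, np.mean(test_probas, axis=0)


x, y = load_iris(return_X_y=True, as_frame=True)
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2, random_state=42)
# Hypothetical stand-ins for the 'xgb' and 'lgb' models in the example above.
models = [
    GradientBoostingClassifier(random_state=0),
    RandomForestClassifier(n_estimators=100, random_state=0),
]
val_scores, proba = holdout_fit_predict(models, xtrain, ytrain, xtest)
ypred = proba.argmax(axis=1)
print(val_scores, accuracy_score(ytest, ypred))
```

The validation scores give an estimate of each model's quality before the test set is touched, which is the main reason to hold data out.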
  • dm_utils.oof : out of fold prediction
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from dm_utils.oof import OOF

x, y = load_breast_cancer(return_X_y=True, as_frame=True)
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2, random_state=42)
# classification task; 5-fold oof with two xgboost, two lightgbm and one catboost model
oof = OOF(task='cls', model=['xgb', 'xgb', 'lgb', 'lgb', 'cb'])
oof.fit(xtrain, ytrain, record_time=True)
# binary target: threshold the predicted probability at 0.5
ypred = oof.predict(xtest) > 0.5
print(accuracy_score(ytest, ypred))
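
Out-of-fold prediction can likewise be sketched with scikit-learn alone. The sketch below assumes one model per fold, which is one reading of the five-model, 5-fold example above; it is not taken from the `OOF` source. The helper name `oof_fit_predict` and the estimators are hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold, train_test_split


def oof_fit_predict(models, xtrain, ytrain, xtest, seed=42):
    """Train one model per fold. Each training row is predicted by the model
    that did not see it; test predictions are averaged over all folds."""
    xtrain, ytrain, xtest = np.asarray(xtrain), np.asarray(ytrain), np.asarray(xtest)
    n_splits = len(models)
    oof_pred = np.zeros(len(xtrain))
    test_pred = np.zeros(len(xtest))
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for model, (train_idx, valid_idx) in zip(models, folds.split(xtrain, ytrain)):
        fold_model = clone(model).fit(xtrain[train_idx], ytrain[train_idx])
        # Probability of the positive class for the rows held out in this fold.
        oof_pred[valid_idx] = fold_model.predict_proba(xtrain[valid_idx])[:, 1]
        test_pred += fold_model.predict_proba(xtest)[:, 1] / n_splits
    return oof_pred, test_pred


x, y = load_breast_cancer(return_X_y=True, as_frame=True)
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2, random_state=42)
# Hypothetical stand-ins for the five boosted models in the example above.
models = [
    GradientBoostingClassifier(n_estimators=50, random_state=0),
    GradientBoostingClassifier(n_estimators=50, random_state=1),
    RandomForestClassifier(n_estimators=50, random_state=0),
    RandomForestClassifier(n_estimators=50, random_state=1),
    RandomForestClassifier(n_estimators=50, random_state=2),
]
oof_pred, test_pred = oof_fit_predict(models, xtrain, ytrain, xtest)
print(accuracy_score(ytrain, oof_pred > 0.5), accuracy_score(ytest, test_pred > 0.5))
```

Because every entry of `oof_pred` comes from a model that never saw that row, the out-of-fold accuracy is an honest estimate on the training set, and `oof_pred` can be reused as a feature for stacking.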

Features

Supported libraries: scikit-learn, xgboost, lightgbm, catboost, ngboost, and pytorch-tabnet.

Download files

Source Distribution

dm_utils-0.1.1.tar.gz (18.7 kB)

Uploaded Source

Built Distribution

dm_utils-0.1.1-py3-none-any.whl (24.9 kB)

Uploaded Python 3

File details

Details for the file dm_utils-0.1.1.tar.gz.

File metadata

  • Download URL: dm_utils-0.1.1.tar.gz
  • Upload date:
  • Size: 18.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.13

File hashes

Hashes for dm_utils-0.1.1.tar.gz
Algorithm    Hash digest
SHA256       2c9744519ef90745bef28d11b64cc4e25c3ea0be729b1e99c906eddf7f129e79
MD5          c0be482e9b9c750d47473582d01d2552
BLAKE2b-256  2642e0eb2cb15a61705e72a6fec8150e9f3c256772633a44187a530873f34367
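
A downloaded archive can be checked against the published SHA256 digest with the Python standard library. This is a minimal sketch: the helper names `sha256_of_file` and `matches_published_hash` are made up for illustration, and the archive is assumed to sit in the current directory.

```python
import hashlib
import os

# SHA256 digest copied from the table above for dm_utils-0.1.1.tar.gz.
PUBLISHED_SHA256 = "2c9744519ef90745bef28d11b64cc4e25c3ea0be729b1e99c906eddf7f129e79"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    archive = "dm_utils-0.1.1.tar.gz"  # assumed location of the download
    if os.path.exists(archive):
        print(matches_published_hash(archive, PUBLISHED_SHA256))
    else:
        print(f"{archive} not found; download it first")
```

Alternatively, pip can enforce this itself when the requirement is pinned with a `--hash=sha256:...` option in a requirements file and installed with `--require-hashes`.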

File details

Details for the file dm_utils-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: dm_utils-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 24.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.13

File hashes

Hashes for dm_utils-0.1.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       868b8bb469e1a2bb79645d7da2427537f0160f4523a5fbea135f01427d5d49db
MD5          b26cf39697727bea2746b2e5d98dc985
BLAKE2b-256  c83d8199c9789fc1809dd3fa4dc0f45e46f370ea3c4b19bf150a674282545b95
