Quickly try out several ML models on a given dataset

Hundred Hammers

"At least one of them is bound to do the trick."

Hundred Hammers is a Python package that helps you batch-test ML models on a dataset. It runs most popular ML models and metrics out of the box, and it can be extended to include your own.

  • Supports both classification and regression.
  • Ships with most scikit-learn models.
  • Already comes with several plots to visualize the results.
  • Easy to integrate with parameter tuning from GridSearchCV.
  • Already gives you the average metrics from training, test, validation (train) and validation (test) sets.
  • Allows you to define how many seeds to consider, so you can average over more runs and get more reliable results.
  • Produces a Pandas DataFrame with the results (which can be exported to CSV and analyzed elsewhere).
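To illustrate why averaging over several seeds helps, here is a minimal sketch that uses only the standard library. It is not Hundred Hammers' implementation: the toy data, the nearest-class-mean rule, and the helper names (make_toy_data, evaluate_once, evaluate_over_seeds) are all hypothetical and chosen only to keep the example self-contained.

```python
import random
import statistics


def make_toy_data(n=200, seed=0):
    """Two overlapping 1-D Gaussian classes (synthetic, for illustration only)."""
    rng = random.Random(seed)
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n // 2)]
    data += [(rng.gauss(1.5, 1.0), 1) for _ in range(n // 2)]
    return data


def evaluate_once(data, seed, test_fraction=0.3):
    """Shuffle with the given seed, split, fit a nearest-class-mean rule,
    and return the accuracy on the held-out part."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]

    # "Training": the mean feature value of each class.
    means = {
        label: statistics.fmean([x for x, y in train if y == label])
        for label in (0, 1)
    }

    # Predict the class whose mean is closest to the sample.
    correct = sum(
        1
        for x, y in test
        if min(means, key=lambda label: abs(x - means[label])) == y
    )
    return correct / len(test)


def evaluate_over_seeds(data, n_seeds=10):
    """Repeat the evaluation with seeds 0..n_seeds-1 and summarize the scores."""
    scores = [evaluate_once(data, seed) for seed in range(n_seeds)]
    return statistics.fmean(scores), statistics.stdev(scores), scores


data = make_toy_data()
mean_acc, std_acc, scores = evaluate_over_seeds(data, n_seeds=10)
print(f"accuracy over {len(scores)} seeds: {mean_acc:.3f} +/- {std_acc:.3f}")
```

A single split can look better or worse than it should purely by chance. Reporting the mean together with the spread across seeds makes that variation visible.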

Installation

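Clone the repository and run pip install . in the root directory. Version 1.0.2 is also published on PyPI as hundred-hammers, so pip install hundred-hammers should work as well.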

Examples

Full examples can be found in the test directory. Here's a simple example of how to use Hundred Hammers to run a batch classification on Iris data:

from hundred_hammers.classifier import HundredHammersClassifier
from hundred_hammers.plots import plot_batch_results
from sklearn.datasets import load_iris

data = load_iris()
X, y = data.data, data.target

hh = HundredHammersClassifier()
df_results = hh.evaluate(X, y)

plot_batch_results(df_results, metric_name="Accuracy", title="Iris Dataset")

This yields a DataFrame with the results from several different models, along with a plot of those results.

Other plots

We can also use Hundred Hammers to produce confusion matrix plots and regression prediction plots:

from hundred_hammers.plots import plot_confusion_matrix
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
X, y = data.data, data.target
plot_confusion_matrix(X, y, class_dict={0: "Setosa", 1: "Versicolor", 2: "Virginica"},
                      model=DecisionTreeClassifier(), title="Iris Dataset")

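from hundred_hammers.plots import plot_regression_pred
from sklearn.datasets import load_diabetes
from sklearn.metrics import mean_squared_error
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression

data = load_diabetes()
X, y = data.data, data.target

# Placeholder: substitute the best model found in your own batch run.
best_model = LinearRegression()

# Compare a median baseline against the chosen model.
plot_regression_pred(X, y, models=[DummyRegressor(strategy='median'), best_model], metric=mean_squared_error,
                     title="Diabetes", y_label="Diabetes (Value)")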

Finally, it is also possible to compare results across different datasets (each dot in the plot is a model).

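import pandas as pd
from sklearn.datasets import load_iris
from hundred_hammers.classifier import HundredHammersClassifier
# Assumption: plot_multiple_datasets lives alongside the other plot helpers.
from hundred_hammers.plots import plot_multiple_datasets

data = load_iris()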
X, y = data.data, data.target

hh = HundredHammersClassifier()

df = []
for i, feature_name in enumerate(data.feature_names):
    X_i = X[:, [j for j in range(X.shape[1]) if j != i]]

    for degree in range(8):
        df_i = hh.evaluate(X_i ** degree, y, optim_hyper=False)
        df_i["Dataset"] = f"$X^{degree}$, w/out $x_{i}$"
        df.append(df_i)

df_results = pd.concat(df, ignore_index=True)
plot_multiple_datasets(df_results, metric_name="Avg ACC (Validation Test)", id_col="Dataset", title="Iris Dataset", display=True)
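Because the results are an ordinary Pandas DataFrame, they can be summarized or exported with standard pandas calls. The sketch below uses a small stand-in table rather than real output. The "Dataset" column and the metric name come from the example above, but the "Model" column, the assumption that the metric name is also a column name, and all of the values are hypothetical.

```python
import io

import pandas as pd

# Stand-in for the concatenated results. Only "Dataset" and the metric name
# appear in the example above; "Model" and every value are made up.
df_results = pd.DataFrame({
    "Model": ["Decision Tree", "KNN", "Decision Tree", "KNN"],
    "Dataset": ["A", "A", "B", "B"],
    "Avg ACC (Validation Test)": [0.93, 0.95, 0.90, 0.88],
})

metric = "Avg ACC (Validation Test)"

# Pick the row with the highest metric within each dataset.
best_per_dataset = df_results.loc[df_results.groupby("Dataset")[metric].idxmax()]

# Export to CSV text (pass a file path instead of a buffer to write to disk).
buffer = io.StringIO()
df_results.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Read the CSV back to confirm the table survives the round trip.
df_roundtrip = pd.read_csv(io.StringIO(csv_text))

print(best_per_dataset)
```

Check the column names of your own DataFrame (for example with df_results.columns) before adapting this, since they may differ from the ones assumed here.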
