
ML Benchmark used to compare CPUs

Project description

ulhpc_ml_benchmark

Benchmark of many ML algorithms at once for easy platform evaluation.

This repo aims at evaluating 40+ Machine Learning algorithms in both training and inference modes. It makes it easier to evaluate platforms based on the performance of realistic algorithms rather than on FLOPS or other raw CPU characteristics. The algorithms differ in complexity (in Big-O notation), and their running times may span several orders of magnitude. This is why the benchmark is structured to measure the number of data samples processed within a fixed amount of time, rather than the computational time for a fixed quantity of data. The latter approach would be impractical due to the vast differences in processing times, ranging from milliseconds to hours, across the various algorithms.
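As a rough illustration of this fixed-time-budget idea, the sketch below counts how many samples a single scikit-learn estimator ingests within a given number of seconds. It is a minimal, hypothetical reconstruction for explanation only; the helper name and the repeated full-dataset fits are assumptions, not the actual internals of bench().

import time
import numpy as np
from sklearn.linear_model import Ridge

def measure_training_throughput(model, X, y, budget_s=1.0):
    # Count how many training samples the model ingests within budget_s seconds.
    processed = 0
    start = time.perf_counter()
    while time.perf_counter() - start < budget_s:
        model.fit(X, y)        # one full pass over the dataset
        processed += len(X)    # samples ingested by that fit
    return processed

X = np.random.rand(1000, 100)  # num_samples=1000, num_features=100
y = np.random.rand(1000)
print(measure_training_throughput(Ridge(), X, y, budget_s=1.0))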

Usage example:

(base) pierrick@LinuxUniBXD7LS3:~/project/ulhpc_ml_benchmark$ python3
Python 3.10.9 (main, Jan 11 2023, 15:21:40) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from bench import bench
>>> bench(num_samples=1000, num_features=100, fix_comp_time=1, reg_or_cls="reg")

For example, bench(num_samples=100, num_features=10, fix_comp_time=1) runs a benchmark on 100 data points with 10 features each, where each algorithm is trained for 1 second.

After ~50 seconds the output is:

LassoCV 0.234817 16749
ElasticNetCV 0.33169 11628
TheilSenRegressor 1.527735 18588
MultiTaskElasticNetCV 6.302748 17154
MultiTaskLassoCV 6.324106 17001
QuantileRegressor 6.912841 18286
GaussianProcessRegressor 19 24
RandomForestRegressor 25 535
MLPRegressor 26 4021
ExtraTreesRegressor 39 598
GradientBoostingRegressor 56 4370
MultiTaskLasso 60 15964
MultiTaskElasticNet 61 17405
HistGradientBoostingRegressor 65 1073
HuberRegressor 80 17127
BaggingRegressor 148 723
NuSVR 148 14297
ElasticNet 170 11316
KernelRidge 181 866
Lasso 219 16441
RidgeCV 377 20126
ARDRegression 416 18103
BayesianRidge 436 17228
RANSACRegressor 565 14170
SGDRegressor 768 11772
LassoLarsIC 782 17114
AdaBoostRegressor 1033 5163
PassiveAggressiveRegressor 1277 18707
TweedieRegressor 1304 11981
TransformedTargetRegressor 1312 7764
PLSRegression 1522 4397
OrthogonalMatchingPursuit 1549 17003
LassoLars 1592 16402
Ridge 1621 19856
Lars 1673 17026
GammaRegressor 1812 16029
PoissonRegressor 1915 17134
LinearSVR 1916 18146
SVR 2147 13748
LinearRegression 2219 17062
ExtraTreeRegressor 4364 11644
DecisionTreeRegressor 4687 12640
KNeighborsRegressor 7255 280
RadiusNeighborsRegressor 7347 63
DummyRegressor 13247 185039

Now let's benchmark the classifiers:

>>> bench(num_samples=1000, num_features=100, fix_comp_time=1, reg_or_cls="cls")

which returns:

GaussianProcessClassifier 12 29
RandomForestClassifier 17 323
ExtraTreesClassifier 23 338
AdaBoostClassifier 24 158
GradientBoostingClassifier 32 4082
MLPClassifier 38 3886
SVC 40 24
LogisticRegressionCV 44 15586
HistGradientBoostingClassifier 71 1216
LabelSpreading 109 164
BaggingClassifier 123 715
CalibratedClassifierCV 124 642
LabelPropagation 141 165
CategoricalNB 208 976
RidgeClassifierCV 285 14668
QuadraticDiscriminantAnalysis 437 1805
SGDClassifier 448 15471
Perceptron 701 16004
RidgeClassifier 742 14513
LinearSVC 804 16303
BernoulliNB 858 2100
ComplementNB 1073 10361
PassiveAggressiveClassifier 1155 15967
LogisticRegression 1167 15897
MultinomialNB 1655 13369
GaussianNB 1887 3059
DecisionTreeClassifier 2519 11871
ExtraTreeClassifier 2751 11878
NearestCentroid 3846 3399
RadiusNeighborsClassifier 5069 96
KNeighborsClassifier 5436 548
DummyClassifier 15287 121029

For each algorithm line, the benchmark output provides the following information:

  • Algorithm name
  • Data points ingested during training, i.e., the number of samples processed within fix_comp_time seconds (training throughput)
  • Data points ingested during inference, in the same units (inference throughput)

Note: the lines are sorted according to the first numeric column (training throughput).
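Because each line is just a name followed by two numbers, the output is easy to post-process for cross-platform comparison. The snippet below is a hedged sketch that assumes each run's output has been saved to a plain-text file, one per CPU; the filenames and the per-algorithm speedup ratio are illustrative assumptions, not part of the package.

# Compare two CPUs from saved benchmark outputs
# (whitespace-separated lines: name, training throughput, inference throughput).
def parse_output(path):
    rows = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 3:
                name, train_tp, infer_tp = parts
                rows[name] = (float(train_tp), float(infer_tp))
    return rows

cpu_a = parse_output("cpu_a_reg.txt")  # hypothetical saved outputs
cpu_b = parse_output("cpu_b_reg.txt")
for name in sorted(cpu_a.keys() & cpu_b.keys()):
    speedup = cpu_b[name][0] / cpu_a[name][0]  # training-throughput ratio
    print(f"{name:35s} {speedup:6.2f}x")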

Example of experiments on 4 CPUs:

[Result figure]



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ulhpc_ml_benchmark-0.0.2.tar.gz (11.5 kB)

Uploaded Source

Built Distribution

ulhpc_ml_benchmark-0.0.2-py3-none-any.whl (12.0 kB)

Uploaded Python 3

File details

Details for the file ulhpc_ml_benchmark-0.0.2.tar.gz.

File metadata

  • Download URL: ulhpc_ml_benchmark-0.0.2.tar.gz
  • Upload date:
  • Size: 11.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.9

File hashes

Hashes for ulhpc_ml_benchmark-0.0.2.tar.gz:

  • SHA256: 8d264bca26cab818ee43ee186b419e7546565c36be2a54496122ef25de5edc30
  • MD5: f7ce82fd00fc68027ecea7d1c1efdbbe
  • BLAKE2b-256: 34577272497b945f2dc209b1f175d47e24a1e74cacbded5670a8bcaa5e0a3c99
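If you download the sdist manually, you can check it against the SHA256 digest above with the Python standard library; the local filename in this sketch is an assumption about where the archive was saved.

import hashlib

# Verify the downloaded sdist against the SHA256 digest listed above.
expected = "8d264bca26cab818ee43ee186b419e7546565c36be2a54496122ef25de5edc30"
with open("ulhpc_ml_benchmark-0.0.2.tar.gz", "rb") as f:  # assumed local path
    digest = hashlib.sha256(f.read()).hexdigest()
print("OK" if digest == expected else "hash mismatch")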


File details

Details for the file ulhpc_ml_benchmark-0.0.2-py3-none-any.whl.

File metadata

File hashes

Hashes for ulhpc_ml_benchmark-0.0.2-py3-none-any.whl:

  • SHA256: 7b16a0aab6222a5989294f6b59a021c2d50a3a8a7a96477a9dfead43e734cbba
  • MD5: b721b195bac29534e677817c7884fa2c
  • BLAKE2b-256: 5d3e6d641fabcba9de6bbc767f4da030baed0aec3e2d5f895fcb0159f9895671

