AllML — All-in-One Machine Learning Library

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

AllML — All-in-One Machine Learning Library

One import to rule them all. Load a CSV, engineer your features, pick your columns, choose a method, train, predict. Done.

from allml import AllML

ml = AllML("your_data.csv")
ml.col_to_feed(["feature1", "feature2", "feature3"])
ml.col_to_pred(["target"])
ml.method("random_forest_regressor")
ml.split(70, 20, 10)
ml.train(n_estimators=200)
ml.show()
ml.save("models/my_model")

result = ml.pred(...)
print(result)

What is AllML?

AllML is a unified machine learning interface that wraps every major scikit-learn algorithm — plus XGBoost, LightGBM, and CatBoost when installed — behind a single consistent API.

No boilerplate. No manual preprocessing. No separate scalers, encoders, or train/test split code. You give it a CSV file and tell it what to do. AllML handles everything else automatically.

String and categorical columns are encoded automatically
Raw string values can be passed directly into ml.pred()
Missing values are reported on load and handled with one line
Saving a model bundles everything needed to predict later — no CSV required at inference time

It automatically detects whether your task is:

Regression — predicting a decimal number
Binary Classification — predicting 0 or 1
Multi-class Classification — predicting one of many categories
Multi-Output Regression — predicting several numbers at once
Multi-Output Classification — predicting several categories at once

Installation

pip install allml

Optional boosting libraries for extra algorithms:

pip install xgboost lightgbm catboost

API Reference

`AllML(csv_filepath=None)`

Load your dataset. Providing a CSV is required for training. For loading a saved model and predicting, no CSV is needed.

# Training mode — CSV required
ml = AllML("your_data.csv")

# Inference mode — no CSV needed
ml = AllML()
ml.load("models/my_model.allml")

When a CSV is loaded, AllML automatically reports how many missing values exist and which columns are affected.

`ml.engineer`

Feature engineering module attached to every AllML instance. All operations are applied directly to the loaded DataFrame and logged so you can inspect every step with ml.engineer.report().

Fill Missing Values

ml.engineer.fill_missing("column_name", strategy="mean")
ml.engineer.fill_missing("column_name", strategy="median")
ml.engineer.fill_missing("column_name", strategy="constant", fill_value=0)
ml.engineer.fill_missing(strategy="mode")           # all columns
ml.engineer.fill_missing(strategy="ffill")          # forward fill
ml.engineer.fill_missing(strategy="bfill")          # backward fill
ml.engineer.fill_missing(strategy="knn", knn_neighbors=5)
ml.engineer.fill_missing(strategy="iterative")      # MICE imputation

Strategy	Description
`mean`	Replace with column mean
`median`	Replace with column median
`mode`	Replace with most frequent value
`constant`	Replace with a fixed value
`ffill`	Forward fill from the previous row
`bfill`	Backward fill from the next row
`knn`	K-Nearest Neighbours imputation
`iterative`	MICE iterative imputation

Drop Columns and Rows

ml.engineer.drop_columns(["col1", "col2"])
ml.engineer.drop_duplicates()
ml.engineer.drop_duplicates(subset=["id_column"])
ml.engineer.drop_missing_rows()
ml.engineer.drop_missing_rows(columns=["col1", "col2"])
ml.engineer.drop_missing_rows(threshold=0.5)

Encode Categorical Columns

ml.engineer.encode("column_name", method="label")
ml.engineer.encode(["col1", "col2"], method="onehot")
ml.engineer.encode("column_name", method="ordinal",
                   categories=["low", "medium", "high"])
ml.engineer.encode("column_name", method="binary")

Method	Description
`label`	Integer label encoding — 0, 1, 2, ...
`onehot`	One-hot encoding — creates new columns
`ordinal`	Ordered integer encoding using a provided category list
`binary`	Maps exactly 2 unique values to 0 and 1

Note: You do not need to manually encode columns before training. AllML encodes all string columns in col_to_feed() and col_to_pred() automatically at train time and applies the same encoding at predict time. Use ml.engineer.encode() only when you want explicit control over how a column is encoded.

Scale and Normalise

ml.engineer.scale("column_name", method="standard")
ml.engineer.scale("column_name", method="minmax")
ml.engineer.scale("column_name", method="robust")
ml.engineer.scale("column_name", method="log")
ml.engineer.scale("column_name", method="sqrt")
ml.engineer.scale(["col1", "col2"], method="standard")

Method	Description
`standard`	Zero mean, unit variance — z-score
`minmax`	Scale to range [0, 1]
`robust`	Scale using median and IQR — resistant to outliers
`log`	Natural log transform — log(x + 1)
`sqrt`	Square root transform

Handle Outliers

ml.engineer.clip_outliers("column_name", method="iqr", factor=1.5)
ml.engineer.clip_outliers("column_name", method="zscore", z_threshold=3.0)
ml.engineer.clip_outliers(["col1", "col2"], method="iqr")

Method	Description
`iqr`	Clip values outside Q1 - factor * IQR and Q3 + factor * IQR
`zscore`	Clip values where the absolute z-score exceeds the threshold

Create New Features

# Polynomial features
ml.engineer.polynomial("column_name", degree=2)
ml.engineer.polynomial(["col1", "col2"], degree=3)

# Interaction feature — multiplies two columns together
ml.engineer.interaction("col_a", "col_b")
ml.engineer.interaction("col_a", "col_b", new_name="custom_name")

# Bin a continuous column into discrete intervals
ml.engineer.bin("column_name", bins=5)
ml.engineer.bin("column_name", bins=[0, 100, 500, 1000],
                labels=["Low", "Medium", "High"])

# Extract features from a datetime column
ml.engineer.datetime_features("datetime_column")
ml.engineer.datetime_features("datetime_column",
                               features=["year", "month", "dayofweek"])

# Apply a custom function to a column
ml.engineer.transform("column_name", lambda x: x / 1000,
                       new_name="new_column_name")

# Rename columns
ml.engineer.rename({"old_name": "new_name"})

Inspect and Visualise

ml.engineer.report()
ml.engineer.plot_missing()
ml.engineer.plot_distribution("column_name")
ml.engineer.plot_distribution(["col1", "col2", "col3"])

Chaining

All engineer operations return self so they can be chained:

(ml.engineer
   .fill_missing("column_name", strategy="mean")
   .drop_duplicates()
   .clip_outliers("column_name", method="iqr")
   .encode("column_name", method="label")
   .scale("column_name", method="minmax")
   .report())

`ml.col_to_feed(columns)`

Set the input feature columns. String and categorical columns are accepted without any manual encoding.

ml.col_to_feed(["col1", "col2", "col3"])

`ml.col_to_pred(columns)`

Set the target column(s) to predict.

# Single target
ml.col_to_pred(["target_column"])

# Multiple targets
ml.col_to_pred(["target1", "target2", "target3"])

AllML detects the task type automatically:

Target column content	Task detected
Float values	Regression
2 unique values	Binary Classification
3 or more unique values	Multi-class Classification
Multiple float columns	Multi-Output Regression
Multiple category columns	Multi-Output Classification

`ml.method(name)`

Choose which algorithm(s) to use.

# Single method
ml.method("method_name")

# Multiple methods trained and compared together
ml.method(["method1", "method2", "method3"])

# Every applicable method for your task type
ml.method("all")

If you select a method that does not match your task type, AllML skips it with a clear explanation and suggests the correct alternatives.

All Available Methods

Regression

Method Name	Algorithm
`linear_regression`	Linear Regression
`ridge`	Ridge Regression
`lasso`	Lasso Regression
`elastic_net`	Elastic Net
`bayesian_ridge`	Bayesian Ridge
`sgd_regressor`	SGD Regressor
`huber_regressor`	Huber Regressor
`ransac_regressor`	RANSAC
`theilsen_regressor`	Theil-Sen
`ard_regression`	ARD Regression
`passive_aggressive_regressor`	Passive Aggressive
`decision_tree_regressor`	Decision Tree
`extra_tree_regressor`	Extra Tree
`random_forest_regressor`	Random Forest
`extra_trees_regressor`	Extra Trees
`gradient_boosting_regressor`	Gradient Boosting
`adaboost_regressor`	AdaBoost
`bagging_regressor`	Bagging
`svr`	Support Vector Regression
`linear_svr`	Linear SVR
`nu_svr`	Nu SVR
`knn_regressor`	K-Nearest Neighbors
`mlp_regressor`	Neural Network (MLP)
`gaussian_process_regressor`	Gaussian Process
`pls_regression`	PLS Regression
`isotonic_regression`	Isotonic Regression (1 feature only)
`lars`	LARS
`lasso_lars`	Lasso LARS
`xgboost_regressor`	XGBoost *
`lightgbm_regressor`	LightGBM *
`catboost_regressor`	CatBoost *

Classification

Method Name	Algorithm
`logistic_regression`	Logistic Regression
`ridge_classifier`	Ridge Classifier
`sgd_classifier`	SGD Classifier
`passive_aggressive_classifier`	Passive Aggressive
`perceptron`	Perceptron
`decision_tree_classifier`	Decision Tree
`extra_tree_classifier`	Extra Tree
`random_forest_classifier`	Random Forest
`extra_trees_classifier`	Extra Trees
`gradient_boosting_classifier`	Gradient Boosting
`adaboost_classifier`	AdaBoost
`bagging_classifier`	Bagging
`svc`	Support Vector Machine
`linear_svc`	Linear SVC
`nu_svc`	Nu SVC
`knn_classifier`	K-Nearest Neighbors
`gaussian_nb`	Gaussian Naive Bayes
`bernoulli_nb`	Bernoulli Naive Bayes
`mlp_classifier`	Neural Network (MLP)
`gaussian_process_classifier`	Gaussian Process
`lda`	Linear Discriminant Analysis
`qda`	Quadratic Discriminant Analysis
`xgboost_classifier`	XGBoost *
`lightgbm_classifier`	LightGBM *
`catboost_classifier`	CatBoost *

Clustering

Method Name	Algorithm
`kmeans`	K-Means
`dbscan`	DBSCAN
`agglomerative_clustering`	Agglomerative Clustering
`mean_shift`	Mean Shift
`spectral_clustering`	Spectral Clustering
`birch`	BIRCH
`mini_batch_kmeans`	Mini-Batch K-Means

* Requires the optional library to be installed.

`ml.split(train_pct, test_pct, val_pct=0)`

Split your data into training, test, and optional validation sets. Percentages must sum to 100.

ml.split(train_pct, test_pct)
ml.split(train_pct, test_pct, val_pct)

`ml.train(...)`

Train the selected model(s). All parameters are optional and are applied to every model where they are applicable.

ml.train(
    epochs         = 100,
    learning_rate  = 0.01,
    n_estimators   = 100,
    max_depth      = None,
    n_neighbors    = 5,
    kernel         = "rbf",
    n_clusters     = 3,
    alpha          = 1.0,
)

Parameter	Applies to
`epochs`	Iterative models — max iterations
`learning_rate`	Gradient-based models
`n_estimators`	Ensemble models — number of trees
`max_depth`	Tree-based models
`n_neighbors`	KNN models
`kernel`	SVM models — `"rbf"`, `"linear"`, `"poly"`
`n_clusters`	Clustering models
`alpha`	Regularised linear models

`ml.show()`

Print a complete summary including dataset info, data splits, training configuration, all performance metrics, missing value report, and the full list of feature engineering steps applied.

ml.show()

`ml.show_graph(graph_type)`

Plot one, several, or all available graphs.

ml.show_graph("graph_name")
ml.show_graph(["graph1", "graph2", "graph3"])
ml.show_graph("all")

Graph	Best for
`confusion_matrix`	Classification
`roc_curve`	Binary Classification
`precision_recall_curve`	Binary Classification
`feature_importance`	Tree and linear models
`learning_curve`	All tasks
`residuals`	Regression
`prediction_vs_actual`	Regression
`error_distribution`	Regression
`model_comparison`	All tasks — bar chart of the main metric
`correlation_heatmap`	All tasks
`distribution`	All tasks
`boxplot`	All tasks
`pairplot`	All tasks
`cluster_plot`	Clustering

`ml.save(path)`

Save trained model(s) to disk.

ml.save("path/to/model_name")

Single method → model_name.allml
Multiple methods → model_name_<method>.allml for each model
Always writes model_name_meta.json — a human-readable summary

Each .allml file bundles everything needed to make predictions later: the trained model, fitted scalers, label encoders for all string columns, column names, task type, metrics, and hyperparameters.

`ml.load(path)`

Load a previously saved model. No CSV required. After loading, ml.pred() works immediately.

ml = AllML()
ml.load("path/to/model_name.allml")

`ml.pred(*values)`

Run a prediction. Pass values in the same order as col_to_feed(). String values are accepted directly for categorical columns.

result = ml.pred(val1, val2, val3, ...)
result = ml.pred(val1, val2, val3, ..., method="method_name")

When multiple models are trained, ml.pred() returns a dictionary mapping each method name to its prediction.

Examples

Regression — Predict a house price

from allml import AllML

ml = AllML("house_prices.csv")

ml.engineer.fill_missing(strategy="mean")
ml.engineer.clip_outliers("price_usd", method="iqr")
ml.engineer.report()

ml.col_to_feed(["area_sqft", "bedrooms", "bathrooms",
                "year_built", "garage_cars", "condition"])
ml.col_to_pred(["price_usd"])
ml.method("gradient_boosting_regressor")
ml.split(70, 20, 10)
ml.train(n_estimators=200, learning_rate=0.05)
ml.show()
ml.show_graph(["prediction_vs_actual", "feature_importance",
               "residuals", "model_comparison"])
ml.save("models/house_model")

price = ml.pred(1360, 2, 1, 1981, 1, "Excellent")
print(f"Predicted price: ${price:,.2f}")

Binary Classification — Predict heart disease risk

from allml import AllML

ml = AllML("heart_disease.csv")

ml.engineer.fill_missing(strategy="median")
ml.engineer.clip_outliers(["cholesterol", "blood_pressure"])
ml.engineer.report()

ml.col_to_feed(["age", "sex", "cholesterol", "blood_pressure",
                "max_heart_rate", "smoking"])
ml.col_to_pred(["heart_disease"])
ml.method(["random_forest_classifier",
           "gradient_boosting_classifier",
           "logistic_regression"])
ml.split(70, 20, 10)
ml.train(n_estimators=100)
ml.show()
ml.show_graph(["confusion_matrix", "roc_curve",
               "precision_recall_curve", "model_comparison"])
ml.save("models/heart_model")

result = ml.pred(55, 1, 230, 140, 150, 1)
print(f"Heart disease: {'High risk' if result == 1 else 'Low risk'}")

Multi-class Classification — Predict a flower species

from allml import AllML

ml = AllML("flower_species.csv")

ml.col_to_feed(["sepal_length_cm", "sepal_width_cm",
                "petal_length_cm", "petal_width_cm"])
ml.col_to_pred(["species"])
ml.method("all")
ml.split(70, 20, 10)
ml.train(n_estimators=100)
ml.show()
ml.show_graph(["confusion_matrix", "feature_importance",
               "model_comparison"])
ml.save("models/flower_model")

species = ml.pred(5.1, 3.5, 1.4, 0.2)
print(f"Predicted species: {species}")

Multi-Output Regression — Predict weather conditions

from allml import AllML

ml = AllML("weather_forecast.csv")

ml.engineer.fill_missing(strategy="mean")

ml.col_to_feed(["month", "day_of_year", "humidity_pct",
                "pressure_hpa", "wind_speed_kmh",
                "cloud_cover_pct", "altitude_m"])
ml.col_to_pred(["temperature_c", "humidity_out_pct", "rainfall_mm"])
ml.method(["random_forest_regressor", "gradient_boosting_regressor"])
ml.split(70, 20, 10)
ml.train(n_estimators=150)
ml.show()
ml.show_graph(["model_comparison", "correlation_heatmap"])
ml.save("models/weather_model")

temp, humidity, rain = ml.pred(7, 200, 65.0, 1013.0, 15.0, 40.0, 100.0)
print(f"Temperature : {temp:.1f} C")
print(f"Humidity    : {humidity:.1f} %")
print(f"Rainfall    : {rain:.1f} mm")

Compare All Methods — Loan approval

from allml import AllML

ml = AllML("loan_approval.csv")

ml.engineer.fill_missing(strategy="median")
ml.engineer.clip_outliers("annual_income", method="iqr")

ml.col_to_feed(["annual_income", "loan_amount", "credit_score",
                "employment_years", "existing_debt_pct", "loan_purpose"])
ml.col_to_pred(["approved"])
ml.method("all")
ml.split(70, 20, 10)
ml.train(n_estimators=50, epochs=100)
ml.show()
ml.show_graph("model_comparison")
ml.save("models/loan_all")

Load and Predict — No CSV Required

from allml import AllML

ml = AllML()
ml.load("models/house_model.allml")

price = ml.pred(1360, 2, 1, 1981, 1, "Excellent")
print(f"Price: ${price:,.2f}")

Chained API

from allml import AllML

ml = AllML("your_data.csv")

(ml.engineer
   .fill_missing(strategy="mean")
   .drop_duplicates()
   .clip_outliers("target_column", method="iqr")
   .encode("category_column", method="label")
   .report())

(ml
   .col_to_feed(["col1", "col2", "col3", "category_column"])
   .col_to_pred(["target_column"])
   .method("random_forest_regressor")
   .split(70, 20, 10)
   .train(n_estimators=100))

ml.show()
ml.show_graph("all")
ml.save("models/my_model")

How String Columns Work

AllML fully supports string and categorical values in both feature columns and prediction inputs — no manual encoding required.

At training time — AllML detects non-numeric columns and fits a label encoder on each one automatically.

At prediction time — the same encoder is applied so raw string values like "Excellent" or "Engineering" are converted correctly.

At save and load time — all encoders are bundled inside the .allml file so they are always available when you load and predict.

The only restriction is that a value passed to ml.pred() must have been present in the training data. Passing an unseen value raises a clear error showing the valid options.

Automatic Behaviours

Behaviour	Details
Task detection	Inferred from target column type and unique value count
String column encoding	All non-numeric columns are encoded automatically
String pred inputs	Raw strings can be passed directly to `ml.pred()`
Encoder persistence	Label encoders are saved inside `.allml` files and restored on load
Feature scaling	All features are standardised before training
Target scaling	Regression targets are scaled and inverse-transformed for predictions
Multi-output wrapping	Models are wrapped in MultiOutputRegressor or MultiOutputClassifier as needed
Wrong method warning	Picking a classifier for a regression task is caught with a suggestion
NuSVC auto-tuning	`nu` is computed from the class distribution — always feasible
Parameter mapping	All hyperparameters are mapped to the correct internal argument per model
Missing value warning	AllML reports missing value counts and affected columns on CSV load

Supported Metrics

Task	Metrics
Regression	MSE, RMSE, MAE, R2
Classification	Accuracy, F1 (weighted), Precision, Recall, ROC-AUC, Confusion Matrix
Clustering	Silhouette Score, Davies-Bouldin Score

Saved File Structure

models/
  my_model.allml                              # single model bundle
  my_model_meta.json                          # human-readable summary

  my_model_random_forest_classifier.allml     # one file per method
  my_model_gradient_boosting_classifier.allml
  my_model_logistic_regression.allml
  my_model_meta.json

Notes

isotonic_regression requires exactly one feature column.
pls_regression automatically caps n_components to the number of features.
nu_svc and nu_svr automatically select a feasible nu value based on training data.
gaussian_process_regressor and gaussian_process_classifier can be slow on large datasets.
The kernel parameter applies to SVM models only.
String values in ml.pred() must match values seen during training.
Calling col_to_feed(), col_to_pred(), or split() without a CSV raises a clear error with instructions.

License

MIT License. Free to use, modify, and distribute.

Contributing

Pull requests are welcome. If a model produces unexpected errors or a parameter is not mapped correctly, open an issue with the method name, dataset shape, and the full error message.

Author

IMApurbo
GitHub: @IMApurbo

Project details

These details have not been verified by PyPI

Project links

Homepage

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Release history Release notifications | RSS feed

This version

1.1.0

Apr 21, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

allml-1.1.0-py3-none-any.whl (28.3 kB view details)

Uploaded Apr 21, 2026 Python 3

File details

Details for the file allml-1.1.0-py3-none-any.whl.

File metadata

Download URL: allml-1.1.0-py3-none-any.whl
Upload date: Apr 21, 2026
Size: 28.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for allml-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8d74a0594b83060534793d8c35b73af477327542abafb8f2e5c588e46a7c5cd7`
MD5	`3bde57ccafc3c5ff7e41537d682f6597`
BLAKE2b-256	`cce3881ebe88dc63d4b2e911d15b9ddbdd0b97ae5e319a9fd0ab05cfa8e08845`

See more details on using hashes here.

allml 1.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

AllML — All-in-One Machine Learning Library

What is AllML?

Installation

API Reference

AllML(csv_filepath=None)

ml.engineer

Fill Missing Values

Drop Columns and Rows

Encode Categorical Columns

Scale and Normalise

Handle Outliers

Create New Features

Inspect and Visualise

Chaining

ml.col_to_feed(columns)

ml.col_to_pred(columns)

ml.method(name)

All Available Methods

ml.split(train_pct, test_pct, val_pct=0)

ml.train(...)

ml.show()

ml.show_graph(graph_type)

ml.save(path)

ml.load(path)

ml.pred(*values)

Examples

Regression — Predict a house price

Binary Classification — Predict heart disease risk

Multi-class Classification — Predict a flower species

Multi-Output Regression — Predict weather conditions

Compare All Methods — Loan approval

Load and Predict — No CSV Required

Chained API

How String Columns Work

Automatic Behaviours

Supported Metrics

Saved File Structure

Notes

License

Contributing

Pull requests are welcome. If a model produces unexpected errors or a parameter is not mapped correctly, open an issue with the method name, dataset shape, and the full error message.

Author

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distributions

Built Distribution

File details

File metadata

File hashes

`AllML(csv_filepath=None)`

`ml.engineer`

`ml.col_to_feed(columns)`

`ml.col_to_pred(columns)`

`ml.method(name)`

`ml.split(train_pct, test_pct, val_pct=0)`

`ml.train(...)`

`ml.show()`

`ml.show_graph(graph_type)`

`ml.save(path)`

`ml.load(path)`

`ml.pred(*values)`