MLON (Machine Learning Operations Network)

A comprehensive utility package for machine learning development that works seamlessly with popular ML libraries like TensorFlow, scikit-learn, Keras, and PyTorch. MLON provides an interconnected network of operations for streamlined machine learning workflows, with production-grade safety checks and automatic ML guardrails.

⚡ Zero-Config ML Safety (New in v1.2.0!)

Check your data for common pipeline problems in a few lines:

from mlon import AutoChecker
checker = AutoChecker()
results = checker.check_data(df)

Or use our simple CLI:

mlon check data.csv

What You Get

  • Automatic data leakage detection
  • Bias and fairness checks
  • Smart data type inference
  • Professional PDF reports
  • Actionable recommendations

Production Ready

  • Enterprise logging
  • Robust error handling
  • Parallel processing
  • 100% test coverage
  • Resource management

Features Overview

1. Data Preprocessing (DataPreprocessor)

from mlon import DataPreprocessor

preprocessor = DataPreprocessor()
  • Missing Value Handling
    # Handle missing values with different strategies
    data = preprocessor.handle_missing_values(data, strategy='mean')  # Options: 'mean', 'median', 'mode', 'zero', 'drop'
    
  • Feature Scaling
    # Scale features using StandardScaler or MinMaxScaler
    scaled_data = preprocessor.scale_features(data, method='standard')  # Options: 'standard', 'minmax'
    
  • Categorical Encoding
    # Encode categorical variables
    encoded_data = preprocessor.encode_categorical(data, method='onehot')  # Options: 'onehot', 'label'
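
To make the three preprocessing steps above concrete, here is a minimal dependency-free sketch of what the 'mean', 'standard', and 'onehot' options conventionally compute. The helper names (`impute_mean`, `standard_scale`, `one_hot`) are hypothetical and are not part of MLON; the package's own implementation may differ.

```python
from statistics import mean, pstdev


def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def standard_scale(column):
    """Rescale to zero mean and unit population standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]


def one_hot(column):
    """Return the sorted categories and one indicator row per value."""
    categories = sorted(set(column))
    rows = [[1 if v == c else 0 for c in categories] for v in column]
    return categories, rows


ages = impute_mean([20.0, None, 40.0])        # [20.0, 30.0, 40.0]
scaled = standard_scale(ages)                 # mean 0, unit variance
cats, encoded = one_hot(["red", "blue", "red"])
print(ages, scaled, cats, encoded)
```

The population standard deviation is used here because that is also what scikit-learn's `StandardScaler` divides by.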
    

2. Model Evaluation (ModelEvaluator)

from mlon import ModelEvaluator

evaluator = ModelEvaluator()
  • Classification Metrics
    # Get comprehensive classification metrics
    metrics = evaluator.classification_metrics(y_true, y_pred)  # Returns accuracy, precision, recall, F1
    
  • Regression Metrics
    # Get regression performance metrics
    metrics = evaluator.regression_metrics(y_true, y_pred)  # Returns MSE, RMSE, MAE, R²
    
  • Confusion Matrix
    # Compute a confusion matrix and a per-class text report
    conf_matrix = evaluator.get_confusion_matrix(y_true, y_pred, normalize='true')
    report = evaluator.get_classification_report(y_true, y_pred)
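
For readers who want to see what the listed metrics mean, the following sketch computes them in plain Python. It is an illustration only: the function names are hypothetical, the classification case is restricted to binary labels, and the row normalization assumes `normalize='true'` follows the scikit-learn convention of dividing each row by the number of samples in that true class.

```python
from math import sqrt


def binary_classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE and R^2 for paired numeric sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {"mse": mse, "rmse": sqrt(mse),
            "mae": sum(abs(e) for e in errors) / n,
            "r2": 1 - ss_res / ss_tot if ss_tot else 0.0}


def row_normalized_confusion_matrix(y_true, y_pred, labels):
    """Confusion matrix where each row (true class) sums to 1."""
    index = {label: i for i, label in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        counts[index[t]][index[p]] += 1
    return [[c / sum(row) if sum(row) else 0.0 for c in row]
            for row in counts]


cls = binary_classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
reg = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
cm = row_normalized_confusion_matrix([1, 0, 1, 1, 0, 0],
                                     [1, 0, 0, 1, 1, 0], labels=[0, 1])
print(cls, reg, cm)
```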
    

3. Visualization (Visualizer)

from mlon import Visualizer

viz = Visualizer()
  • Model Performance Visualization
    # Plot confusion matrix
    viz.plot_confusion_matrix(conf_matrix, class_names=classes)
    
    # Plot learning curves
    viz.plot_learning_curve(train_scores, val_scores)
    
    # Plot feature importance
    viz.plot_feature_importance(importance_scores, feature_names)
    
  • Data Analysis Visualization
    # Plot distribution of features
    viz.plot_distribution(data['feature'])
    
    # Plot correlation matrix
    viz.plot_correlation_matrix(data)
    

4. Model Utilities (ModelUtils)

from mlon import ModelUtils

model_utils = ModelUtils()
  • Model Persistence
    # Save and load models
    model_utils.save_model(model, 'model.pkl', method='pickle')  # Options: 'pickle', 'joblib'
    model = model_utils.load_model('model.pkl', method='pickle')
    
  • Hyperparameter Tuning
    # Perform grid search
    best_model = model_utils.grid_search(model, param_grid, X, y)
    
    # Perform random search
    best_model = model_utils.random_search(model, param_dist, X, y)
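
As a rough picture of what a grid search does, the sketch below enumerates every combination in a parameter grid and keeps the best-scoring one. The names `iter_param_grid`, `exhaustive_search`, and `toy_score` are hypothetical, and the toy score stands in for a cross-validated model score; this is not MLON's implementation.

```python
from itertools import product


def iter_param_grid(param_grid):
    """Yield one dict per combination of the listed values."""
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


def exhaustive_search(score_fn, param_grid):
    """Return (best_params, best_score) over every grid combination."""
    best_params, best_score = None, float("-inf")
    for params in iter_param_grid(param_grid):
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Hypothetical score that peaks at C=1.0, gamma=0.1."""
    return -(params["C"] - 1.0) ** 2 - (params["gamma"] - 0.1) ** 2


param_grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1]}
best_params, best_score = exhaustive_search(toy_score, param_grid)
print(best_params, best_score)
```

A random search differs only in that it samples a fixed number of combinations instead of visiting all of them, which matters once the grid grows multiplicatively with each added parameter.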
    

5. Cross Validation (CrossValidator)

from mlon import CrossValidator

cv = CrossValidator(n_splits=5)
  • Cross-Validation Operations
    # Perform cross-validation
    scores = cv.cross_validate(model, X, y)
    
    # Get fold indices for custom cross-validation
    train_idx, val_idx = cv.get_fold_indices(X, y)
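
The idea behind fold indices can be sketched without any library. The generator below is a hypothetical helper, not MLON's API: it yields one (train, validation) index pair per fold using contiguous, unshuffled folds, where the first `n_samples % n_splits` folds receive one extra sample. How `get_fold_indices` packages multiple folds is not documented here, so treat the return shape as an assumption.

```python
def k_fold_indices(n_samples, n_splits=5):
    """Yield (train_idx, val_idx) pairs for contiguous, unshuffled folds."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # Spread the remainder over the first `extra` folds.
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = (list(range(0, start))
                     + list(range(start + size, n_samples)))
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(n_samples=10, n_splits=5))
for train_idx, val_idx in folds:
    print(val_idx, "held out;", len(train_idx), "used for training")
```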
    

6. Time Series Utilities (TimeSeriesUtils) - NEW!

from mlon import TimeSeriesUtils

ts_utils = TimeSeriesUtils()
  • Sequence Creation
    # Create sequences for time series prediction
    X_seq, y_seq = ts_utils.create_sequences(data, seq_length=30, target_horizon=7)
    
  • Time Feature Engineering
    # Add time-based features
    df_with_features = ts_utils.add_time_features(df, 'date_column')
    
    # Calculate rolling statistics
    rolling_features = ts_utils.calculate_rolling_features(data, windows=[7, 30, 90])
    
    # Detect seasonality
    seasonality_period = ts_utils.detect_seasonality(data)
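
The sequence and rolling-window ideas above can be illustrated in a few lines of plain Python. Both helpers are hypothetical sketches. In particular, the meaning of `target_horizon` is an assumption: here the target is the single value `target_horizon` steps after the end of each input window, which may not match how MLON defines it.

```python
def create_windows(series, seq_length, target_horizon=1):
    """Slice a series into input windows and look-ahead targets."""
    inputs, targets = [], []
    last_start = len(series) - seq_length - target_horizon
    for start in range(last_start + 1):
        end = start + seq_length
        inputs.append(series[start:end])
        # Assumed convention: predict the value `target_horizon`
        # steps after the last element of the window.
        targets.append(series[end + target_horizon - 1])
    return inputs, targets


def rolling_mean(series, window):
    """Trailing mean; None until a full window of past values exists."""
    result = []
    for i in range(len(series)):
        if i < window - 1:
            result.append(None)
        else:
            result.append(sum(series[i - window + 1:i + 1]) / window)
    return result


series = list(range(10))
X_seq, y_seq = create_windows(series, seq_length=3, target_horizon=2)
smoothed = rolling_mean([1, 2, 3, 4], window=2)
print(X_seq[0], y_seq[0], smoothed)
```

The rolling mean uses only the current and earlier values, so the feature at time `i` never includes information from the future.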
    

7. Automatic Guardrails (LeakageDetector, BiasDetector) - NEW in v1.1.0! 🛡️

from mlon.guardrails import LeakageDetector, BiasDetector

# Initialize detectors
leakage_detector = LeakageDetector()
bias_detector = BiasDetector()
  • Data Leakage Detection
    # Check for train-test overlap
    overlap_warnings = leakage_detector.check_train_test_overlap(X_train, X_test)
    
    # Detect target leakage in features
    leakage_warnings = leakage_detector.detect_target_leakage(X, y)
    
    # Check for future information leakage in time series
    future_warnings = leakage_detector.detect_future_leakage(timestamps, features)
    
  • Bias & Fairness Checks
    # Check for dataset bias
    bias_warnings = bias_detector.check_dataset_bias(data, protected_features=['gender', 'race'])
    
    # Calculate disparate impact
    impact_metrics = bias_detector.calculate_disparate_impact(predictions, protected_feature)
    
    # Get group fairness metrics
    fairness_metrics = bias_detector.calculate_group_fairness_metrics(y_true, y_pred, protected_feature)
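
Two of the checks above are simple enough to sketch directly. The helpers below are hypothetical and only illustrate the underlying ideas: exact-duplicate rows shared between train and test sets, and disparate impact as the ratio of the lowest to the highest favorable-prediction rate across groups. MLON's detectors may use different definitions, thresholds, and return types.

```python
def find_overlap(train_rows, test_rows):
    """Return test rows that also appear verbatim in the training rows."""
    seen = {tuple(row) for row in train_rows}
    return [row for row in test_rows if tuple(row) in seen]


def disparate_impact(predictions, groups, favorable=1):
    """Per-group favorable rates and the min/max ratio between them."""
    totals, favorable_counts = {}, {}
    for pred, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        favorable_counts[group] = (favorable_counts.get(group, 0)
                                   + int(pred == favorable))
    rates = {g: favorable_counts[g] / totals[g] for g in totals}
    highest = max(rates.values())
    # With no favorable predictions at all, the groups are treated alike.
    ratio = min(rates.values()) / highest if highest else 1.0
    return rates, ratio


duplicates = find_overlap([[1, 2], [3, 4]], [[3, 4], [5, 6]])
rates, ratio = disparate_impact(
    predictions=[1, 1, 1, 0, 1, 0, 0, 0],
    groups=["a", "a", "a", "a", "b", "b", "b", "b"],
)
print(duplicates, rates, ratio)
```

A common rule of thumb (the "four-fifths rule") treats a ratio below 0.8 as a signal worth investigating; in the example the ratio is well below that.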
    

8. Feature Selection (FeatureSelector) - NEW!

from mlon import FeatureSelector

selector = FeatureSelector()
  • Statistical Feature Selection
    # Select top k features
    X_selected, scores = selector.select_k_best(X, y, k=10, method='f_classif')
    
  • Dimensionality Reduction
    # Apply PCA
    X_pca, pca = selector.apply_pca(X, n_components=0.95)
    
    # Apply ICA
    X_ica, ica = selector.apply_ica(X, n_components=5)
    
    # Apply LDA
    X_lda, lda = selector.apply_lda(X, y, n_components=2)
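
The fractional `n_components=0.95` in the PCA call deserves a short illustration. In scikit-learn, a float between 0 and 1 asks PCA to keep as many components as are needed to explain that share of the variance; the sketch below assumes MLON forwards the value with the same meaning. The helper name and the variance ratios are made up for the example.

```python
def components_for_variance(explained_variance_ratio, threshold=0.95):
    """Smallest number of leading components reaching the threshold."""
    cumulative = 0.0
    for count, ratio in enumerate(explained_variance_ratio, start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return count
    # Threshold never reached: keep every component.
    return len(explained_variance_ratio)


# Hypothetical per-component variance shares, largest first.
ratios = [0.7, 0.2, 0.07, 0.03]
n_keep = components_for_variance(ratios, threshold=0.95)
print(n_keep)
```

By contrast, the integer values passed to `apply_ica` and `apply_lda` request a fixed number of components.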
    

Installation

pip install mlon

Quick Start

from mlon import DataPreprocessor, ModelEvaluator, Visualizer, ModelUtils, CrossValidator
from mlon.guardrails import LeakageDetector, BiasDetector

# Initialize safety checks
leakage_detector = LeakageDetector()
bias_detector = BiasDetector()

# Check for data leakage and bias
overlap_warnings = leakage_detector.check_train_test_overlap(X_train, X_test)
leakage_warnings = leakage_detector.detect_target_leakage(X, y)
bias_warnings = bias_detector.check_dataset_bias(data, protected_features=['gender'])

# Data Preprocessing
preprocessor = DataPreprocessor()
scaled_data = preprocessor.scale_features(data, method='standard')
encoded_data = preprocessor.encode_categorical(data, method='onehot')

# Model Evaluation
evaluator = ModelEvaluator()
metrics = evaluator.classification_metrics(y_true, y_pred)
conf_matrix = evaluator.get_confusion_matrix(y_true, y_pred)

# Check model fairness
fairness_metrics = bias_detector.calculate_group_fairness_metrics(y_true, y_pred, protected_feature)

# Visualization
viz = Visualizer()
viz.plot_confusion_matrix(conf_matrix)
viz.plot_learning_curve(train_scores, val_scores)

# Model Management
model_utils = ModelUtils()
model_utils.save_model(model, 'model.pkl')
best_model = model_utils.grid_search(model, param_grid, X, y)

# Cross Validation
cv = CrossValidator(n_splits=5)
scores = cv.cross_validate(model, X, y)

Requirements

  • Python 3.7+
  • NumPy >= 1.19.0
  • Pandas >= 1.1.0
  • scikit-learn >= 0.24.0
  • Matplotlib >= 3.3.0
  • Seaborn >= 0.11.0
  • Joblib >= 1.0.0
  • SciPy >= 1.6.0
  • reportlab >= 3.6.0 (for PDF reports)
  • click >= 8.0.0 (for the CLI)

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
