MLON (Machine Learning Operations Network)

A comprehensive utility package for machine learning development that works seamlessly with popular ML libraries like TensorFlow, scikit-learn, Keras, and PyTorch. MLON provides an interconnected network of operations for streamlined machine learning workflows, with production-grade safety checks and automatic ML guardrails.

⚡ Zero-Config ML Safety (New in v1.2.0!)

Check a dataset with a few lines of code:

import pandas as pd
from mlon import AutoChecker
df = pd.read_csv('data.csv')  # assumes the data to check is a pandas DataFrame
checker = AutoChecker()
results = checker.check_data(df)

Or use our simple CLI:

mlon check data.csv

What You Get

  • 🔍 Automatic data leakage detection
  • ⚖️ Bias and fairness checks
  • 📊 Smart data type inference
  • 📝 Professional PDF reports
  • 💡 Actionable recommendations

Production Ready

  • ✅ Enterprise logging
  • 🛡️ Robust error handling
  • 🚀 Parallel processing
  • 📈 100% test coverage
  • 🔒 Resource management

Features Overview

1. Data Preprocessing (DataPreprocessor)

from mlon import DataPreprocessor

preprocessor = DataPreprocessor()
  • Missing Value Handling
    # Handle missing values with different strategies
    data = preprocessor.handle_missing_values(data, strategy='mean')  # Options: 'mean', 'median', 'mode', 'zero', 'drop'
    
  • Feature Scaling
    # Scale features using StandardScaler or MinMaxScaler
    scaled_data = preprocessor.scale_features(data, method='standard')  # Options: 'standard', 'minmax'
    
  • Categorical Encoding
    # Encode categorical variables
    encoded_data = preprocessor.encode_categorical(data, method='onehot')  # Options: 'onehot', 'label'
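
To make the strategy names concrete, here is a minimal, standard-library-only sketch of what 'mean' imputation and 'standard' scaling compute on a single numeric column. It is not MLON's implementation: the helper names impute_mean and standard_scale are hypothetical, and dividing by the population standard deviation mirrors scikit-learn's StandardScaler rather than anything this README confirms.

```python
from statistics import fmean, pstdev

def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = fmean(observed)
    return [mean if v is None else v for v in column]

def standard_scale(column):
    """Center to zero mean and scale to unit (population) standard deviation."""
    mean = fmean(column)
    std = pstdev(column)
    if std == 0:
        return [0.0 for _ in column]  # constant column: nothing to scale
    return [(v - mean) / std for v in column]

ages = [20.0, None, 40.0, 30.0]
filled = impute_mean(ages)        # the None is replaced by the mean of 20, 40, 30
scaled = standard_scale(filled)   # zero mean, unit standard deviation
```

Imputing before scaling matters here: the scaler's mean and standard deviation are only defined once every entry is numeric.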
    

2. Model Evaluation (ModelEvaluator)

from mlon import ModelEvaluator

evaluator = ModelEvaluator()
  • Classification Metrics
    # Get comprehensive classification metrics
    metrics = evaluator.classification_metrics(y_true, y_pred)  # Returns accuracy, precision, recall, F1
    
  • Regression Metrics
    # Get regression performance metrics
    metrics = evaluator.regression_metrics(y_true, y_pred)  # Returns MSE, RMSE, MAE, R²
    
  • Confusion Matrix
    # Compute the confusion matrix and a classification report
    conf_matrix = evaluator.get_confusion_matrix(y_true, y_pred, normalize='true')
    report = evaluator.get_classification_report(y_true, y_pred)
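
For readers who want to see what the four classification numbers mean, the sketch below derives accuracy, precision, recall, and F1 from raw labels for a binary problem. The function name binary_classification_metrics and its positive argument are hypothetical, and how MLON averages over multiple classes is not stated in this README, so the sketch covers a single positive class only.

```python
def binary_classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = binary_classification_metrics(y_true, y_pred)
```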
    

3. Visualization (Visualizer)

from mlon import Visualizer

viz = Visualizer()
  • Model Performance Visualization
    # Plot confusion matrix
    viz.plot_confusion_matrix(conf_matrix, class_names=classes)
    
    # Plot learning curves
    viz.plot_learning_curve(train_scores, val_scores)
    
    # Plot feature importance
    viz.plot_feature_importance(importance_scores, feature_names)
    
  • Data Analysis Visualization
    # Plot distribution of features
    viz.plot_distribution(data['feature'])
    
    # Plot correlation matrix
    viz.plot_correlation_matrix(data)
    

4. Model Utilities (ModelUtils)

from mlon import ModelUtils

model_utils = ModelUtils()
  • Model Persistence
    # Save and load models
    model_utils.save_model(model, 'model.pkl', method='pickle')  # Options: 'pickle', 'joblib'
    model = model_utils.load_model('model.pkl', method='pickle')
    
  • Hyperparameter Tuning
    # Perform grid search
    best_model = model_utils.grid_search(model, param_grid, X, y)
    
    # Perform random search
    best_model = model_utils.random_search(model, param_dist, X, y)
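
The idea behind a grid search can be sketched without any ML library: enumerate every combination in the parameter grid, score each one, and keep the best. The names exhaustive_search and toy_score below are hypothetical, and the toy objective stands in for the cross-validated model score that a real search would use; this is an illustration of the technique, not MLON's code.

```python
from itertools import product

def exhaustive_search(score_fn, param_grid):
    """Score every combination in param_grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy objective standing in for a cross-validated model score (higher is better).
def toy_score(depth, lr):
    return -((depth - 3) ** 2) - (lr - 0.1) ** 2

param_grid = {"depth": [1, 3, 5], "lr": [0.01, 0.1, 1.0]}
best_params, best_score = exhaustive_search(toy_score, param_grid)
```

A random search follows the same loop but samples a fixed number of combinations instead of enumerating all of them, which keeps the cost bounded when the grid is large.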
    

5. Cross Validation (CrossValidator)

from mlon import CrossValidator

cv = CrossValidator(n_splits=5)
  • Cross-Validation Operations
    # Perform cross-validation
    scores = cv.cross_validate(model, X, y)
    
    # Get fold indices for custom cross-validation
    train_idx, val_idx = cv.get_fold_indices(X, y)
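
As a rough picture of what fold indices look like, the sketch below splits sample positions into contiguous, unshuffled folds, the same default layout scikit-learn's KFold uses. The function kfold_indices is hypothetical. It returns one (train_idx, val_idx) pair per fold, which is an assumption about the general shape and not a description of what get_fold_indices returns.

```python
def kfold_indices(n_samples, n_splits=5):
    """Return a list of (train_idx, val_idx) pairs, one per fold."""
    # Spread the remainder over the first folds so fold sizes differ by at most 1.
    base, extra = divmod(n_samples, n_splits)
    folds, start = [], 0
    for i in range(n_splits):
        stop = start + base + (1 if i < extra else 0)
        val_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        folds.append((train_idx, val_idx))
        start = stop
    return folds

folds = kfold_indices(n_samples=10, n_splits=5)
train_idx, val_idx = folds[0]   # first fold: validate on the first two samples
```

Every sample appears in exactly one validation fold, so averaging the per-fold scores uses each observation for validation once.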
    

6. Time Series Utilities (TimeSeriesUtils) - NEW!

from mlon import TimeSeriesUtils

ts_utils = TimeSeriesUtils()
  • Sequence Creation
    # Create sequences for time series prediction
    X_seq, y_seq = ts_utils.create_sequences(data, seq_length=30, target_horizon=7)
    
  • Time Feature Engineering
    # Add time-based features
    df_with_features = ts_utils.add_time_features(df, 'date_column')
    
    # Calculate rolling statistics
    rolling_features = ts_utils.calculate_rolling_features(data, windows=[7, 30, 90])
    
    # Detect seasonality
    seasonality_period = ts_utils.detect_seasonality(data)
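
Sequence creation is easiest to understand on a tiny series. The sketch below pairs each input window of seq_length values with a single target taken target_horizon steps after the window's last element. That reading of target_horizon is an assumption (the README does not define it), and create_windows is a hypothetical helper, not MLON's function.

```python
def create_windows(series, seq_length, target_horizon):
    """Pair each window of seq_length values with the value
    target_horizon steps after the window's last element."""
    X_seq, y_seq = [], []
    last_start = len(series) - seq_length - target_horizon
    for start in range(last_start + 1):
        end = start + seq_length            # exclusive end of the input window
        X_seq.append(series[start:end])
        y_seq.append(series[end + target_horizon - 1])
    return X_seq, y_seq

series = list(range(10))                    # 0, 1, ..., 9
X_seq, y_seq = create_windows(series, seq_length=3, target_horizon=2)
```

Because every target lies strictly after its input window, a model trained on these pairs never sees the value it is asked to predict.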
    

7. Automatic Guardrails (LeakageDetector, BiasDetector) - NEW in v1.1.0! 🛡️

from mlon.guardrails import LeakageDetector, BiasDetector

# Initialize detectors
leakage_detector = LeakageDetector()
bias_detector = BiasDetector()
  • Data Leakage Detection
    # Check for train-test overlap
    overlap_warnings = leakage_detector.check_train_test_overlap(X_train, X_test)
    
    # Detect target leakage in features
    leakage_warnings = leakage_detector.detect_target_leakage(X, y)
    
    # Check for future information leakage in time series
    future_warnings = leakage_detector.detect_future_leakage(timestamps, features)
    
  • Bias & Fairness Checks
    # Check for dataset bias
    bias_warnings = bias_detector.check_dataset_bias(data, protected_features=['gender', 'race'])
    
    # Calculate disparate impact
    impact_metrics = bias_detector.calculate_disparate_impact(predictions, protected_feature)
    
    # Get group fairness metrics
    fairness_metrics = bias_detector.calculate_group_fairness_metrics(y_true, y_pred, protected_feature)
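
Disparate impact is commonly defined as the ratio between the positive-prediction rates of two groups. The sketch below computes per-group rates and divides the lowest by the highest. The function disparate_impact and its return shape are hypothetical, and whether MLON uses this min/max form or a fixed reference group is not stated in this README.

```python
from collections import defaultdict

def disparate_impact(predictions, protected_feature, positive=1):
    """Ratio of the lowest to the highest positive-prediction rate across groups."""
    totals, positives = defaultdict(int), defaultdict(int)
    for pred, group in zip(predictions, protected_feature):
        totals[group] += 1
        positives[group] += (pred == positive)
    rates = {g: positives[g] / totals[g] for g in totals}
    highest = max(rates.values())
    # If no group receives positive predictions, treat the outcome as equal.
    ratio = min(rates.values()) / highest if highest else 1.0
    return ratio, rates

predictions       = [1, 1, 1, 0, 1, 0, 0, 0]
protected_feature = ["a", "a", "a", "a", "b", "b", "b", "b"]
ratio, rates = disparate_impact(predictions, protected_feature)
```

A ratio of 1.0 means all groups receive positive predictions at the same rate; values below 0.8 are often flagged under the four-fifths rule of thumb.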
    

8. Feature Selection (FeatureSelector) - NEW!

from mlon import FeatureSelector

selector = FeatureSelector()
  • Statistical Feature Selection
    # Select top k features
    X_selected, scores = selector.select_k_best(X, y, k=10, method='f_classif')
    
  • Dimensionality Reduction
    # Apply PCA
    X_pca, pca = selector.apply_pca(X, n_components=0.95)
    
    # Apply ICA
    X_ica, ica = selector.apply_ica(X, n_components=5)
    
    # Apply LDA
    X_lda, lda = selector.apply_lda(X, y, n_components=2)
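
In scikit-learn's PCA, a fractional n_components such as 0.95 means "keep enough components to explain more than 95% of the variance". Assuming apply_pca forwards the value to scikit-learn (this README does not say so), the selection rule can be sketched as below. The helper components_for_variance and the variance ratios are made up for illustration.

```python
from itertools import accumulate

def components_for_variance(explained_variance_ratio, threshold=0.95):
    """Smallest number of leading components whose cumulative
    explained variance exceeds threshold."""
    cumulative_sums = accumulate(explained_variance_ratio)
    for count, cumulative in enumerate(cumulative_sums, start=1):
        if cumulative > threshold:
            return count
    return len(explained_variance_ratio)

ratios = [0.60, 0.25, 0.08, 0.05, 0.02]   # made-up ratios, sorted descending
n_kept = components_for_variance(ratios, threshold=0.95)
```

With these ratios the first three components explain 93% of the variance, so a fourth is needed to pass the 95% threshold.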
    

Installation

pip install mlon

Quick Start

from mlon import DataPreprocessor, ModelEvaluator, Visualizer, ModelUtils, CrossValidator
from mlon.guardrails import LeakageDetector, BiasDetector

# Initialize safety checks
leakage_detector = LeakageDetector()
bias_detector = BiasDetector()

# Check for data leakage and bias
overlap_warnings = leakage_detector.check_train_test_overlap(X_train, X_test)
leakage_warnings = leakage_detector.detect_target_leakage(X, y)
bias_warnings = bias_detector.check_dataset_bias(data, protected_features=['gender'])

# Data Preprocessing
preprocessor = DataPreprocessor()
scaled_data = preprocessor.scale_features(data, method='standard')
encoded_data = preprocessor.encode_categorical(data, method='onehot')

# Model Evaluation
evaluator = ModelEvaluator()
metrics = evaluator.classification_metrics(y_true, y_pred)
conf_matrix = evaluator.get_confusion_matrix(y_true, y_pred)

# Check model fairness
fairness_metrics = bias_detector.calculate_group_fairness_metrics(y_true, y_pred, protected_feature)

# Visualization
viz = Visualizer()
viz.plot_confusion_matrix(conf_matrix)
viz.plot_learning_curve(train_scores, val_scores)

# Model Management
model_utils = ModelUtils()
model_utils.save_model(model, 'model.pkl')
best_model = model_utils.grid_search(model, param_grid, X, y)

# Cross Validation
cv = CrossValidator(n_splits=5)
scores = cv.cross_validate(model, X, y)

Requirements

  • Python 3.7+
  • NumPy >= 1.19.0
  • Pandas >= 1.1.0
  • scikit-learn >= 0.24.0
  • Matplotlib >= 3.3.0
  • Seaborn >= 0.11.0
  • Joblib >= 1.0.0
  • SciPy >= 1.6.0
  • reportlab >= 3.6.0 (for PDF reports)
  • click >= 8.0.0 (for the CLI)

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
