
Templates to work with models for classification, regression and clustering with xgboost and sklearn.

Sinapsis Data Analysis

Module for machine learning model training, analysis, and inference, using the Scikit-learn and XGBoost libraries.


Sinapsis Data Analysis provides a comprehensive set of tools for machine learning model training, evaluation, and inference using industry-standard libraries like scikit-learn and XGBoost.

🐍 Installation

Install using your package manager of choice. We encourage the use of uv.

Example with uv:

  uv pip install sinapsis-data-analysis --extra-index-url https://pypi.sinapsis.tech

or with raw pip:

  pip install sinapsis-data-analysis --extra-index-url https://pypi.sinapsis.tech
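
After installing, it can be useful to confirm that the distribution is visible to the active Python environment. The following is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of Sinapsis.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The distribution name matches the one used in the install commands above.
    version = installed_version("sinapsis-data-analysis")
    print(version or "sinapsis-data-analysis is not installed in this environment")
```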

🚀 Features

Templates Supported

Sinapsis Data Analysis provides a variety of templates for machine learning workflows:

Scikit-Learn Models

The following model types are supported:

  • Linear Models: LinearRegression, Ridge, Lasso, ElasticNet, LogisticRegression, etc.
  • Neighbors Models: KNeighborsClassifier, KNeighborsRegressor, RadiusNeighborsClassifier, etc.
  • Neural Network Models: MLPClassifier, MLPRegressor, BernoulliRBM
  • SVM Models: SVC, SVR, LinearSVC, LinearSVR, NuSVC, NuSVR, OneClassSVM, etc.
  • Tree Models: DecisionTreeClassifier, DecisionTreeRegressor, ExtraTreeClassifier, etc.

Each template uses the same base attributes:

  • generic_field_key (str, required): Key of the generic field where datasets are stored
  • model_save_path (str, required): Path where the trained model will be saved

XGBoost Models

XGBoost model templates include:

  • XGBClassifier
  • XGBRegressor
  • XGBRanker
  • XGBRFClassifier
  • XGBRFRegressor
  • Booster

Attributes are the same as those for Scikit-learn templates.

Manifold Learning

Templates for dimensionality reduction using scikit-learn's manifold learning techniques:

  • SKLearnManifold: Base class for all manifold learning algorithms
    • generic_field_key (str, required): Key of the generic field where the input data is stored

Specific algorithms include t-SNE, MDS, Isomap, LocallyLinearEmbedding, and more.
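
For orientation, this is roughly what one of the underlying manifold learning algorithms does when called directly in scikit-learn. It is a sketch under assumptions, not the template's implementation: the swiss-roll dataset, the embed_2d helper, and the n_neighbors value are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap


def embed_2d(X: np.ndarray, n_neighbors: int = 10) -> np.ndarray:
    """Project X onto two dimensions with Isomap (hypothetical helper)."""
    return Isomap(n_neighbors=n_neighbors, n_components=2).fit_transform(X)


# Synthetic 3-D data lying on a rolled-up 2-D sheet; chosen only for illustration.
X, _ = make_swiss_roll(n_samples=100, random_state=0)
embedding = embed_2d(X)
print(embedding.shape)
```

Other estimators in sklearn.manifold, such as TSNE or MDS, expose the same fit_transform call, so they can be swapped into the helper.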

Inference Templates

Templates for using trained models to make predictions on new data:

  • SKLearnInference: For inference with scikit-learn models
  • XGBoostInference: For inference with XGBoost models

To use these templates, set the model_path attribute to the path of the trained model.
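
Conceptually, inference means restoring a saved estimator and calling predict on new data. The sketch below shows that round trip in plain scikit-learn. It assumes the model was serialized with joblib (suggested by the .joblib extension in the usage example, but not confirmed here); the toy data and the save_and_reload helper are hypothetical.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression


def save_and_reload(model_path: Path) -> np.ndarray:
    """Train a toy model, save it, reload it, and predict on one new sample."""
    # Toy data following y = 2x + 1, used only to make the sketch self-contained.
    X = np.array([[0.0], [1.0], [2.0]])
    y = 2.0 * X.ravel() + 1.0
    joblib.dump(LinearRegression().fit(X, y), model_path)

    # This is the step an inference workflow performs: load, then predict.
    restored = joblib.load(model_path)
    return restored.predict(np.array([[3.0]]))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(save_and_reload(Path(tmp) / "toy_model.joblib"))
```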

[!TIP] Use CLI command sinapsis info --all-template-names to show a list with all the available Template names installed with Sinapsis Data Analysis.

[!TIP] Use CLI command sinapsis info --example-template-config TEMPLATE_NAME to produce an example Agent config for the Template specified in TEMPLATE_NAME.

For example, for LinearRegression use sinapsis info --example-template-config LinearRegression to produce an example config.

📚 Usage Example

Below is an example configuration for Sinapsis Data Analysis that uses LinearRegressionWrapper for regression.

Example config:

agent:
  name: sklearn_linear_models_agent
  description: agent to train a LinearRegression model from scikit-learn using the load_diabetes dataset

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}

- template_name: load_diabetesWrapper
  class_name: load_diabetesWrapper
  template_input: InputTemplate
  attributes:
    split_dataset: true
    train_size: 0.8
    load_diabetes:
      return_X_y: false
      as_frame: true

- template_name: LinearRegressionWrapper
  class_name: LinearRegressionWrapper
  template_input: load_diabetesWrapper
  attributes:
    generic_field_for_data: load_diabetesWrapper
    model_save_path: "artifacts/linear_regression.joblib"
    linearregression_init:
      fit_intercept: true
      copy_X: true
      n_jobs: null
      positive: false

To run the config, use the CLI:

sinapsis run name_of_config.yml

📙 Documentation

Documentation for this and other sinapsis packages is available on the sinapsis website.

Tutorials for different projects within sinapsis are available on the sinapsis tutorials page.

🔍 License

This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the LICENSE file.

For commercial use, please refer to our official Sinapsis website for information on obtaining a commercial license.
