dreamml

Framework for creating, running, and validating ML models on tabular data

DreamML - Self Machine Learning ❤️

The next stage in the evolution of DS-Template

About DreamML

DreamML is a machine learning framework aimed at industrial use. Its main task is to choose a simple model by balancing complexity against quality metrics. Model quality can be reviewed in dedicated development reports and, for some tasks, in a validation report built according to the central bank's methodology.

*This is the project's first open-source release. We plan to publish more materials and keep improving the framework.
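The idea of preferring a simple model when quality is comparable can be illustrated with a small sketch. This is not DreamML's selection logic: the `Candidate` type, the use of feature count as a complexity proxy, the tolerance value, and the scores are all hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    n_features: int  # hypothetical proxy for model complexity
    score: float     # validation metric, higher is better


def pick_simplest(candidates: List[Candidate], tolerance: float = 0.01) -> Candidate:
    """Return the least complex candidate whose score is within `tolerance` of the best."""
    if not candidates:
        raise ValueError("no candidates")
    best = max(c.score for c in candidates)
    good_enough = [c for c in candidates if c.score >= best - tolerance]
    # Fewest features wins; ties are broken by the higher score.
    return min(good_enough, key=lambda c: (c.n_features, -c.score))


# Illustrative numbers only.
candidates = [
    Candidate("full", 120, 0.842),
    Candidate("medium", 40, 0.838),
    Candidate("small", 10, 0.815),
]
chosen = pick_simplest(candidates, tolerance=0.01)
print(chosen.name)
```

Widening the tolerance trades more quality for simplicity; a tolerance of zero always returns the top-scoring model.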

Installation

📦 Python Package

pip install dreamml
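To confirm that the package landed in the active environment, one option is to query the installed distribution metadata with the standard library. The helper name `installed_version` is our own; only `importlib.metadata` (available from Python 3.8) is used.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("dreamml")
if version:
    print(f"dreamml {version} is installed")
else:
    print("dreamml is not installed in this environment")
```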

📂 Repository

Step 1: Install Anaconda or Python 3.8

Step 2: Create environment

Anaconda: conda create --name dreamml_env python=3.8
Python 3.8: python -m venv dreamml_env

Step 3: Activate environment

Anaconda: conda activate dreamml_env
Python: source dreamml_env/bin/activate

Step 4: Clone the repository and go to the dreamml root folder

git clone https://gitverse.ru/dreamml/DreamML.git
cd DreamML

Step 5: Install dreamml in your environment

pip install -e .

🐳 Docker

git clone https://gitverse.ru/dreamml/DreamML.git
cd DreamML
docker build -t dreamml:v3.5.4 .
docker run -d -p 8888:8888 -v $(pwd):/app --name dreamml_container dreamml:v3.5.4

(!) If $(pwd) does not work (for example, in older versions of PowerShell), use the absolute path:

docker run -d -p 8888:8888 -v C:\path\to\DreamML:/app --name dreamml_container dreamml:v3.5.4

Then go to http://localhost:8888
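If the page does not load, a quick way to tell whether anything is listening on the mapped port is a plain TCP connection attempt. The sketch below uses only the standard library; `port_is_open` is a hypothetical helper, and it only checks that the port accepts connections, not that the service behind it is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if port_is_open("localhost", 8888):
    print("Something is listening on http://localhost:8888")
else:
    print("Nothing is listening on port 8888; check `docker ps` "
          "and `docker logs dreamml_container`")
```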

Get started

To develop a model, use the notebooks in notebooks/1. Model Development and select the one that matches your task type.

To validate models, use the notebooks in notebooks/2. Validate Model.

To calibrate models, use the notebooks in notebooks/3. Calibration.
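To see which notebooks are available in each of these folders, a short standard-library script can list them. This is a convenience sketch, not part of DreamML; `list_notebooks` is a hypothetical helper, and it assumes the script is run from the repository root.

```python
from pathlib import Path
from typing import Dict, List


def list_notebooks(root: Path) -> Dict[str, List[str]]:
    """Map each subfolder of `root` to the sorted names of the .ipynb files directly inside it."""
    found: Dict[str, List[str]] = {}
    if not root.is_dir():
        return found
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        names = sorted(nb.name for nb in folder.glob("*.ipynb"))
        if names:
            found[folder.name] = names
    return found


# Run from the DreamML repository root.
for folder, names in list_notebooks(Path("notebooks")).items():
    print(folder)
    for name in names:
        print("   " + name)
```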

How to Use

Information on notebooks for development `notebooks/1. Model Development`

First, determine the pipeline configuration:
- For regression, binary, multiclass, and multilabel tasks, see docs/1_Model_Development_doc.md
- For topic_modeling tasks, see docs/1_Topic_Modeling_doc.md
- For timeseries tasks (boosting), see docs/1_TimeSeries_doc.md
- For amts tasks (Prophet), see docs/1_AltModeTimeSeries_forecast.md
- If your dataset contains text features, see docs/1_NLP_text_classification_doc.md
- To learn more about quality metrics and loss functions, see docs/Binary_Classification_Metrics_doc.md
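The task-to-document list above can also be kept as a small lookup table, for example in a helper script that points teammates to the right guide. The task names and paths are copied from the list; the `doc_for_task` helper itself is hypothetical and not part of DreamML.

```python
# Task names and document paths are taken from the list above.
TASK_DOCS = {
    "regression": "docs/1_Model_Development_doc.md",
    "binary": "docs/1_Model_Development_doc.md",
    "multiclass": "docs/1_Model_Development_doc.md",
    "multilabel": "docs/1_Model_Development_doc.md",
    "topic_modeling": "docs/1_Topic_Modeling_doc.md",
    "timeseries": "docs/1_TimeSeries_doc.md",
    "amts": "docs/1_AltModeTimeSeries_forecast.md",
}


def doc_for_task(task: str) -> str:
    """Return the configuration guide for `task`, or raise ValueError if the task is unknown."""
    try:
        return TASK_DOCS[task]
    except KeyError:
        raise ValueError(
            f"unknown task {task!r}; expected one of {sorted(TASK_DOCS)}"
        ) from None


print(doc_for_task("binary"))
```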

Then build the configuration and prepare the data for modeling:

config_storage = ConfigStorage(config=config)
transformer = DataTransformer(config_storage)
data_storage = transformer.transform()

Next, run the modeling pipeline:

pipeline = MainPipeline(config_storage=config_storage, data_storage=data_storage)
pipeline.transform()

For some tasks, you can also use LightAutoML as a model and calculate the out-of-time (OOT) potential:

lama = add_lama_model(data_storage.get_eval_set(), config_storage)
oot_potential = calculate_oot_metrics(data_storage.get_eval_set(), config_storage)
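For readers unfamiliar with the term, an out-of-time sample is a holdout taken from a later period than the data used for training. The sketch below only illustrates that idea with the standard library. It does not show how `calculate_oot_metrics` works internally, and `split_out_of_time`, the field names, and the cutoff date are hypothetical.

```python
from datetime import date
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def split_out_of_time(
    rows: List[Row], date_key: str, cutoff: date
) -> Tuple[List[Row], List[Row]]:
    """Split rows into (in-time, out-of-time) by comparing row[date_key] with `cutoff`."""
    in_time = [r for r in rows if r[date_key] < cutoff]
    out_of_time = [r for r in rows if r[date_key] >= cutoff]
    return in_time, out_of_time


# Illustrative rows only.
rows = [
    {"id": 1, "report_date": date(2024, 1, 31)},
    {"id": 2, "report_date": date(2024, 6, 30)},
    {"id": 3, "report_date": date(2024, 11, 30)},
]
in_time, oot = split_out_of_time(rows, "report_date", cutoff=date(2024, 7, 1))
print(len(in_time), "in-time rows,", len(oot), "out-of-time rows")
```

A model fitted on the in-time rows and scored on the out-of-time rows gives a sense of how well it holds up on a later period.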

You can also save the modeling artifacts if you need them:

saver = pipeline.artifact_saver
models = pipeline.prepared_model_dict
pipeline.oot_potential = oot_potential
models.update(lama)
nb_name = saver.get_notebook_path_and_save()
saver.save_artifacts(
    models=models,
    other_models=pipeline.other_model_dict,
    encoder=transformer.cat_transformer,
    ipynb_name=nb_name,
    feature_threshold=config_storage.feature_threshold,
)
saver.save_data(data=data_storage.get_eval_set(), dropped_data=data_storage.get_dropped_data())

Finally, you can generate a development report. By default, it is saved to the dreamml/results folder.

get_report(pipeline=pipeline, config_storage=config_storage, data_storage=data_storage, encoder=transformer.cat_transformer)

Authors

Author	Email
Nikita Buts	nikitabuts2000@gmail.com
Alexander Izyurov	halfbrick845@gmail.com
Ivan Plotnikov	com.gateway.api@gmail.com
Maidari Tsydenov	maidaritsydenov@gmail.com
Evgeny Tkachenko	e_t@inbox.ru
Ilya Ivanov	morwes4@gmail.com
Nikita Varganov	-

LICENSE

This project is licensed under the Apache License, Version 2.0. See LICENSE for details.

Release history

- 3.6.3: Jun 4, 2025
- 3.5.4.1: Feb 4, 2025
- 3.5.4: Feb 2, 2025 (this version)
- 3.5.3: Jan 30, 2025

