
Neural Network Dataset



short alias lmur

LEMUR - Learning, Evaluation, and Modeling for Unified Research

The original version of the LEMUR dataset was created by Arash Torabi Goodarzi, Roman Kochnev and Zofia Antonina Bentyn at the Computer Vision Laboratory, University of Würzburg, Germany.

Overview 📖

The primary goal of the NN Dataset project is to provide flexibility for dynamically combining various deep learning tasks, datasets, metrics, and neural network models. It is designed to facilitate the verification of neural network performance under various combinations of training hyperparameters and data transformation algorithms by automatically generating performance statistics. It is primarily developed to support the NN GPT project.

Create and Activate a Virtual Environment (recommended):

python -m venv .venv
source .venv/bin/activate   # Linux/Mac
.venv\Scripts\activate      # Windows

All subsequent commands are provided for Linux/Mac OS. For Windows, please replace source .venv/bin/activate with .venv\Scripts\activate.

Installation or Update of the NN Dataset

Remove the old version of the LEMUR Dataset and its database:

source .venv/bin/activate
pip uninstall nn-dataset -y
rm -rf db

Installing the stable version:

source .venv/bin/activate
pip install nn-dataset --upgrade --extra-index-url https://download.pytorch.org/whl/cu124

Installing from GitHub to get the most recent code and statistics updates:

source .venv/bin/activate
pip install git+https://github.com/ABrain-One/nn-dataset --upgrade --force --extra-index-url https://download.pytorch.org/whl/cu124

Adding functionality to export data to Excel files and generate plots for analyzing neural network performance:

source .venv/bin/activate
pip install nn-stat --upgrade --extra-index-url https://download.pytorch.org/whl/cu124

Then export the data and generate the plots:

source .venv/bin/activate
python -m ab.stat.export

Usage

Standard use cases:

  1. Add a new neural network model into the ab/nn/nn directory.
  2. Run the automated training process for this model (e.g., a new ComplexNet training pipeline configuration):
source .venv/bin/activate
python -m ab.nn.train -c img-classification_cifar-10_acc_ComplexNet

or for all image segmentation models, using fixed ranges of training parameters and a fixed transformer:

source .venv/bin/activate
python run.py -c img-segmentation -f echo --min_learning_rate 1e-4 -l 1e-2 --min_momentum 0.8 -m 0.99 --min_batch_binary_power 2 -b 6

To reproduce a previous result, set the minimum and maximum of each parameter to the same desired value:

source .venv/bin/activate
python run.py -c img-classification_cifar-10_acc_AlexNet --min_learning_rate 0.0061 -l 0.0061 --min_momentum 0.7549 -m 0.7549 --min_batch_binary_power 2 -b 2 -f norm_299
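
The range flags in the two commands above can be read as bounds of a search space: when a minimum and its maximum coincide, the value is pinned. The sketch below illustrates this reading. It assumes that the "batch binary power" flags denote exponents of two (so that powers 2 to 6 would correspond to batch sizes 4 to 64). The helper names are hypothetical and not part of nn-dataset.

```python
def batch_sizes(min_power: int, max_power: int) -> list[int]:
    """Candidate batch sizes, assuming batch size = 2 ** power (an assumption)."""
    if min_power > max_power:
        raise ValueError("min_power must not exceed max_power")
    return [2 ** p for p in range(min_power, max_power + 1)]


def is_pinned(minimum: float, maximum: float) -> bool:
    """A range whose bounds coincide leaves nothing to search over."""
    return minimum == maximum


# Bounds taken from the image segmentation command.
print(batch_sizes(2, 6))
print(is_pinned(1e-4, 1e-2))

# Bounds taken from the AlexNet reproduction command.
print(batch_sizes(2, 2))
print(is_pinned(0.0061, 0.0061), is_pinned(0.7549, 0.7549))
```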

To view supported flags:

python run.py -h
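
The values passed to -c in the examples above appear to combine a task, a dataset, a metric, and a model name, separated by underscores, while a shorter value such as img-segmentation names only the task. The following sketch splits a configuration name on that assumption. The field names and the parse_config helper are hypothetical, and the actual parsing in nn-dataset may differ.

```python
from typing import NamedTuple, Optional


class ConfigName(NamedTuple):
    task: str
    dataset: Optional[str] = None
    metric: Optional[str] = None
    model: Optional[str] = None


def parse_config(name: str) -> ConfigName:
    """Split 'task_dataset_metric_model'; trailing fields may be omitted."""
    # Split at most three times so underscores in a model name are preserved.
    parts = name.split("_", 3)
    if not parts[0]:
        raise ValueError("configuration name must start with a task")
    return ConfigName(*parts)


print(parse_config("img-classification_cifar-10_acc_ComplexNet"))
print(parse_config("img-segmentation"))
```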

Docker

All versions of this project are compatible with AI Linux and can be run inside a Docker image:

docker run -v /a/mm:/a/mm abrainone/ai-linux bash -c "PYTHONPATH=/a/mm python -m ab.nn.train"

Some recently added dependencies might be missing in AI Linux. In this case, you can create a container from the Docker image abrainone/ai-linux, install the missing packages (preferably using pip install <package name>), and then create a new image from the container using docker commit <container name> <new image name>. You can use this new image locally or push it to the registry for deployment on the computer cluster.

Environment for NN Dataset Contributors

Pip package manager

Create a virtual environment, activate it, and run the following command to install all the project dependencies:

source .venv/bin/activate
python -m pip install --upgrade pip
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu124

Contribution

To contribute a new neural network (NN) model to the NN Dataset, please ensure the following criteria are met:

  1. The code for each model is provided in a respective ".py" file within the /ab/nn/nn directory, and the file is named after the name of the model's structure.
  2. The main class for each model is named Net.
  3. The constructor of the Net class takes the following parameters:
    • in_shape (tuple): The shape of the first tensor from the dataset iterator. For images it is structured as (batch, channel, height, width).
    • out_shape (tuple): Provided by the dataset loader, it describes the shape of the output tensor. For a classification task, this could be (number of classes,).
    • prm (dict): A dictionary of hyperparameters, e.g., {'lr': 0.24, 'momentum': 0.93, 'dropout': 0.51}.
    • device (torch.device): The PyTorch device used for model training.
  4. All external information required for the correct building and training of the NN model for a specific dataset/transformer, as well as the list of hyperparameters, is extracted from in_shape, out_shape or prm, e.g.:
    batch = in_shape[0]
    channel_number = in_shape[1]
    image_size = in_shape[2]
    class_number = out_shape[0]
    learning_rate = prm['lr']
    momentum = prm['momentum']
    dropout = prm['dropout']
  5. Every model script has a function that returns the set of supported hyperparameters, e.g.:
    def supported_hyperparameters(): return {'lr', 'momentum', 'dropout'}
    The value of each hyperparameter lies within the range of 0.0 to 1.0.
  6. Every class Net implements two functions:
    train_setup(self, prm)
    and
    learn(self, train_data)
    The first function initializes the criteria and optimizer, while the second implements the training pipeline. See a simple implementation in the AlexNet model.
  7. For each pull request involving a new NN model, please generate and submit training statistics for 100 Optuna trials (or at least 3 trials for very large models) in the ab/nn/stat directory. The trials should cover 5 epochs of training. Ensure that these statistics are included along with the model in your pull request. For example, the statistics for the ComplexNet model are stored in files named <epoch number>.json inside the folder img-classification_cifar-10_acc_ComplexNet, and can be generated by:
python run.py -c img-classification_cifar-10_acc_ComplexNet -t 100 -e 5
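
To inspect the generated statistics, the per-epoch files can be loaded in epoch order. The sketch below assumes only the layout described above (files named <epoch number>.json inside the configuration folder under ab/nn/stat) and makes no assumption about the JSON content. The load_epoch_stats helper is hypothetical and not part of nn-dataset.

```python
import json
from pathlib import Path


def load_epoch_stats(config_dir):
    """Return {epoch number: parsed JSON} for files named <epoch number>.json."""
    stats = {}
    for path in Path(config_dir).glob("*.json"):
        # Skip any JSON file whose name is not a plain epoch number.
        if path.stem.isdigit():
            with path.open(encoding="utf-8") as fh:
                stats[int(path.stem)] = json.load(fh)
    # Sort numerically so that epoch 10 follows epoch 9 rather than epoch 1.
    return dict(sorted(stats.items()))


folder = Path("ab/nn/stat") / "img-classification_cifar-10_acc_ComplexNet"
if folder.is_dir():
    for epoch, content in load_epoch_stats(folder).items():
        print(epoch, type(content).__name__)
```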

See more examples of models in /ab/nn/nn and generated statistics in /ab/nn/stat.
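
Criteria 2 to 6 above can be checked mechanically before opening a pull request. The following is a minimal sketch of such a check using only the standard library. The contract_problems and prm_problems helpers are hypothetical and not part of nn-dataset, and the Net class shown is a dependency-free stub: a real model would typically subclass torch.nn.Module and build its layers from in_shape and out_shape.

```python
import inspect

REQUIRED_INIT_PARAMS = ("in_shape", "out_shape", "prm", "device")
REQUIRED_METHODS = ("train_setup", "learn")


def supported_hyperparameters():
    return {'lr', 'momentum', 'dropout'}


class Net:
    """Stub standing in for a real model (no layers, no torch dependency)."""

    def __init__(self, in_shape, out_shape, prm, device):
        self.channel_number = in_shape[1]
        self.class_number = out_shape[0]
        self.dropout = prm['dropout']
        self.device = device

    def train_setup(self, prm):
        # A real model would create its loss criterion and optimizer here.
        self.learning_rate = prm['lr']
        self.momentum = prm['momentum']

    def learn(self, train_data):
        # A real model would run its training loop over train_data here.
        for _inputs, _labels in train_data:
            pass


def contract_problems(namespace):
    """List the ways a model script's namespace deviates from the criteria."""
    problems = []
    net = namespace.get("Net")
    if not inspect.isclass(net):
        problems.append("no class named Net")
    else:
        params = list(inspect.signature(net.__init__).parameters)[1:]  # drop self
        missing = [p for p in REQUIRED_INIT_PARAMS if p not in params]
        if missing:
            problems.append(f"Net.__init__ lacks parameters: {missing}")
        for method in REQUIRED_METHODS:
            if not callable(getattr(net, method, None)):
                problems.append(f"Net.{method} is missing")
    if not callable(namespace.get("supported_hyperparameters")):
        problems.append("supported_hyperparameters() is missing")
    return problems


def prm_problems(prm, supported):
    """Check that prm uses only supported keys with values in [0.0, 1.0]."""
    problems = [f"unsupported hyperparameter: {k}" for k in prm if k not in supported]
    problems += [f"{k}={v} is outside [0.0, 1.0]" for k, v in prm.items()
                 if not 0.0 <= v <= 1.0]
    return problems


script = {"Net": Net, "supported_hyperparameters": supported_hyperparameters}
print(contract_problems(script))
print(prm_problems({'lr': 0.24, 'momentum': 0.93, 'dropout': 0.51},
                   supported_hyperparameters()))
```

For a model file on disk, the same check could be applied to vars() of the imported module instead of the hand-built dictionary.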

Available Modules

The nn-dataset package includes the following key modules:

  1. Dataset:

    • Predefined neural network architectures such as AlexNet, ResNet, VGG, and more.
    • Located in ab.nn.nn.
  2. Loaders:

    • Data loaders for datasets such as CIFAR-10 and COCO.
    • Located in ab.nn.loader.
  3. Metrics:

    • Common evaluation metrics like accuracy and IoU.
    • Located in ab.nn.metric.
  4. Utilities:

    • Helper functions for training and statistical analysis.
    • Located in ab.nn.util.
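
After installation, one way to confirm that the modules listed above are available is to look them up by their dotted names. The sketch below uses importlib from the standard library; the missing_modules helper is hypothetical. Note that looking up a submodule imports its parent packages, so this may also load the package's own dependencies.

```python
import importlib.util

MODULES = ["ab.nn.nn", "ab.nn.loader", "ab.nn.metric", "ab.nn.util"]


def missing_modules(names):
    """Return the dotted module names that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is absent or cannot be imported.
            found = False
        if not found:
            missing.append(name)
    return missing


# An empty list means every listed module was found.
print(missing_modules(MODULES))
```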

Citation

If you find the LEMUR Neural Network Dataset to be useful for your research, please consider citing:

@misc{ABrain-One.NN-Dataset,
  author       = {Goodarzi, Arash Torabi and Kochnev, Roman and Khalid, Waleed and Qin, Furui and Uzun, Tolgay Atinc and Kathiriya, Yash Kanubhai and Dhameliya, Yashkumar Sanjaybhai and Bentyn, Zofia Antonina and Ignatov, Dmitry and Timofte, Radu},
  title        = {LEMUR Neural Network Dataset: Towards Seamless AutoML},
  howpublished = {\url{https://github.com/ABrain-One/nn-dataset}},
  year         = {2024},
}

Licenses

This project is distributed under the following licensing terms:

  • for neural network models adopted from other projects
  • all neural network models and their weights not covered by the above licenses, as well as all other files and assets in this project, are subject to the MIT license

The idea of Dr. Dmitry Ignatov

