Neural Network Dataset

Short alias: lmur

LEMUR - Learning, Evaluation, and Modeling for Unified Research

The original version of the LEMUR dataset was created by Arash Torabi Goodarzi, Roman Kochnev and Zofia Antonina Bentyn at the Computer Vision Laboratory, University of Würzburg, Germany.

Overview 📖

The NN Dataset project provides flexibility for dynamically combining various deep learning tasks, datasets, metrics, and neural network models. It is designed to facilitate the verification of neural network performance under various combinations of training hyperparameters and data transformation algorithms by automatically generating performance statistics. Developed to support the NNGPT project, this dataset contains neural network models modified or generated by NNGPT's large language models, with names carrying alphanumeric postfixes (e.g., AlexNet-bb84fa5d5dd8).

Create and Activate a Virtual Environment (recommended)

For Linux/Mac:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip

For Windows:

python3 -m venv .venv
.venv\Scripts\activate
python -m pip install --upgrade pip

It is assumed that CUDA 12.6 is installed; otherwise, consider replacing 'cu126' with the appropriate version. Some neural network training tasks require GPUs with at least 24 GB of memory.

Installation or Update of the NN Dataset with pip

Remove an old version of the LEMUR Dataset and its database:

pip uninstall nn-dataset -y
rm -rf db

Installing the stable version:

pip install nn-dataset --upgrade --extra-index-url https://download.pytorch.org/whl/cu126
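If a different CUDA release (or no GPU at all) is installed, the same command works with a different PyTorch wheel index; the tags below (cu118, cpu) are examples and should be checked against the wheels PyTorch currently publishes:

```shell
# CUDA 11.8 wheels (adjust the tag to your installed CUDA version):
pip install nn-dataset --upgrade --extra-index-url https://download.pytorch.org/whl/cu118

# CPU-only wheels, for machines without a CUDA-capable GPU:
pip install nn-dataset --upgrade --extra-index-url https://download.pytorch.org/whl/cpu
```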

Installing from GitHub to get the most recent code and statistics updates:

pip install git+https://github.com/ABrain-One/nn-dataset --upgrade --force --extra-index-url https://download.pytorch.org/whl/cu126

Adding functionality to export data to Excel files and generate plots for analyzing neural network performance:

pip install nn-dataset[stat] --upgrade --extra-index-url https://download.pytorch.org/whl/cu126

and then export the data and generate the plots:

python -m ab.stat.export

Environment for NN Dataset Contributors

Pip package manager

Create a virtual environment, activate it, and run the following command to install all the project dependencies:

python -m pip install --upgrade pip
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu126

Usage

Standard use cases:

  1. Add a new neural network model into the ab/nn/nn directory.
  2. Run the automated training process for this model (e.g., a new ComplexNet training pipeline configuration):
python -m ab.nn.train -c img-classification_cifar-10_acc_ComplexNet

or, for all image segmentation models, using a fixed range of training parameters and a fixed transform:

. train.sh -c img-segmentation -f echo --min_learning_rate 1e-4 -l 1e-2 --min_momentum 0.8 -m 0.99 --min_batch_binary_power 2 -b 6

train.sh internally calls ab.nn.train, offering a shorter way to run the program. Both scripts accept the same input flags and can be used interchangeably.

Reproducing Results with Fixed Training Parameters

To reproduce previously obtained results, provide fixed values for the training parameters in JSON format. The parameter names should match those returned by the supported_hyperparameters() function of the NN model.

Example command:

. train.sh -c img-classification_cifar-10_acc_ComplexNet -f complex -p '{"lr": 0.017, "momentum": 0.022, "batch": 32}'

where:

-c specifies the training configuration,

-f selects the preprocessing pipeline,

-p sets the hyperparameters explicitly (e.g., learning rate, momentum, batch size) using a JSON string.
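As a quick sanity check before launching a run, the JSON string passed to -p can be parsed and compared against the model's supported hyperparameters. The helper below is a hypothetical sketch (not part of the project), using an example parameter set:

```python
import json

# Hyperparameters the model reports via supported_hyperparameters()
# (example set; query the actual model in practice).
supported = {'lr', 'momentum', 'batch'}

def parse_prm(json_str, supported):
    """Parse a '-p' style JSON string and reject unknown parameter names."""
    prm = json.loads(json_str)
    unknown = set(prm) - supported
    if unknown:
        raise ValueError(f"Unsupported hyperparameters: {sorted(unknown)}")
    return prm

prm = parse_prm('{"lr": 0.017, "momentum": 0.022, "batch": 32}', supported)
```

The parameter names in the JSON string must match those returned by the model's supported_hyperparameters() function, or the run cannot be reproduced faithfully.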

To view supported flags:
. train.sh -h

Docker

All versions of this project are compatible with AI Linux and can be seamlessly executed within the AI Linux Docker container.

Example of training a LEMUR neural network within the AI Linux container (Linux host):

Installing the latest version of the project from GitHub

docker run --rm -u $(id -u):ab -v $(pwd):/a/mm abrainone/ai-linux bash -c "[ -d nn-dataset ] && git -C nn-dataset pull || git -c advice.detachedHead=false clone --depth 1 https://github.com/ABrain-One/nn-dataset"

Running script

docker run --rm -u $(id -u):ab --shm-size=16G -v $(pwd)/nn-dataset:/a/mm abrainone/ai-linux bash -c ". train.sh -c img-classification_cifar-10_acc_ComplexNet -f complex -l 0.017 --min_learning_rate 0.013 -m 0.025 --min_momentum 0.022 -b 9 --min_batch_binary_power 9"

If recently added dependencies are missing in AI Linux, you can create a container from the Docker image abrainone/ai-linux, install the missing packages (preferably with pip install <package name>), and then create a new image from the container with docker commit <container name> <new image name>. This new image can be used locally or pushed to a registry for deployment on a compute cluster.
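The steps above can be sketched as follows; the container and image names are placeholders, and the package to install depends on what is missing:

```shell
# Start an interactive container from the base image:
docker run -it --name ai-linux-patched abrainone/ai-linux bash

# Inside the container, install the missing package, then exit:
#   pip install <package name>
#   exit

# Snapshot the patched container as a new image:
docker commit ai-linux-patched my-registry/ai-linux:patched

# Optionally push it for use on the cluster:
docker push my-registry/ai-linux:patched
```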

Contribution

To contribute a new neural network (NN) model to the NN Dataset, please ensure the following criteria are met:

  1. The code for each model is provided in a separate ".py" file within the /ab/nn/nn directory, and the file is named after the model's structure.
  2. The main class for each model is named Net.
  3. The constructor of the Net class takes the following parameters:
    • in_shape (tuple): The shape of the first tensor from the dataset iterator. For images it is structured as (batch, channel, height, width).
    • out_shape (tuple): Provided by the dataset loader, it describes the shape of the output tensor. For a classification task, this could be (number of classes,).
    • prm (dict): A dictionary of hyperparameters, e.g., {'lr': 0.24, 'momentum': 0.93, 'dropout': 0.51}.
    • device (torch.device): The PyTorch device used for model training.
  4. All external information required for the correct building and training of the NN model for a specific dataset/transformer, as well as the list of hyperparameters, is extracted from in_shape, out_shape or prm, e.g.:
    batch = in_shape[0]
    channel_number = in_shape[1]
    image_size = in_shape[2]
    class_number = out_shape[0]
    learning_rate = prm['lr']
    momentum = prm['momentum']
    dropout = prm['dropout']
  5. Every model script has a function returning the set of supported hyperparameters, e.g.:
    def supported_hyperparameters(): return {'lr', 'momentum', 'dropout'}
    The value of each hyperparameter lies within the range 0.0 to 1.0.
  6. Every Net class implements two methods:
    train_setup(self, prm)
    and
    learn(self, train_data)
    The first method initializes the criterion and optimizer, while the second implements the training pipeline. See a simple implementation in the AlexNet model.
  7. For each pull request involving a new NN model, please generate and submit training statistics for 100 Optuna trials (or at least 3 trials for very large models) in the ab/nn/stat directory. The trials should cover 5 epochs of training. Ensure that these statistics are included along with the model in your pull request. For example, the statistics for the ComplexNet model are stored in files <epoch number>.json inside the folder img-classification_cifar-10_acc_ComplexNet, and can be generated by:
python run.py -c img-classification_cifar-10_acc_ComplexNet -t 100 -e 5

See more examples of models in /ab/nn/nn and generated statistics in /ab/nn/stat.
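Taken together, the criteria above can be illustrated with a framework-free sketch of the required interface. This is illustrative only; a real contribution subclasses torch.nn.Module, builds actual layers, and trains on the given device:

```python
# Framework-free sketch of the contributor interface (illustrative only;
# real models subclass torch.nn.Module and implement real layers).

def supported_hyperparameters():
    # Each hyperparameter value supplied in prm lies in [0.0, 1.0].
    return {'lr', 'momentum', 'dropout'}

class Net:
    def __init__(self, in_shape, out_shape, prm, device):
        # in_shape: (batch, channel, height, width) for image tasks.
        batch, channels, height, width = in_shape
        # out_shape: e.g. (number of classes,) for classification.
        self.class_number = out_shape[0]
        # All hyperparameters arrive through prm.
        self.lr = prm['lr']
        self.momentum = prm['momentum']
        self.dropout = prm['dropout']
        self.device = device

    def train_setup(self, prm):
        # Initialize the criterion and optimizer here,
        # e.g. CrossEntropyLoss and SGD in a real model.
        pass

    def learn(self, train_data):
        # One training pass over train_data; record metrics here.
        pass
```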

Available Modules

The NN Dataset includes the following key modules within the ab.nn package:

  • nn: Predefined neural network architectures, including models like AlexNet, ResNet, VGG, and more.
  • loader: Data loading utilities for popular datasets such as CIFAR-10, COCO, and others.
  • metric: Evaluation metrics supported for model assessment, such as accuracy, Intersection over Union (IoU), and others.
  • transform: A collection of data transformation algorithms for dataset preprocessing and augmentation.
  • stat: Statistics for different neural network model training pipelines.
  • util: Utility functions designed to assist with training, model evaluation, and statistical analysis.

Citation

If you find the LEMUR Neural Network Dataset to be useful for your research, please consider citing our article:

@article{ABrain.NN-Dataset,
  title={LEMUR Neural Network Dataset: Towards Seamless AutoML},
  author={Goodarzi, Arash Torabi and Kochnev, Roman and Khalid, Waleed and Qin, Furui and Uzun, Tolgay Atinc and Dhameliya, Yashkumar Sanjaybhai and Kathiriya, Yash Kanubhai and Bentyn, Zofia Antonina and Ignatov, Dmitry and Timofte, Radu},
  journal={arXiv preprint arXiv:2504.10552},
  year={2025}
}

Licenses

This project is distributed under the following licensing terms:

  • neural network models adopted from other projects remain subject to the license terms of their original projects;
  • all neural network models and their weights not covered by the above licenses, as well as all other files and assets in this project, are subject to the MIT license.

The idea and leadership of Dr. Ignatov

