LLM-Based Neural Network Generator

GPT-Driven Neural Network Generator

short alias lmurg


Overview 📖

NNGPT is a Python project that uses large language models (LLMs) to automate the creation of neural network architectures, streamlining the design process for machine learning practitioners. It uses neural networks from the LEMUR Dataset to fine-tune the LLMs, which then provide insights into potential architectures during the creation of new neural network models.

Create and Activate a Virtual Environment (recommended)

For Linux/Mac:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip

For Windows:

python -m venv .venv
.venv\Scripts\activate
python -m pip install --upgrade pip
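
Whether the environment is actually active can be checked from Python itself. The snippet below is a minimal sketch. The function name in_virtualenv is illustrative and not part of NNGPT. It relies on the standard behaviour that sys.prefix differs from sys.base_prefix inside an environment created with venv.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the first line reports False after activation, the shell is most likely still using a system-wide interpreter.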

The installation commands in this document assume that CUDA 12.6 is installed. If you have a different CUDA version, replace 'cu126' in the index URLs with the tag for your version (for example, 'cu124' for CUDA 12.4).
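
The 'cu126' tag is the CUDA version with the dot removed and a 'cu' prefix. The following sketch derives the tag and the matching index URL from a version string. The helper names cuda_wheel_tag and pytorch_index_url are hypothetical and not part of NNGPT. PyTorch publishes wheels only for selected CUDA versions, so a derived URL is not guaranteed to exist.

```python
def cuda_wheel_tag(cuda_version: str) -> str:
    """Turn a CUDA version such as '12.6' into a wheel tag such as 'cu126'."""
    parts = cuda_version.strip().split(".")
    if len(parts) < 2:
        raise ValueError(f"expected 'major.minor', got {cuda_version!r}")
    major, minor = int(parts[0]), int(parts[1])  # a patch level, if any, is ignored
    return f"cu{major}{minor}"


def pytorch_index_url(cuda_version: str) -> str:
    """Build the extra index URL used in the pip commands of this document."""
    return f"https://download.pytorch.org/whl/{cuda_wheel_tag(cuda_version)}"


if __name__ == "__main__":
    print(pytorch_index_url("12.6"))
```

The resulting URL is the value to pass to --extra-index-url in the pip commands.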

Environment for NNGPT Developers

Pip package manager

Prerequisites for the mpi4py package (install one MPI implementation):

  • On Debian/Ubuntu systems, run:

       sudo apt install libmpich-dev    # for MPICH

       sudo apt install libopenmpi-dev  # for Open MPI

  • On Fedora/RHEL systems, run:

       sudo dnf install mpich-devel     # for MPICH

       sudo dnf install openmpi-devel   # for Open MPI

Create a virtual environment, activate it, and run the following command to install all the project dependencies:

python -m pip install --upgrade pip
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu126
pip install -r req-no-isolation.txt --no-build-isolation --extra-index-url https://download.pytorch.org/whl/cu126
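
After installation, it can be useful to confirm that the installed PyTorch build matches the CUDA tag used in the index URL. The sketch below assumes that torch is among the project dependencies, which the PyTorch index URL suggests but this document does not state. The function name torch_cuda_summary is illustrative.

```python
import importlib.util


def torch_cuda_summary():
    """Report the CUDA version torch was built for and whether a GPU is usable.

    Returns None when torch is not installed in the current environment.
    """
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    return {
        "torch": torch.__version__,
        # CUDA version of the build, or None for CPU-only builds
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


print(torch_cuda_summary())
```

A CPU-only build or a CUDA version that differs from the one you expected usually means that a different index URL is needed.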

If there are installation problems, install the dependencies from the 'requirements.txt' file one by one.
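
Installing the dependencies one by one can be scripted. The following is a minimal sketch, not part of NNGPT: the function names are hypothetical, and it assumes a plain requirements.txt with one requirement per line and the same CUDA 12.6 index URL as in the commands above.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: the same extra index as in the pip commands above (CUDA 12.6 wheels).
EXTRA_INDEX_URL = "https://download.pytorch.org/whl/cu126"


def read_requirements(text: str) -> list[str]:
    """Return requirement specifiers, skipping blanks, comments and pip options."""
    requirements = []
    for line in text.splitlines():
        line = line.split(" #", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        requirements.append(line)
    return requirements


def install_one_by_one(path: str = "requirements.txt") -> list[str]:
    """Install each requirement separately and return the ones that failed."""
    failed = []
    for requirement in read_requirements(Path(path).read_text()):
        command = [sys.executable, "-m", "pip", "install", requirement,
                   "--extra-index-url", EXTRA_INDEX_URL]
        if subprocess.run(command).returncode != 0:
            failed.append(requirement)
    return failed
```

Calling install_one_by_one() from the project root and printing the returned list shows which packages need attention, while all other packages are still installed.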

Updating the NN Dataset

Remove the old version and install the LEMUR Dataset from GitHub to get the most recent code and statistics updates:

rm -rf db
pip uninstall nn-dataset -y
pip install git+https://github.com/ABrain-One/nn-dataset --upgrade --force-reinstall --extra-index-url https://download.pytorch.org/whl/cu126

Installing the stable version:

pip install nn-dataset --upgrade --extra-index-url https://download.pytorch.org/whl/cu126

To add functionality for exporting data to Excel files and generating plots for analyzing neural network performance, install nn-stat:

pip install nn-stat --upgrade --extra-index-url https://download.pytorch.org/whl/cu126

Then run the export and plot generation:

python -m ab.stat.export

Docker

All versions of this project are compatible with AI Linux and can be seamlessly executed within the AI Linux Docker container.

Installing the latest version of the project from GitHub

docker run --rm -u $(id -u):ab -v $(pwd):/a/mm abrainone/ai-linux bash -c "[ -d nn-gpt ] && git -C nn-gpt pull || git -c advice.detachedHead=false clone --depth 1 https://github.com/ABrain-One/nn-gpt"

Running a script

docker run --rm -u $(id -u):ab --shm-size=16G -v $(pwd)/nn-gpt:/a/mm abrainone/ai-linux bash -c "python -m ab.gpt.TuneNNGen_8B"

Recently added dependencies might be missing from the AI Linux image. In this case, create a container from the Docker image abrainone/ai-linux, install the missing packages (preferably with pip install <package name>), and then create a new image from the container with docker commit <container name> <new image name>. You can use this new image locally or push it to a registry for deployment on a computer cluster.

Use

  • ab.gpt.NNAlter*.py – Generates modified neural network models.
    Use the -e argument to set the number of epochs for the initial CV model generation.

  • ab.gpt.NNEval.py – Evaluates the models generated in the previous step.

  • ab.gpt.TuneNNGen*.py – Performs fine-tuning and evaluation of an LLM. For evaluation purposes, the LLM generates neural network models, which are then trained to assess improvements in the LLM’s performance on this task. The -s flag allows skipping model generation for the specified number of epochs.

Pretrained LLM weights

Citation

The original version of this project was created at the Computer Vision Laboratory of the University of Würzburg by the authors mentioned below. If you find this project useful for your research, please consider citing our articles on the NNGPT framework and on hyperparameter tuning:

@article{ABrain.NNGPT,
  title        = {NNGPT: Rethinking AutoML with Large Language Models},
  author       = {Kochnev, Roman and Khalid, Waleed and Uzun, Tolgay Atinc and Zhang, Xi and Dhameliya, Yashkumar Sanjaybhai and Qin, Furui and Ignatov, Dmitry and Timofte, Radu},
  year         = {2025}
}

@article{ABrain.HPGPT,
  title={Optuna vs Code Llama: Are LLMs a New Paradigm for Hyperparameter Tuning?},
  author={Kochnev, Roman and Goodarzi, Arash Torabi and Bentyn, Zofia Antonina and Ignatov, Dmitry and Timofte, Radu},
  journal={arXiv preprint arXiv:2504.06006},
  year={2025}
}

Licenses

This project is distributed under the following licensing terms:

  • models with pretrained weights under the legacy DeepSeek LLM V2 license
  • all neural network models and their weights not covered by the above licenses, as well as all other files and assets in this project, are subject to the MIT license

The idea and leadership of Dr. Ignatov

Download files

Source Distribution

nn_gpt-1.0.2.tar.gz (88.2 kB)

Uploaded Source

Built Distribution

nn_gpt-1.0.2-py3-none-any.whl (208.4 kB)

Uploaded Python 3

File details

Details for the file nn_gpt-1.0.2.tar.gz.

File metadata

  • Download URL: nn_gpt-1.0.2.tar.gz
  • Upload date:
  • Size: 88.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for nn_gpt-1.0.2.tar.gz

Algorithm    Hash digest
SHA256       6ecf65cd8c48a9e346fb2a4d7e37706fd863d8d471e2182ee026559754fdac4a
MD5          0c24ff300e1f58b2f35fb9e6cc54ae02
BLAKE2b-256  437741308234f58d0dc4d08944bc933ec2069cd0d8c99012b033cd0d03ad9b40

File details

Details for the file nn_gpt-1.0.2-py3-none-any.whl.

File metadata

  • Download URL: nn_gpt-1.0.2-py3-none-any.whl
  • Upload date:
  • Size: 208.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for nn_gpt-1.0.2-py3-none-any.whl

Algorithm    Hash digest
SHA256       0658785e913245691983f281dc16f601eb7d0b18ccfae1d62457a5302c342d9e
MD5          cc6ef41b15a802f3d2ef589c72af71a2
BLAKE2b-256  d43adef8b74f7be671865cf6e9c6948cf4d016e54ff6b5c79abe829bec6dc20d
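
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only the Python standard library. The helper names sha256_of and matches are hypothetical, and the file is assumed to be in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For an intact download of the source distribution, matches("nn_gpt-1.0.2.tar.gz", "6ecf65cd8c48a9e346fb2a4d7e37706fd863d8d471e2182ee026559754fdac4a") should return True.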
