LLM-Based Neural Network Generator (NNGPT)
Short PyPI alias: lmurg
Overview 📖
This Python-based NNGPT project uses large language models (LLMs) to automate the creation of neural network architectures, streamlining the design process for machine learning practitioners. It draws on the neural networks of the LEMUR Dataset to fine-tune LLMs and to provide insights into potential architectures during the creation of new neural network models.
Create and Activate a Virtual Environment (recommended)
For Linux/Mac:
python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
For Windows:
python -m venv .venv
.venv\Scripts\activate
python -m pip install --upgrade pip
It is assumed that CUDA 12.6 is installed; otherwise, consider replacing 'cu126' with the appropriate version. Most LLM usage scenarios require GPUs with at least 24 GB of memory.
Installation of NNGPT with pip
pip install nn-gpt --extra-index-url https://download.pytorch.org/whl/cu126
pip install nn-gpt[flash] --no-build-isolation --extra-index-url https://download.pytorch.org/whl/cu126
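After installation, it can be worth verifying that PyTorch sees a CUDA device and that it has enough memory (at least 24 GB is recommended above). The following is an optional sanity-check sketch, not part of nn-gpt itself; it degrades gracefully when PyTorch or a GPU is absent.

```python
import importlib.util

def cuda_status() -> str:
    """Return a one-line summary of the local PyTorch/CUDA setup."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch
    if not torch.cuda.is_available():
        return "PyTorch installed, but no CUDA device is visible"
    props = torch.cuda.get_device_properties(0)
    mem_gb = props.total_memory / 1024 ** 3  # bytes -> GiB
    return f"{props.name}: {mem_gb:.1f} GB (CUDA {torch.version.cuda})"

print(cuda_status())
```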
Environment for NNGPT Developers
Pip package manager
Create a virtual environment, activate it, and run the following command to install all the project dependencies:
python -m pip install --upgrade pip
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu126
pip install -r req-no-isolation.txt --no-build-isolation --extra-index-url https://download.pytorch.org/whl/cu126
If installation problems occur, install the dependencies from the requirements.txt file one at a time.
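One way to install the pinned dependencies one at a time, so the first failing package is easy to identify, is a simple read loop. The sketch below runs as a dry run over a sample file (the package names are placeholders); drop the leading echo to actually install.

```shell
# Sample requirements file with placeholder package names
printf '%s\n' 'torch' '# a comment' 'transformers' > /tmp/reqs-example.txt

while IFS= read -r pkg; do
  # Skip blank lines and comment lines
  case "$pkg" in ''|'#'*) continue ;; esac
  echo pip install "$pkg" --extra-index-url https://download.pytorch.org/whl/cu126
done < /tmp/reqs-example.txt
```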
Update of NN Dataset
Remove the old version and install the LEMUR Dataset from GitHub to get the most recent code and statistics updates:
rm -rf db
pip install git+https://github.com/ABrain-One/nn-dataset --upgrade --force --extra-index-url https://download.pytorch.org/whl/cu126
Installing the stable version:
pip install nn-dataset --upgrade --extra-index-url https://download.pytorch.org/whl/cu126
Adding functionality to export data to Excel files and generate plots for analyzing neural network performance:
pip install nn-stat --upgrade --extra-index-url https://download.pytorch.org/whl/cu126
and then export the data and generate the plots:
python -m ab.stat.export
Docker
All versions of this project are compatible with AI Linux and can be seamlessly executed within the AI Linux Docker container.
Installing the latest version of the project from GitHub
docker run --rm -u $(id -u):ab -v $(pwd):/a/mm abrainone/ai-linux bash -c "[ -d nn-gpt ] && git -C nn-gpt pull || git -c advice.detachedHead=false clone --depth 1 https://github.com/ABrain-One/nn-gpt"
Running a script
docker run --rm -u $(id -u):ab --shm-size=16G -v $(pwd)/nn-gpt:/a/mm abrainone/ai-linux bash -c "python -m ab.gpt.TuneNNGen_8B"
If recently added dependencies are missing from AI Linux, create a container from the Docker image abrainone/ai-linux, install the missing packages (preferably using pip install <package name>), and then create a new image from the container using docker commit <container name> <new image name>. This new image can be used locally or pushed to a registry for deployment on the computer cluster.
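The image-extension workflow above can be sketched as the following command sequence. The container and image names (nngpt-dev, ai-linux-custom) are placeholders; the run helper only prints each command as a dry run, so replace it with direct execution when applying this for real.

```shell
# Dry-run helper: prints the command instead of executing it
run() { printf '+ %s\n' "$*"; }

run docker run -it --name nngpt-dev abrainone/ai-linux bash
run pip install '<package name>'             # executed inside the container
run docker commit nngpt-dev ai-linux-custom  # snapshot the container as a new image
run docker push ai-linux-custom              # optional: publish for the cluster
```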
Use
- ab.gpt.NNAlter*.py – generates modified neural network models. Use the -e argument to set the number of epochs for the initial CV model generation.
- ab.gpt.NNEval.py – evaluates the models generated in the previous step.
- ab.gpt.TuneNNGen*.py – performs fine-tuning and evaluation of an LLM. For evaluation purposes, the LLM generates neural network models, which are then trained to assess improvements in the LLM's performance on this task. The -s flag allows skipping model generation for the specified number of epochs.
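The pipeline above might be invoked as follows. TuneNNGen_8B appears earlier in this document; NNAlter8B is a hypothetical concrete variant of the NNAlter* pattern, and the flag values are examples. The run helper only prints each command (dry run); replace it with direct execution in a configured environment.

```shell
# Dry-run helper: prints the command instead of executing it
run() { printf '+ %s\n' "$*"; }

run python -m ab.gpt.NNAlter8B -e 1     # generate altered models (1 epoch of initial CV training)
run python -m ab.gpt.NNEval             # evaluate the generated models
run python -m ab.gpt.TuneNNGen_8B -s 1  # fine-tune the LLM, skipping generation for 1 epoch
```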
Citation
The original version of this project was created at the Computer Vision Laboratory of the University of Würzburg by the authors mentioned below. If you find this project useful for your research, please consider citing our articles on the NNGPT framework and hyperparameter tuning:
@InProceedings{ABrain.HPGPT,
title={Optuna vs Code Llama: Are LLMs a New Paradigm for Hyperparameter Tuning?},
author={Kochnev, Roman and Goodarzi, Arash Torabi and Bentyn, Zofia Antonina and Ignatov, Dmitry and Timofte, Radu},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)},
year={2025}
}
@article{ABrain.NNGPT,
title = {NNGPT: Rethinking AutoML with Large Language Models},
author = {Kochnev, Roman and Khalid, Waleed and Uzun, Tolgay Atinc and Zhang, Xi and Dhameliya, Yashkumar Sanjaybhai and Qin, Furui and Ignatov, Dmitry and Timofte, Radu},
year = {2025}
}
Licenses
This project is distributed under the following licensing terms:
- models with pretrained weights under the legacy DeepSeek LLM V2 license
- all neural network models and their weights not covered by the above licenses, as well as all other files and assets in this project, are subject to the MIT license
Project idea and leadership: Dr. Ignatov