
atomgpt


AtomGPT: atomistic generative pre-trained transformer for forward and inverse materials design

Large language models (LLMs) such as ChatGPT have shown immense potential for various commercial applications, but their applicability for materials design remains underexplored. In this work, AtomGPT is introduced as a model specifically developed for materials design based on transformer architectures, demonstrating capabilities for both atomistic property prediction and structure generation tasks. This study shows that a combination of chemical and structural text descriptions can efficiently predict material properties with accuracy comparable to graph neural network models, including formation energies, electronic bandgaps from two different methods, and superconducting transition temperatures. Furthermore, AtomGPT can generate atomic structures for tasks such as designing new superconductors, with the predictions validated through density functional theory calculations. This work paves the way for leveraging LLMs in forward and inverse materials design, offering an efficient approach to the discovery and optimization of materials.

AtomGPT layer schematic

Both forward and inverse models take a config.json file as input. The config file provides basic training parameters and the path to an id_prop.csv file, similar to the ALIGNN (https://github.com/usnistgov/alignn) model. See an example here: id_prop.csv.
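As a rough illustration of the input described above, the following sketch writes and reads back a small id_prop.csv. It assumes the two-column, header-free ALIGNN-style layout (structure file name, target value); the file names and target values are hypothetical and are not taken from the AtomGPT examples.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical structure file names and target values, for illustration only.
EXAMPLE_ROWS = [
    ("POSCAR-1.vasp", 0.52),
    ("POSCAR-2.vasp", 1.10),
]


def write_id_prop(rows, path):
    """Write (file name, target value) pairs, one per line, without a header."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


def read_id_prop(path):
    """Read the pairs back, converting each target value to float."""
    with open(path, newline="") as handle:
        return [(name, float(value)) for name, value in csv.reader(handle)]


# Round-trip the example rows through a temporary id_prop.csv.
with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "id_prop.csv"
    write_id_prop(EXAMPLE_ROWS, csv_path)
    rows_back = read_id_prop(csv_path)

print(rows_back)
```

In practice the structure files named in the first column would sit alongside the CSV file, and the config.json would point to that location.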

Installation

First, create a conda environment. Install Miniconda from https://conda.io/miniconda.html. Depending on your system, the downloaded installer will be named something like 'Miniconda3-latest-XYZ'.

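Now run the installer:

bash Miniconda3-latest-Linux-x86_64.sh (for Linux)
bash Miniconda3-latest-MacOSX-x86_64.sh (for macOS)

On Windows, download the 32- or 64-bit Python 3.10 Miniconda installer (.exe) and run it.

Then create and activate an environment, and install AtomGPT from source: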

conda create --name my_atomgpt python=3.10
conda activate my_atomgpt
git clone https://github.com/usnistgov/atomgpt.git
cd atomgpt
pip install -q -r dev-requirements.txt
pip install -q -e .

Alternatively, AtomGPT can be installed with pip:

pip install atomgpt
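After either installation route, a quick check from Python can confirm that the package is visible in the active environment. The sketch below uses only the standard library; the helper name installed_version is hypothetical, and the distribution name "atomgpt" is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("atomgpt")
if version is None:
    print("atomgpt is not installed in this environment")
else:
    print(f"atomgpt {version} is installed")
```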

Forward model example (structure to property)

Forward models are surrogate models that predict properties from atomic structures. They require text input, which can be either raw POSCAR-type files or a text description of the material. Language models such as Google T5 or OpenAI GPT-2 can then be used with a customized language head to accomplish this task. The description of a material is generated with the ChemNLP/describer function. If you set convert to False, you can also train on bare POSCAR files.

atomgpt_forward --config_name atomgpt/examples/forward_model/config.json
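To make the idea of a text description concrete, here is a minimal sketch that turns a POSCAR string into a short sentence-style description. This is not the ChemNLP describer and does not reproduce its output; the function describe_poscar, the wording of the description, and the example lattice values are all assumptions made for illustration. It assumes a VASP 5 style POSCAR with a positive scale factor and an element-symbol line.

```python
import math

# Illustrative two-atom silicon cell; the numbers are approximate and hypothetical.
POSCAR = """\
Si2 example
1.0
0.0 2.7 2.7
2.7 0.0 2.7
2.7 2.7 0.0
Si
2
Direct
0.00 0.00 0.00
0.25 0.25 0.25
"""


def describe_poscar(text):
    """Build a short text description from a VASP 5 style POSCAR string."""
    lines = [line.strip() for line in text.strip().splitlines()]
    scale = float(lines[1])  # assumed positive (a plain scaling factor)
    lattice = [[scale * float(x) for x in lines[i].split()] for i in (2, 3, 4)]
    elements = lines[5].split()
    counts = [int(n) for n in lines[6].split()]
    # Length of each lattice vector.
    lengths = [math.sqrt(sum(c * c for c in vec)) for vec in lattice]
    formula = "".join(
        f"{el}{n if n > 1 else ''}" for el, n in zip(elements, counts)
    )
    return (
        f"The chemical formula is {formula}. "
        f"The lattice lengths are {lengths[0]:.2f}, {lengths[1]:.2f}, "
        f"{lengths[2]:.2f} angstrom. "
        f"The cell contains {sum(counts)} atoms."
    )


description = describe_poscar(POSCAR)
print(description)
```

A string of this kind, paired with a target value, is the sort of text-to-number example a forward model is trained on; the real describer includes richer chemical and structural information.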

Inverse model example (property to structure)

Inverse models generate materials given a property and a description such as the chemical formula. Currently we use the Mistral model, but other models such as Gemma and Llama can also be used easily. After structure generation, the structure can be optimized with the ALIGNN-FF model (example here), and a few selected candidates can then be subjected to density functional theory calculations using JARVIS-DFT or a similar workflow (tutorial for example here). Note that inverse model training as well as inference currently requires GPUs.

atomgpt_inverse --config_name atomgpt/examples/inverse_model/config.json
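Because inverse training and inference need a GPU, a small preflight check before launching the command can save a failed run. The sketch below is one possible way to do this, assuming a PyTorch backend; the helper names gpu_available, inverse_command, and preflight are hypothetical and are not part of AtomGPT.

```python
import shutil


def gpu_available() -> bool:
    """Return True if PyTorch is importable and reports a CUDA device."""
    try:
        import torch  # assumption: the GPU backend in use is PyTorch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def inverse_command(config_path: str) -> list:
    """Build the command line shown above as an argument list."""
    return ["atomgpt_inverse", "--config_name", config_path]


def preflight(config_path: str) -> str:
    """Report whether the inverse model could be launched from this environment."""
    cmd = inverse_command(config_path)
    if shutil.which(cmd[0]) is None:
        return "atomgpt_inverse not found on PATH; install AtomGPT first"
    if not gpu_available():
        return "no CUDA GPU detected; inverse training and inference need one"
    return "ready: " + " ".join(cmd)


print(preflight("atomgpt/examples/inverse_model/config.json"))
```

When the check reports ready, the list returned by inverse_command can be passed to subprocess.run to start the job from a script.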

More detailed examples and case studies will be added here soon.

Google colab/Jupyter notebook

| Notebooks | Google Colab | Descriptions |
| --- | --- | --- |
| Forward/Inverse Model training | Open in Google Colab | Example of installing AtomGPT, training an inverse model on 5 sample materials, using the trained model for inference, relaxing structures with ALIGNN-FF, generating a database of atomic structures, and training a forward prediction model. |
| HuggingFace model inference | Open in Google Colab | AtomGPT structure generation/inference example with a model hosted on HuggingFace. |

For other similar notebook examples, see the JARVIS-Tools-Notebook Collection.

HuggingFace link

https://huggingface.co/knc6

References:

  1. AtomGPT: Atomistic Generative Pretrained Transformer for Forward and Inverse Materials Design
  2. ChemNLP: A Natural Language Processing based Library for Materials Chemistry Text Data
  3. JARVIS-Leaderboard
  4. NIST-JARVIS Infrastructure

How to contribute

For detailed instructions, please see the Contribution instructions.

Correspondence

Please report bugs as GitHub issues (https://github.com/usnistgov/atomgpt/issues) or email kamal.choudhary@nist.gov.

Funding support

NIST-MGI (https://www.nist.gov/mgi) and CHIPS (https://www.nist.gov/chips)

Code of conduct

Please see the Code of conduct.
