Deep and machine learning for atom-resolved data

AtomAI

What is AtomAI

AtomAI is a simple Python package for machine learning-based analysis of experimental atomic-scale and mesoscale data from electron and scanning probe microscopes. It does not require advanced knowledge of Python or machine learning. It is the next iteration of the AICrystallographer project.

How to use it

AtomAI has two main modules: atomnet and atomstat. The atomnet module trains neural networks (with just one line of code) and applies trained models to find atoms and defects in image data. The atomstat module takes the atomnet predictions and performs statistical analysis on the local image descriptors associated with the identified atoms and defects (e.g., principal component analysis of atomic distortions in a single image, or computing Gaussian mixture model components with transition probabilities for movies).

Quickstart: AtomAI in the Cloud

The easiest way to start using AtomAI is via Google Colab:

  1. Train a deep fully convolutional neural network for atom finding

  2. Multivariate statistical analysis of distortion domains in a single atomic image

  3. Variational autoencoders for analysis of structural transformations

  4. Prepare training data from experimental image with atomic coordinates

Model training

Below is an example of how one can train a neural network for atom/particle/defect finding with essentially one line of code:

import numpy as np
from atomai import atomnet

# Load your training/test data (as numpy arrays or lists of numpy arrays)
dataset = np.load('training_data.npz')
images_all, labels_all, images_test_all, labels_test_all = dataset.values()

# Train a model
trained_model = atomnet.train_single_model(
    images_all, labels_all, images_test_all, labels_test_all,  # train and test data
    gauss_noise=True, zoom=True,  # on-the-fly data augmentation
    training_cycles=500, swa=True)  # train for 500 iterations with stochastic weight averaging at the end
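
The loading step above unpacks four arrays in a fixed order. As a minimal sketch of how such an archive could be written and read back by name, the example below uses plain NumPy. The key names and array shapes are placeholders chosen for illustration, not a format required by AtomAI, and an in-memory buffer stands in for 'training_data.npz'.

```python
import io
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data: 8 training and 2 test images of 64x64 pixels (assumed shapes)
images_all = rng.random((8, 64, 64)).astype(np.float32)
labels_all = (rng.random((8, 64, 64)) > 0.9).astype(np.int64)
images_test_all = rng.random((2, 64, 64)).astype(np.float32)
labels_test_all = (rng.random((2, 64, 64)) > 0.9).astype(np.int64)

# Write the four arrays under explicit (hypothetical) key names
buffer = io.BytesIO()  # stands in for a file such as 'training_data.npz'
np.savez(
    buffer,
    images_all=images_all,
    labels_all=labels_all,
    images_test_all=images_test_all,
    labels_test_all=labels_test_all,
)
buffer.seek(0)

# Read the arrays back by name instead of relying on their order in the archive
dataset = np.load(buffer)
loaded = {key: dataset[key] for key in dataset.files}
```

Looking arrays up by key keeps the training call correct even if the archive was written in a different order.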

One can also train an ensemble of models instead of a single model. The averaged ensemble prediction is usually more accurate and reliable than that of a single model. The ensemble also provides an estimate of the prediction uncertainty for each pixel.

# Initialize ensemble trainer
etrainer = atomnet.ensemble_trainer(images_all, labels_all, images_test_all, labels_test_all,
                                    rotation=True, zoom=True, gauss_noise=True,  # on-the-fly data augmentation
                                    strategy="from_baseline", swa=True, n_models=30, model="dilUnet",
                                    training_cycles_base=1000, training_cycles_ensemble=100)
# Train deep ensemble of models
ensemble, amodel = etrainer.run()
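
To illustrate why an ensemble yields both a prediction and an uncertainty, the sketch below averages a stack of per-model outputs with plain NumPy. The random arrays stand in for real model outputs, and using the per-pixel variance as the uncertainty measure is an assumption made for illustration; it is not taken from AtomAI's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_models, height, width = 5, 4, 4

# Stand-in for the "raw" outputs of each ensemble member on one image
predictions = rng.random((n_models, height, width))

# Per-pixel mean over the model axis: the ensemble prediction
mean_prediction = predictions.mean(axis=0)

# Per-pixel variance over the model axis: a simple uncertainty estimate
variance = predictions.var(axis=0)
```

Pixels where the models disagree get a large variance, which flags regions where the averaged prediction should be trusted less.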

Prediction with trained model(s)

A trained model is used to find atoms/particles/defects in experimental data that the model has not seen before:

# Here we load new experimental data (as 2D or 3D numpy array)
expdata = np.load('expdata.npy')

# Initialize predictive object (can be reused for other datasets)
spredictor = atomnet.predictor(trained_model, use_gpu=True, refine=False)
# Get model's "raw" prediction, atomic coordinates and classes
nn_output, coord_class = spredictor.run(expdata)

One can also make a prediction with uncertainty estimates using the ensemble of models:

epredictor = atomnet.ensemble_predictor(amodel, ensemble, calculate_coordinates=True, eps=0.5)
(out_mu, out_var), (coord_mu, coord_var) = epredictor.run(expdata)

(Note: In some cases, it may be easier to get coordinates by simply running atomnet.locator(*args, **kwargs).run(out_mu) on the mean "raw" prediction of the ensemble.)
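
For intuition about how coordinates can be obtained from a "raw" prediction map, here is a minimal sketch of one common approach: threshold the map, group neighbouring pixels into blobs, and take each blob's intensity-weighted center of mass. The helper find_blob_centers and its threshold default are hypothetical and written for illustration; this is not AtomAI's locator.

```python
from collections import deque

import numpy as np


def find_blob_centers(prob_map, threshold=0.5):
    """Return (row, col) centers of mass of connected regions above threshold."""
    mask = prob_map > threshold
    visited = np.zeros(mask.shape, dtype=bool)
    n_rows, n_cols = mask.shape
    centers = []
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if not mask[r0, c0] or visited[r0, c0]:
                continue
            # Breadth-first flood fill over 4-connected neighbours
            queue = deque([(r0, c0)])
            visited[r0, c0] = True
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                    if inside and mask[rr, cc] and not visited[rr, cc]:
                        visited[rr, cc] = True
                        queue.append((rr, cc))
            pixels = np.array(pixels)
            weights = prob_map[pixels[:, 0], pixels[:, 1]]
            # Intensity-weighted center of mass of this blob
            centers.append(np.average(pixels, axis=0, weights=weights))
    return np.array(centers)


# Synthetic prediction map with two blobs
prob_map = np.zeros((10, 10))
prob_map[2:4, 2:4] = 0.9  # 2x2 blob
prob_map[7, 6:9] = 0.8    # 1x3 blob
centers = find_blob_centers(prob_map)
```

The threshold controls how confident a pixel must be before it counts toward a blob, so lowering it tends to merge nearby features.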

Statistical analysis

The information extracted by atomnet can be further used for statistical analysis of raw and "decoded" data. For example, for a single atom-resolved image of a ferroelectric material, one can identify domains with different ferroic distortions:

from atomai import atomstat

# Get local descriptors
# (here `coordinates` stands for the atomic coordinates returned by the predictor)
imstack = atomstat.imlocal(nn_output, coordinates, window_size=32, coord_class=1)

# Compute distortion "eigenvectors" with associated loading maps and plot results:
pca_results = imstack.imblock_pca(n_components=4, plot_results=True)
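
To make the idea of distortion "eigenvectors" and loadings concrete, the sketch below runs a principal component analysis on a stack of subimages using NumPy's SVD. The random stack and the variable names are placeholders for illustration; this shows the general technique and is not the code behind imblock_pca.

```python
import numpy as np

rng = np.random.default_rng(0)
n_images, window_size, n_components = 50, 8, 4

# Stand-in for a stack of subimages cropped around atoms
stack = rng.random((n_images, window_size, window_size))

# Flatten each subimage into a row vector and subtract the mean subimage
flat = stack.reshape(n_images, -1)
centered = flat - flat.mean(axis=0)

# Rows of vt are the principal directions, ordered by decreasing singular value
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Each component reshaped back into an image ("eigenimage")
eigenimages = vt[:n_components].reshape(n_components, window_size, window_size)

# Loadings: the weight of each component in each subimage
loadings = centered @ vt[:n_components].T
```

Plotting one column of loadings at the position of the atom each subimage came from gives a loading map for that component.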

For movies, one can extract trajectories of individual defects and calculate the transition probabilities between different classes:

# Get local descriptors (such as subimages centered around impurities)
imstack = atomstat.imlocal(nn_output, coordinates, window_size=32, coord_class=1)

# Calculate Gaussian mixture model (GMM) components
components, imgs, coords = imstack.gmm(n_components=10, plot_results=True)

# Calculate GMM components and transition probabilities for different trajectories
transitions_dict = imstack.transition_matrix(n_components=10, rmax=10)

# and more
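
As a minimal sketch of what a transition probability means here, suppose each frame of a defect's trajectory has been assigned a class label (for example, a GMM component index). Counting how often one class follows another and normalizing each row gives the transition matrix. The function below and the toy trajectory are hypothetical illustrations, not AtomAI's implementation.

```python
import numpy as np


def transition_matrix(labels, n_states):
    """Row-normalized counts of transitions between consecutive labels."""
    counts = np.zeros((n_states, n_states))
    for current, following in zip(labels[:-1], labels[1:]):
        counts[current, following] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Avoid division by zero: states that are never left keep an all-zero row
    safe_sums = np.where(row_sums > 0, row_sums, 1)
    return counts / safe_sums


# Toy trajectory: class label of one defect in consecutive frames
trajectory = [0, 0, 1, 2, 1, 0, 0, 1]
probs = transition_matrix(trajectory, n_states=3)
```

Entry probs[i, j] is the fraction of frames in class i that were followed by class j.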

Variational autoencoders

In addition to multivariate statistical analysis, one can use variational autoencoders (VAEs) in AtomAI to find, in an unsupervised fashion, the most effective reduced representation of the system's local descriptors. The VAEs can be applied to both raw data and NN output, but typically work better with the latter.

from atomai import atomstat, utils

# Get stack of subimages from a movie
imstack, com, frames = utils.extract_subimages(decoded_imgs, coords, window_size=32)

# Initialize and train rotationally-invariant VAE
rvae = atomstat.rVAE(imstack, latent_dim=2, training_cycles=200)
rvae.run()

# Visualize the learned manifold
rvae.manifold2d()
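
The VAE above is trained on a stack of subimages. Conceptually, building such a stack means cropping a fixed-size window around each coordinate and skipping windows that would fall outside the image. The helper crop_subimages below is a hypothetical NumPy sketch of that idea for a single 2D image; it does not reproduce utils.extract_subimages.

```python
import numpy as np


def crop_subimages(image, coords, window_size):
    """Crop square windows centered (to the nearest pixel) on each coordinate."""
    half = window_size // 2
    crops, kept = [], []
    for row, col in np.asarray(coords, dtype=float):
        top = int(round(row)) - half
        left = int(round(col)) - half
        bottom, right = top + window_size, left + window_size
        # Skip windows that would extend past the image border
        if top < 0 or left < 0 or bottom > image.shape[0] or right > image.shape[1]:
            continue
        crops.append(image[top:bottom, left:right])
        kept.append((row, col))
    return np.array(crops), np.array(kept)


image = np.arange(400, dtype=float).reshape(20, 20)
coords = [(10, 10), (1, 1), (15.4, 5.6)]  # the second point is too close to the edge
crops, kept = crop_subimages(image, coords, window_size=8)
```

Returning the kept coordinates alongside the crops makes it possible to map results from the stack back to positions in the image.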

Installation

First, install PyTorch. Then, install AtomAI via

pip install atomai
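
A quick way to confirm that both packages are visible to the current interpreter is to look them up without importing them. The helper name missing_packages is hypothetical, and the check uses only the standard library:

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["torch", "atomai"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("PyTorch and AtomAI are both available")
```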
