High-performance NVIDIA Warp primitives for GPU-enabled computational chemistry and atomistic simulation workflows.
NVIDIA ALCHEMI Toolkit-Ops
High-performance NVIDIA Warp primitives for computational chemistry
NVIDIA ALCHEMI Toolkit-Ops is a collection of GPU-optimized, batched primitives for
accelerating atomistic simulations. Its high-performance compute kernels are written
in NVIDIA warp-lang.
Key Features
- Neighborhood computation, including naive $O(N^2)$ and cell list $O(N)$ implementations
- Dispersion corrections via Becke-Johnson damped DFT-D3
Kernels are designed to scale to large systems (>100,000 atoms) and are optimized for high-throughput operation on GPUs (on the order of several microseconds per atom), with batching support.
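To make the difference between the two neighbor-search strategies listed under Key Features concrete, here is a minimal pure-Python sketch. It is not the package's implementation: it runs on the CPU, ignores periodic boundaries and batching, and the helper names `naive_pairs` and `cell_list_pairs` are hypothetical. It only illustrates why binning atoms into cells of edge length `cutoff` replaces the all-pairs scan with a search over the 27 surrounding cells.

```python
import itertools
import math
import random
from collections import defaultdict


def naive_pairs(positions, cutoff):
    """O(N^2): test every unordered pair of atoms against the cutoff."""
    pairs = set()
    for i, j in itertools.combinations(range(len(positions)), 2):
        if math.dist(positions[i], positions[j]) < cutoff:
            pairs.add((i, j))
    return pairs


def cell_list_pairs(positions, cutoff):
    """O(N) at fixed density: bin atoms into cubic cells of edge `cutoff`,
    then compare each atom only with atoms in the 27 surrounding cells."""
    cells = defaultdict(list)
    for idx, pos in enumerate(positions):
        key = tuple(math.floor(c / cutoff) for c in pos)
        cells[key].append(idx)

    pairs = set()
    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in offsets:
            # .get() avoids creating empty cells while iterating
            neighbors = cells.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in neighbors:
                    if i < j and math.dist(positions[i], positions[j]) < cutoff:
                        pairs.add((i, j))
    return pairs


# Both strategies should agree on a random, non-periodic test system.
random.seed(0)
points = [tuple(random.uniform(0.0, 20.0) for _ in range(3)) for _ in range(300)]
assert naive_pairs(points, 3.0) == cell_list_pairs(points, 3.0)
```

Because the cell edge equals the cutoff, any two atoms closer than the cutoff must sit in the same or adjacent cells, which is why checking 27 cells is sufficient.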
Use Cases
We currently see three primary use cases for nvalchemi-toolkit-ops in the
ecosystem:
- Library maintainers and developers are encouraged to benchmark and explore integrating functionality like neighbor list computation to accelerate existing workflows;
- Researchers and model developers ideally should be able to rely on this package (and not implement their own!) for neighbor list computation, interatomic interactions, and so on during method development;
- Engineers looking to build applications that involve molecular dynamics,
interatomic potentials, and the like can take advantage of optimized and
maintained low-level kernels.
warp-lang kernels should be sufficiently modular to allow for a high degree of flexibility and reusability.
The combination of being GPU-first and batched should enable the kernels contained
in nvalchemi-toolkit-ops to be ready for a wide range of research and production
applications.
Example Snippets
We encourage interested readers to browse our hosted documentation. Below are some short snippets that highlight our straightforward API and use cases:
Neighbor list in a 2D unit cell with 50,000 atoms
This example uses PyTorch:
import torch
from nvalchemiops.neighborlist import neighbor_list
torch.set_default_dtype(torch.float32)
torch.set_default_device(torch.device("cuda"))
NUM_ATOMS = 50_000
# arbitrarily scale positions
positions = torch.randn((NUM_ATOMS, 3)) * 10.0
cell = torch.eye(3, dtype=torch.float32).unsqueeze(0)
pbc = torch.tensor([True, True, False], dtype=torch.bool)
cutoff = 6.0
# use padded matrix representation for neighbors, optimal for
# compiled applications that need constant shapes
neighbor_matrix, num_neighbors, shift_matrix = neighbor_list(
    positions,
    cutoff,
    cell=cell,
    pbc=pbc,
    method="cell_list"
)
# ...or pass `return_neighbor_list=True` for the familiar COO
# `edge_index` format. When `method` is omitted, the neighbor
# algorithm is chosen automatically based on system size
edge_index, neighbor_ptr, shifts = neighbor_list(
    positions,
    cutoff,
    cell=cell,
    pbc=pbc,
    return_neighbor_list=True
)
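The padded-matrix and COO outputs in the example above carry the same neighbor information in two layouts. The sketch below converts one into the other in plain Python, as a way to see how they relate. It is an illustration under assumptions, not the library's implementation: `padded_to_coo`, the `FILL` padding value, the (source, target) edge ordering, and the CSR-style reading of the offsets are all hypothetical choices made here. The hosted documentation is the reference for the actual conventions.

```python
def padded_to_coo(neighbor_matrix, num_neighbors):
    """Convert a padded neighbor matrix into COO edges plus CSR-style offsets.

    neighbor_matrix: one row per atom, each padded to the same length.
    num_neighbors:   number of valid leading entries in each row.
    """
    sources, targets, ptr = [], [], [0]
    for i, (row, count) in enumerate(zip(neighbor_matrix, num_neighbors)):
        sources.extend([i] * count)
        targets.extend(row[:count])  # drop the padding entries
        ptr.append(ptr[-1] + count)
    return [sources, targets], ptr


FILL = -1  # hypothetical padding value, chosen only for this sketch
matrix = [
    [1, 2, FILL],
    [0, FILL, FILL],
    [0, FILL, FILL],
]
counts = [2, 1, 1]

coo_edges, offsets = padded_to_coo(matrix, counts)
print(coo_edges)  # [[0, 0, 1, 2], [1, 2, 0, 0]]
print(offsets)    # [0, 2, 3, 4]
```

The padded form keeps a constant shape per atom, which suits compiled pipelines. The COO form stores only real edges, which suits graph-style message passing.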
DFT-D3(BJ) corrections on a batch of molecules
This example assumes you already have concatenated a set of molecules
into combined tensors, and have computed some form of neighborhood
using the neighbor_list API. Here, we'll demonstrate using the
matrix representation:
import torch
from nvalchemiops.interactions.dispersion import dftd3
from nvalchemiops.neighborlist import neighbor_list
# the following parameters need to be constructed ahead of time
positions = ... # [num_atoms, 3]
atomic_numbers = ... # [num_atoms]
cell = ... # [num_systems, 3, 3]
pbc = ... # [num_systems, 3]
batch_idx = ... # [num_atoms]
batch_ptr = ... # [num_systems + 1]
# construct neighbor matrix
neighbor_matrix, num_neighbors, shift_matrix = neighbor_list(
    positions,
    cutoff=..., # on the order of ~20 Angstroms
    cell=cell,
    pbc=pbc,
    batch_idx=batch_idx,
    batch_ptr=batch_ptr
)
# The DFT-D3 parameters, which comprise the reference C6 parameters, must be provided.
# Refer to the user documentation for the expected structure and data source.
d3_params = ...
# pass everything to the functional interface
d3_energies, d3_forces, coord_nums, d3_virials = dftd3(
    positions=positions,
    numbers=atomic_numbers,
    neighbor_matrix=neighbor_matrix,
    neighbor_matrix_shifts=shift_matrix,
    batch_idx=batch_idx,
    # functional specific DFT-D3 parameters
    a1=..., a2=..., s8=...,
    d3_params=d3_params,
    compute_virial=True
)
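For readers unfamiliar with the `a1`, `a2`, and `s8` arguments, these presumably correspond to the standard Becke-Johnson damping form, in which each atom pair contributes `-s_n * C_n / (r^n + (a1 * R0 + a2)^n)` for `n = 6, 8`, with `R0 = sqrt(C8 / C6)`. The sketch below evaluates that two-body term for a single pair. It is a simplified illustration, not the library's kernel: `bj_pair_energy` is a hypothetical helper, the numbers are placeholders rather than parameters of any real functional, and `c6`/`c8` are passed in directly, whereas the full D3 method derives them from coordination-number-dependent reference values.

```python
import math


def bj_pair_energy(r, c6, c8, a1, a2, s8, s6=1.0):
    """Two-body D3(BJ) dispersion energy for one atom pair.

    r:      interatomic distance
    c6, c8: pair dispersion coefficients (taken as given in this sketch)
    a1, a2: Becke-Johnson damping parameters
    s6, s8: scaling factors for the r^-6 and r^-8 terms
    All quantities are assumed to share one consistent unit system.
    """
    r0 = math.sqrt(c8 / c6)  # pair-specific cutoff radius
    damp = a1 * r0 + a2      # BJ damping length
    e6 = s6 * c6 / (r**6 + damp**6)
    e8 = s8 * c8 / (r**8 + damp**8)
    return -(e6 + e8)


# Placeholder values for illustration only.
params = dict(c6=10.0, c8=200.0, a1=0.4, a2=4.0, s8=0.8)
for distance in (0.0, 2.0, 5.0, 10.0):
    print(distance, bj_pair_energy(distance, **params))
```

The damping length in the denominator keeps the energy finite as `r` approaches zero, while at large separations the expression approaches the undamped `-s6 * C6 / r^6` limit.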
Roadmap
The following features are currently being implemented or are planned for the near future:
- A variety of widely used pair potentials, including:
  - Ziegler-Biersack-Littmark (ZBL)
  - Coulomb
  - Lennard-Jones
- Ewald and particle mesh Ewald (PME)
- Quantum Drude Oscillator
- DFT-D4 dispersion corrections
- Primitives typically used with machine-learned interatomic potentials, such as:
  - Polynomial basis functions
  - (Segmented) graph operations
- Mixed precision, e.g. higher precision accumulation
- JAX wrappers
Contributions & Disclaimers
NVIDIA ALCHEMI Toolkit-Ops is currently in public beta, during which we are soliciting feedback from the community. During this time, direct code contributions are not accepted, as our first priority is to define and provide a stable API; until then, the API is subject to change. Feature requests, discussions, and general feedback are welcome and encouraged via GitHub Issues.
File details
Details for the file nvalchemi_toolkit_ops-0.1.0-py3-none-any.whl.
File metadata
- Download URL: nvalchemi_toolkit_ops-0.1.0-py3-none-any.whl
- Upload date:
- Size: 84.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7acb791d7e9c37a3ceca0955c5d7ef4bda532b772f835ee21353ef8d8a54d9e6 |
| MD5 | 8ea155630efe344346b2094b9b50f0d9 |
| BLAKE2b-256 | fd807d01ce5f1476b9090325e0446246f740d06969451fe9239b0a3eb3d54788 |