CLARI: Fast Organic Crystal Structure Prediction with Unit Cell Flow Matching
CLARI takes a molecule and predicts how it packs into a crystal. A single run produces many candidate structures.
Links
- Source code
- Paper: Fast Organic Crystal Structure Prediction with Unit Cell Flow Matching
- Checkpoints and data
Checkpoints are available on Hugging Face as clari-large.ckpt and clari-med.ckpt.
Inputs are expected to use explicit-hydrogen SMILES. For example, prefer
C([H])([H])([H])C([H])([H])[H] over CC.
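As a rough pre-flight check before sampling, the sketch below flags SMILES strings that contain no bracketed hydrogen atom at all. The helper name `has_explicit_hydrogens` is hypothetical and not part of CLARI. It is a text heuristic only: it does not parse SMILES, so it cannot confirm that every hydrogen is explicit, and it will also flag molecules that genuinely contain no hydrogens. Converting an implicit-hydrogen SMILES such as CC needs a cheminformatics toolkit.

```python
import re

# Matches a bracketed hydrogen atom token such as [H], [2H], or [H+].
# Bracket atoms like [NH4+], [He], or [Hg] do not match.
_EXPLICIT_H = re.compile(r"\[\d*H[+-]?\d*\]")


def has_explicit_hydrogens(smiles: str) -> bool:
    """Return True if the string contains at least one bracketed hydrogen atom."""
    return _EXPLICIT_H.search(smiles) is not None


for smiles in ("C([H])([H])([H])C([H])([H])[H]", "CC"):
    status = "ok" if has_explicit_hydrogens(smiles) else "no explicit hydrogens found"
    print(f"{smiles}: {status}")
```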
Basic sampling
CLI
uv run sample \
  --checkpoint_path clari.ckpt \
  --output_dir out/ \
  --smiles 'C([H])([H])([H])C([H])([H])[H]' \
  --ids ethane \
  --n_samples 8
Python
from clari.inference import ClariSampler, SampleRequest
sampler = ClariSampler.from_checkpoint("clari.ckpt")
samples = sampler.sample(
    SampleRequest(
        id="ethane",
        smiles="C([H])([H])([H])C([H])([H])[H]",
        n_samples=8,
    ),
    output_dir="out/",
)
Load from the Hub instead of a local file (downloads once, then cached):
CLI
uv run sample \
  --from_hub Clari-M \
  --output_dir out/ \
  --smiles 'C([H])([H])([H])C([H])([H])[H]' \
  --ids ethane \
  --n_samples 8
Python
from clari.inference import ClariSampler
sampler = ClariSampler.from_hub("Clari-M")  # or "Clari-L"
Multiple molecules
CLI
uv run sample \
  --checkpoint_path clari.ckpt \
  --output_dir out/ \
  --smiles 'C([H])([H])([H])C([H])([H])O[H]' --ids ethanol \
  --smiles 'C1([H])=C([H])C([H])=C([H])C([H])=C1[H]' --ids benzene \
  --copies 4 \
  --n_samples 50
Python
from clari.inference import ClariSampler, SampleRequest
sampler = ClariSampler.from_checkpoint("clari.ckpt")
samples = sampler.sample(
    [
        SampleRequest(
            id="ethanol",
            smiles="C([H])([H])([H])C([H])([H])O[H]",
            copies=4,
            n_samples=50,
        ),
        SampleRequest(
            id="benzene",
            smiles="C1([H])=C([H])C([H])=C([H])C([H])=C1[H]",
            copies=4,
            n_samples=50,
        ),
    ],
    output_dir="out/",
)
For co-crystals, pass (SMILES, copy_count) pairs. Pair-level copy counts are passed
directly to Crystal.from_smiles.
samples = sampler.sample(
    SampleRequest(
        id="ethanol-water",
        smiles=[
            ("C([H])([H])([H])C([H])([H])O[H]", 1),
            ("O([H])[H]", 1),
        ],
        n_samples=50,
    ),
    output_dir="out/",
)
For many molecules, use a config file instead of repeating flags:
uv run sample --config jobs.json
{
  "checkpoint_path": "clari.ckpt",
  "output_dir": "out/",
  "smiles": [
    "C([H])([H])([H])C([H])([H])O[H]",
    "C1([H])=C([H])C([H])=C([H])C([H])=C1[H]"
  ],
  "ids": ["ethanol", "benzene"],
  "copies": [4, 4],
  "n_samples": [50, 50]
}
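When the molecule list already lives in a script or spreadsheet, the config can be generated rather than written by hand. The following is a minimal sketch using only the standard library. The per-molecule record layout and the `build_config` and `write_config` helpers are assumptions made for illustration; only the output keys are taken from the example above.

```python
import json
from pathlib import Path

# Hypothetical per-molecule records; the field names are chosen for this sketch.
jobs = [
    {
        "id": "ethanol",
        "smiles": "C([H])([H])([H])C([H])([H])O[H]",
        "copies": 4,
        "n_samples": 50,
    },
    {
        "id": "benzene",
        "smiles": "C1([H])=C([H])C([H])=C([H])C([H])=C1[H]",
        "copies": 4,
        "n_samples": 50,
    },
]


def build_config(jobs, checkpoint_path="clari.ckpt", output_dir="out/"):
    """Transpose per-molecule records into the parallel-list layout shown above."""
    return {
        "checkpoint_path": checkpoint_path,
        "output_dir": output_dir,
        "smiles": [job["smiles"] for job in jobs],
        "ids": [job["id"] for job in jobs],
        "copies": [job["copies"] for job in jobs],
        "n_samples": [job["n_samples"] for job in jobs],
    }


def write_config(jobs, path="jobs.json", **kwargs):
    """Write the config as indented JSON and return the dict that was written."""
    config = build_config(jobs, **kwargs)
    Path(path).write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `write_config(jobs)` writes `jobs.json` to the current directory, which can then be passed to `uv run sample --config jobs.json`.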
Co-crystal configs use the same pair shape:
{
  "checkpoint_path": "clari.ckpt",
  "output_dir": "out/",
  "ids": "ethanol-water",
  "smiles": [
    ["C([H])([H])([H])C([H])([H])O[H]", 1],
    ["O([H])[H]", 1]
  ],
  "n_samples": 50
}
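The two config layouts differ only in the shape of the "smiles" entries, so it can be worth checking a file before launching a long run. The sketch below is one way to do that in your own tooling. Both helper names are hypothetical, and how CLARI itself validates configs is not described here.

```python
def is_cocrystal_config(config: dict) -> bool:
    """True when every "smiles" entry is a [SMILES, copy_count] pair."""
    entries = config["smiles"]
    return bool(entries) and all(
        isinstance(entry, (list, tuple))
        and len(entry) == 2
        and isinstance(entry[0], str)
        and isinstance(entry[1], int)
        for entry in entries
    )


def check_parallel_lists(config: dict) -> None:
    """For the multi-molecule layout, require per-molecule lists of equal length."""
    n = len(config["smiles"])
    for key in ("ids", "copies", "n_samples"):
        value = config.get(key)
        if isinstance(value, list) and len(value) != n:
            raise ValueError(f"{key!r} has {len(value)} entries, expected {n}")


cocrystal = {
    "ids": "ethanol-water",
    "smiles": [["C([H])([H])([H])C([H])([H])O[H]", 1], ["O([H])[H]", 1]],
    "n_samples": 50,
}
multi = {
    "smiles": [
        "C([H])([H])([H])C([H])([H])O[H]",
        "C1([H])=C([H])C([H])=C([H])C([H])=C1[H]",
    ],
    "ids": ["ethanol", "benzene"],
    "copies": [4, 4],
    "n_samples": [50, 50],
}

print(is_cocrystal_config(cocrystal))  # True
print(is_cocrystal_config(multi))      # False
check_parallel_lists(multi)            # passes silently
```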
Sample → rank → export top-K
CLI
uv run sample \
  --checkpoint_path clari.ckpt \
  --output_dir out/ \
  --smiles 'C([H])([H])([H])C([H])([H])[H]' \
  --ids ethane \
  --copies 4 \
  --n_samples 64
uv run rank out/
uv run export-cifs out/ --top_k 10
Python
from clari.inference import ClariSampler
sampler = ClariSampler.from_checkpoint("clari.ckpt")
sampler.sample(
    "C([H])([H])([H])C([H])([H])[H]",
    copies=4,
    n_samples=64,
    output_dir="out/",
)
# rank and export are CLI steps
Export specific samples by index
uv run export-cifs out/ --sample_idx 0 --sample_idx 7
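Because ranking and export are CLI steps, a Python workflow can drive them with `subprocess`. The sketch below builds the same `rank` and `export-cifs` invocations shown above as argument lists. The helper names and the `dry_run` switch are hypothetical, and only flags that appear in this README are used. Whether `--top_k` and `--sample_idx` can be combined is not stated here, so pass one or the other.

```python
import shlex
import subprocess


def rank_command(output_dir):
    """Argument list for the ranking step."""
    return ["uv", "run", "rank", output_dir]


def export_command(output_dir, top_k=None, sample_indices=()):
    """Argument list for CIF export, by top-K or by explicit sample indices."""
    command = ["uv", "run", "export-cifs", output_dir]
    if top_k is not None:
        command += ["--top_k", str(top_k)]
    for index in sample_indices:
        # The flag is repeated once per index, as in the CLI example above.
        command += ["--sample_idx", str(index)]
    return command


def run_steps(commands, dry_run=False):
    """Print each command, then run it unless dry_run is set."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit,
            # so a failed ranking stops the pipeline before export.
            subprocess.run(command, check=True)


# Dry run: prints the commands without executing them.
run_steps(
    [rank_command("out/"), export_command("out/", top_k=10)],
    dry_run=True,
)
```

Dropping `dry_run=True` executes the steps in order after sampling has written its results to `out/`.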
Multi-GPU
CLI
uv run sample \
  --checkpoint_path clari.ckpt \
  --output_dir out/ \
  --smiles 'C([H])([H])([H])C([H])([H])[H]' \
  --ids ethane \
  --copies 4 \
  --n_samples 1000 \
  --num_gpus 4
Python
from clari.inference import ClariSampler
sampler = ClariSampler.from_checkpoint("clari.ckpt", num_gpus=4)
sampler.sample(
    "C([H])([H])([H])C([H])([H])[H]",
    copies=4,
    n_samples=1000,
    output_dir="out/",
)
Fixed batch size
CLI
uv run sample \
  --checkpoint_path clari.ckpt \
  --output_dir out/ \
  --smiles 'C([H])([H])([H])C([H])([H])[H]' \
  --ids ethane \
  --copies 4 \
  --n_samples 32 \
  --batch_size 8 \
  --compile false
Python
from clari.inference import ClariSampler
sampler = ClariSampler.from_checkpoint("clari.ckpt", compile=False)
sampler.sample(
    "C([H])([H])([H])C([H])([H])[H]",
    copies=4,
    n_samples=32,
    batch_size=8,
    output_dir="out/",
)
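To estimate how many forward passes a fixed batch size implies, the arithmetic can be sketched as below. This assumes the sampler processes a request in consecutive batches of at most `batch_size` samples, which is not confirmed by this README. The `batch_sizes` helper is hypothetical.

```python
def batch_sizes(n_samples: int, batch_size: int) -> list[int]:
    """Sizes of consecutive batches; the last one may be smaller."""
    if n_samples < 0 or batch_size <= 0:
        raise ValueError("n_samples must be >= 0 and batch_size must be > 0")
    full, remainder = divmod(n_samples, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])


print(batch_sizes(32, 8))  # [8, 8, 8, 8]
print(batch_sizes(50, 8))  # [8, 8, 8, 8, 8, 8, 2]
```

Under that assumption, the example above (32 samples, batch size 8) runs four equal batches, while a sample count that is not a multiple of the batch size ends with one smaller batch.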
CPU smoke test
CLI
uv run sample \
  --checkpoint_path clari.ckpt \
  --output_dir out/ \
  --smiles 'C([H])([H])([H])C([H])([H])[H]' \
  --ids ethane \
  --n_samples 1 \
  --batch_size 1 \
  --device cpu \
  --n_steps 2 \
  --compile false \
  --use_bf16 false
Python
from clari.inference import ClariSampler
sampler = ClariSampler.from_checkpoint(
    "clari.ckpt",
    device="cpu",
    n_steps=2,
    compile=False,
    use_bf16=False,
)
sampler.sample(
    "C([H])([H])([H])C([H])([H])[H]",
    n_samples=1,
    batch_size=1,
    output_dir="out/",
)
For all options: uv run sample --help
Citation
@misc{lo2026fastorganiccrystalstructure,
  title={Fast Organic Crystal Structure Prediction with Unit Cell Flow Matching},
  author={Alston Lo and Luka Mucko and Austin H. Cheng and Andy Cai and Alastair J. A. Price and Wojciech Matusik and Alán Aspuru-Guzik},
  year={2026},
  eprint={2606.03199},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2606.03199},
}