Molecule Benchmarks
A comprehensive benchmark suite for evaluating generative models for molecules. This package provides standardized metrics and evaluation protocols for assessing the quality of molecular generation models in drug discovery and cheminformatics.
Features
- Comprehensive Metrics: Validity, uniqueness, novelty, diversity, and similarity metrics
- Standard Benchmarks: Implements metrics from Moses, GuacaMol, and FCD papers
- Easy Integration: Simple interface for integrating with any generative model
- Multiple Datasets: Built-in support for QM9, Moses, and GuacaMol datasets
- Efficient Computation: Optimized for large-scale evaluation with multiprocessing support
Installation
pip install molecule-benchmarks
Quick Start
1. Implement Your Model
To use the benchmark suite, implement the MoleculeGenerationModel protocol:
from molecule_benchmarks.model import MoleculeGenerationModel

class MyGenerativeModel(MoleculeGenerationModel):
    def __init__(self, model_path):
        # Initialize your model here (load_model is a placeholder for your own loader)
        self.model = load_model(model_path)

    def generate_molecule_batch(self) -> list[str | None]:
        """Generate a batch of molecules as SMILES strings.

        Returns:
            List of SMILES strings. Return None for invalid molecules.
        """
        # Your generation logic here; convert_to_smiles is a placeholder
        # for your own conversion method.
        batch = self.model.generate(batch_size=100)
        return [self.convert_to_smiles(mol) for mol in batch]
2. Run Benchmarks
from molecule_benchmarks import Benchmarker, SmilesDataset
# Load a dataset
dataset = SmilesDataset.load_qm9_dataset(subset_size=10000)
# Initialize benchmarker
benchmarker = Benchmarker(
    dataset=dataset,
    num_samples_to_generate=10000,
    device="cpu",  # or "cuda" for GPU
)
# Initialize your model
model = MyGenerativeModel("path/to/model")
# Run benchmarks
results = benchmarker.benchmark(model)
print(results)
3. Analyze Results
The benchmark returns a nested result whose metrics are grouped by family:
# Validity metrics
print(f"Valid molecules: {results['validity']['valid_fraction']:.3f}")
print(f"Valid & unique: {results['validity']['valid_and_unique_fraction']:.3f}")
print(f"Valid & unique & novel: {results['validity']['valid_and_unique_and_novel_fraction']:.3f}")
# Diversity and similarity metrics
print(f"Internal diversity: {results['moses']['IntDiv']:.3f}")
print(f"SNN score: {results['moses']['snn_score']:.3f}")
# Chemical property distribution similarity
print(f"KL divergence score: {results['kl_score']:.3f}")
# Fréchet ChemNet Distance
print(f"FCD score: {results['fcd']['fcd']:.3f}")
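Because the result mixes nested groups (`validity`, `moses`, `fcd`) with top-level scalars (`kl_score`), flattening it can make logging or tabulating easier. The following is a minimal sketch, assuming the result behaves like a plain nested dict of numbers, as the lookups above suggest. `flatten_results` and the sample values are hypothetical and not part of the package.

```python
from collections.abc import Mapping


def flatten_results(results: Mapping, prefix: str = "") -> dict[str, float]:
    """Flatten nested metric groups into 'group.metric' keys."""
    flat: dict[str, float] = {}
    for key, value in results.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, Mapping):
            # Recurse into a metric group such as 'validity' or 'moses'.
            flat.update(flatten_results(value, name))
        else:
            flat[name] = value
    return flat


# Made-up numbers, only to show the shape of the output.
example = {
    "validity": {"valid_fraction": 0.9},
    "moses": {"IntDiv": 0.85, "snn_score": 0.4},
    "kl_score": 0.7,
}
for name, value in sorted(flatten_results(example).items()):
    print(f"{name}: {value:.3f}")
```

The flat keys can then be passed to whatever experiment tracker or CSV writer you already use.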
Complete Example
Here's a complete example using the built-in dummy model:
from molecule_benchmarks import Benchmarker, SmilesDataset
from molecule_benchmarks.model import DummyMoleculeGenerationModel
# Load dataset
print("Loading dataset...")
dataset = SmilesDataset.load_qm9_dataset(subset_size=1000)
# Create benchmarker
benchmarker = Benchmarker(
    dataset=dataset,
    num_samples_to_generate=100,
    device="cpu",
)
# Create a dummy model (replace with your model)
model = DummyMoleculeGenerationModel([
    "CCO",       # Ethanol
    "CC(=O)O",   # Acetic acid
    "c1ccccc1",  # Benzene
    "CC(C)O",    # Isopropanol
    "CCN",       # Ethylamine
    None,        # Invalid molecule
])
# Run benchmarks
print("Running benchmarks...")
results = benchmarker.benchmark(model)
# Print results
print("\n=== Validity Metrics ===")
print(f"Valid molecules: {results['validity']['valid_fraction']:.3f}")
print(f"Unique molecules: {results['validity']['unique_fraction']:.3f}")
print(f"Valid & unique: {results['validity']['valid_and_unique_fraction']:.3f}")
print(f"Valid & unique & novel: {results['validity']['valid_and_unique_and_novel_fraction']:.3f}")
print("\n=== Moses Metrics ===")
print(f"Passing Moses filters: {results['moses']['fraction_passing_moses_filters']:.3f}")
print(f"SNN score: {results['moses']['snn_score']:.3f}")
print(f"Internal diversity (p=1): {results['moses']['IntDiv']:.3f}")
print(f"Internal diversity (p=2): {results['moses']['IntDiv2']:.3f}")
print("\n=== Distribution Metrics ===")
print(f"KL divergence score: {results['kl_score']:.3f}")
print(f"FCD score: {results['fcd']['fcd']:.3f}")
print(f"FCD (valid only): {results['fcd']['fcd_valid']:.3f}")
Supported Datasets
The package includes several built-in datasets:
from molecule_benchmarks import SmilesDataset
# QM9 dataset (small molecules)
dataset = SmilesDataset.load_qm9_dataset(subset_size=10000)
# Moses dataset (larger, drug-like molecules)
dataset = SmilesDataset.load_moses_dataset(fraction=0.1)
# GuacaMol dataset
dataset = SmilesDataset.load_guacamol_dataset(fraction=0.1)
# Custom dataset from files
dataset = SmilesDataset(
    train_smiles="path/to/train.txt",
    validation_smiles="path/to/valid.txt",
)
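If your molecules are not yet split into files, a small helper can produce the two paths used above. This is only a sketch: the expected file format is not stated here, so it assumes plain text with one SMILES string per line. `write_smiles_split` is a hypothetical helper, not part of the package.

```python
import random
import tempfile
from pathlib import Path


def write_smiles_split(smiles, out_dir, validation_fraction=0.1, seed=0):
    """Shuffle SMILES and write train.txt / valid.txt, one SMILES per line (assumed format)."""
    shuffled = list(smiles)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one molecule for validation.
    n_valid = max(1, int(len(shuffled) * validation_fraction))
    valid, train = shuffled[:n_valid], shuffled[n_valid:]

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    train_path = out_dir / "train.txt"
    valid_path = out_dir / "valid.txt"
    train_path.write_text("\n".join(train) + "\n")
    valid_path.write_text("\n".join(valid) + "\n")
    return train_path, valid_path


molecules = ["CCO", "CCN", "CC(=O)O", "c1ccccc1", "CC(C)O"]
with tempfile.TemporaryDirectory() as tmp:
    train_path, valid_path = write_smiles_split(molecules, tmp, validation_fraction=0.2)
    print(train_path.read_text().splitlines())
    print(valid_path.read_text().splitlines())
```

The returned paths could then be passed as `train_smiles` and `validation_smiles`, provided the one-per-line assumption matches what the loader expects.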
Metrics Explained
Validity Metrics
- Valid fraction: Fraction of generated molecules that are chemically valid
- Unique fraction: Fraction of generated molecules that are unique
- Novel fraction: Fraction of generated molecules that are valid, unique, and not seen in the training data
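The three fractions above can be illustrated with a few lines of plain Python. This is a simplified sketch, not the package's implementation: it assumes every fraction is taken relative to the total number of generated samples, treats `None` as the only kind of invalid molecule, and compares SMILES strings as written (a real implementation would likely parse and canonicalize them with a cheminformatics toolkit first). `validity_fractions` is a hypothetical name.

```python
def validity_fractions(generated, train_smiles):
    """Compute valid, valid-and-unique, and valid-unique-novel fractions."""
    total = len(generated)
    if total == 0:
        raise ValueError("no generated molecules")
    # Assumption: None marks an invalid molecule; everything else is valid.
    valid = [s for s in generated if s is not None]
    unique = set(valid)
    novel = unique - set(train_smiles)
    return {
        "valid_fraction": len(valid) / total,
        "valid_and_unique_fraction": len(unique) / total,
        "valid_and_unique_and_novel_fraction": len(novel) / total,
    }


generated = ["CCO", "CCO", "CCN", "c1ccccc1", None]
train = ["CCO", "CC(=O)O"]
scores = validity_fractions(generated, train)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Here four of the five samples are valid, three of those are distinct, and two of the distinct ones do not appear in the training list.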
Moses Metrics
Based on the Moses paper:
- SNN score: Similarity to nearest neighbor in training set
- Internal diversity: Average pairwise Tanimoto distance within generated set
- Scaffold similarity: Similarity of molecular scaffolds to training set
- Fragment similarity: Similarity of molecular fragments to training set
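To make the SNN and internal diversity definitions concrete, here is a toy sketch built on Tanimoto similarity between sets. It is not the package's code: real implementations use molecular fingerprints from a cheminformatics toolkit, whereas `toy_fingerprint` below is a deliberately crude, hypothetical stand-in based on character bigrams. The internal diversity follows the form 1 - (mean of T^p)^(1/p), averaged here over distinct pairs, which may differ in detail from what the package does.

```python
from itertools import combinations


def toy_fingerprint(smiles: str) -> frozenset[str]:
    """Stand-in fingerprint: the set of character bigrams of the SMILES string."""
    bigrams = frozenset(smiles[i:i + 2] for i in range(len(smiles) - 1))
    return bigrams or frozenset({smiles})


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity: |A & B| / |A | B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def internal_diversity(fps, p: int = 1) -> float:
    """1 - (mean pairwise similarity^p)^(1/p) over distinct pairs."""
    pairs = list(combinations(fps, 2))
    if not pairs:
        raise ValueError("need at least two molecules")
    mean_sim_p = sum(tanimoto(a, b) ** p for a, b in pairs) / len(pairs)
    return 1.0 - mean_sim_p ** (1.0 / p)


def snn(generated_fps, reference_fps) -> float:
    """Mean similarity of each generated molecule to its nearest reference molecule."""
    return sum(
        max(tanimoto(g, r) for r in reference_fps) for g in generated_fps
    ) / len(generated_fps)


fps = [toy_fingerprint(s) for s in ["CCO", "CCN", "c1ccccc1"]]
print(f"IntDiv (p=1): {internal_diversity(fps, p=1):.3f}")
print(f"IntDiv (p=2): {internal_diversity(fps, p=2):.3f}")
print(f"SNN: {snn(fps[:1], fps[1:]):.3f}")
```

A higher internal diversity means the generated set is more spread out, while a higher SNN means generated molecules sit closer to the reference set.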
Distribution Metrics
- KL divergence score: Measures similarity of molecular property distributions
- FCD score: Fréchet ChemNet Distance, measures distribution similarity in learned feature space
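The two distribution metrics can be illustrated in a simplified form. The sketch below is not the package's implementation. For the KL score it assumes the GuacaMol-style aggregation (the average of exp(-KL) over several properties), uses plain histograms, and picks the direction KL(reference || generated) as an assumption. For the Fréchet distance it shows only the one-dimensional Gaussian case, whereas FCD works on multivariate ChemNet activations. All function names and property values are hypothetical.

```python
import math


def histogram(values, bins, lo, hi, eps=1e-10):
    """Normalized histogram with a small epsilon so no bin has zero mass."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    probs = [c / len(values) + eps for c in counts]
    norm = sum(probs)
    return [p / norm for p in probs]


def kl_divergence(p, q):
    """Discrete KL divergence KL(p || q)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def kl_score(reference_props, generated_props, bins=10):
    """Average exp(-KL) over properties; 1.0 means identical histograms."""
    scores = []
    for name, ref_values in reference_props.items():
        gen_values = generated_props[name]
        lo = min(min(ref_values), min(gen_values))
        hi = max(max(ref_values), max(gen_values))
        if hi == lo:
            hi = lo + 1.0  # avoid a zero-width range
        p = histogram(ref_values, bins, lo, hi)
        q = histogram(gen_values, bins, lo, hi)
        scores.append(math.exp(-kl_divergence(p, q)))
    return sum(scores) / len(scores)


def frechet_distance_1d(mu1, sigma1, mu2, sigma2):
    """Squared Fréchet distance between two 1-D Gaussians."""
    return (mu1 - mu2) ** 2 + (sigma1 - sigma2) ** 2


# Made-up property values, for illustration only.
reference = {"logp": [0.1, 0.5, 0.9, 1.3], "weight": [46.0, 60.0, 78.0, 60.0]}
generated = {"logp": [0.2, 0.6, 0.8, 1.2], "weight": [45.0, 59.0, 80.0, 61.0]}
print(f"KL score: {kl_score(reference, generated):.3f}")
print(f"1-D Fréchet distance: {frechet_distance_1d(0.0, 1.0, 3.0, 3.0):.1f}")
```

Under this convention a KL score near 1 indicates closely matching property distributions, while for the Fréchet distance lower values indicate closer distributions.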
Advanced Usage
Custom Evaluation
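# Custom number of samples and device
benchmarker = Benchmarker(
    dataset=dataset,
    num_samples_to_generate=50000,
    device="cuda",  # Use GPU for faster computation
)
# Run the full benchmark
results = benchmarker.benchmark(model)
# Or run specific metric computations on SMILES you sampled yourself.
# Note: the underscore-prefixed methods are private and may change between releases.
generated_smiles = model.generate_molecule_batch()
validity_scores = benchmarker._compute_validity_scores(generated_smiles)
fcd_scores = benchmarker._compute_fcd_scores(generated_smiles)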
Batch Processing
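from molecule_benchmarks.model import MoleculeGenerationModel

class BatchedModel(MoleculeGenerationModel):
    def generate_molecule_batch(self) -> list[str | None]:
        # Generate larger batches for efficiency
        # (self.model stands for your own sampler, set up in __init__)
        return self.model.sample(batch_size=1000)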
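Batch size matters because the requested number of samples has to be assembled from repeated calls to `generate_molecule_batch`. How the benchmarker does this internally is not described here; the loop below is one plausible sketch under that assumption, with `collect_samples` and `FakeModel` as hypothetical names used only for illustration.

```python
def collect_samples(generate_batch, num_samples):
    """Call generate_batch repeatedly until num_samples entries are collected."""
    samples = []
    while len(samples) < num_samples:
        batch = generate_batch()
        if not batch:
            # Guard against an endless loop if the model yields nothing.
            raise RuntimeError("model returned an empty batch")
        samples.extend(batch)
    # Drop any surplus from the final batch.
    return samples[:num_samples]


class FakeModel:
    """Stand-in model that returns the same tiny batch and counts calls."""

    def __init__(self):
        self.calls = 0

    def generate_molecule_batch(self):
        self.calls += 1
        return ["CCO", None, "CCN"]


fake = FakeModel()
samples = collect_samples(fake.generate_molecule_batch, 7)
print(f"collected {len(samples)} samples in {fake.calls} calls")
```

With a batch of three, seven samples take three calls; a larger batch reduces the number of calls and any per-call overhead.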
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
This benchmark suite implements and builds upon metrics from the Moses, GuacaMol, and FCD papers.