Molecule Benchmarks

A comprehensive benchmark suite for evaluating generative models for molecules. This package provides standardized metrics and evaluation protocols for assessing the quality of molecular generation models in drug discovery and cheminformatics.

Features

Comprehensive Metrics: Validity, uniqueness, novelty, diversity, and similarity metrics
Standard Benchmarks: Implements metrics from Moses, GuacaMol, and FCD papers
Easy Integration: Simple interface for integrating with any generative model
Multiple Datasets: Built-in support for QM9, Moses, and GuacaMol datasets
Efficient Computation: Optimized for large-scale evaluation with multiprocessing support

Installation

pip install molecule-benchmarks

Quick Start

1. Implement Your Model

To use the benchmark suite, implement the MoleculeGenerationModel protocol:

from molecule_benchmarks.model import MoleculeGenerationModel

class MyGenerativeModel(MoleculeGenerationModel):
    def __init__(self, model_path):
        # Initialize your model here; load_model is a placeholder for your own loader
        self.model = load_model(model_path)

    def generate_molecule_batch(self) -> list[str | None]:
        """Generate a batch of molecules as SMILES strings.

        Returns:
            List of SMILES strings, with None in place of each invalid molecule.
        """
        # Your generation logic here; convert_to_smiles is a placeholder method
        # you implement to turn a raw model output into a SMILES string or None
        batch = self.model.generate(batch_size=100)
        return [self.convert_to_smiles(mol) for mol in batch]
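
How a harness might consume a model like this can be sketched in plain Python. This is an illustration only, not the package's internal code: `BatchGenerator`, `ToyModel`, and `collect_samples` are hypothetical names, and the sketch assumes batches are requested repeatedly until enough samples have been collected.

```python
from itertools import cycle, islice
from typing import Optional, Protocol


class BatchGenerator(Protocol):
    """Structural stand-in for the protocol described above (hypothetical name)."""

    def generate_molecule_batch(self) -> list[Optional[str]]: ...


class ToyModel:
    """Hypothetical model that cycles through a fixed pool of SMILES strings."""

    def __init__(self, pool: list[Optional[str]], batch_size: int = 4) -> None:
        self._stream = cycle(pool)
        self._batch_size = batch_size

    def generate_molecule_batch(self) -> list[Optional[str]]:
        # Take the next batch_size entries from the endless stream.
        return list(islice(self._stream, self._batch_size))


def collect_samples(model: BatchGenerator, num_samples: int) -> list[Optional[str]]:
    """Call the model batch by batch until num_samples entries are collected."""
    samples: list[Optional[str]] = []
    while len(samples) < num_samples:
        batch = model.generate_molecule_batch()
        if not batch:
            raise RuntimeError("model returned an empty batch")
        samples.extend(batch)
    return samples[:num_samples]  # trim any overshoot from the last batch


samples = collect_samples(ToyModel(["CCO", "CCN", None]), num_samples=10)
print(len(samples), samples[:4])
```

Because the sketch relies only on the method name and return type, any object with a matching `generate_molecule_batch` method works with it.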

2. Run Benchmarks

from molecule_benchmarks import Benchmarker, SmilesDataset

# Load a dataset
dataset = SmilesDataset.load_qm9_dataset(subset_size=10000)

# Initialize benchmarker
benchmarker = Benchmarker(
    dataset=dataset,
    num_samples_to_generate=10000,
    device="cpu"  # or "cuda" for GPU
)

# Initialize your model
model = MyGenerativeModel("path/to/model")

# Run benchmarks
results = benchmarker.benchmark(model)
print(results)

3. Analyze Results

The benchmark returns comprehensive metrics:

# Validity metrics
print(f"Valid molecules: {results['validity']['valid_fraction']:.3f}")
print(f"Valid & unique: {results['validity']['valid_and_unique_fraction']:.3f}")
print(f"Valid & unique & novel: {results['validity']['valid_and_unique_and_novel_fraction']:.3f}")

# Diversity and similarity metrics
print(f"Internal diversity: {results['moses']['IntDiv']:.3f}")
print(f"SNN score: {results['moses']['snn_score']:.3f}")

# Chemical property distribution similarity
print(f"KL divergence score: {results['kl_score']:.3f}")

# Fréchet ChemNet Distance
print(f"FCD score: {results['fcd']['fcd']:.3f}")
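
For logging or comparing runs, it can help to flatten the nested results into a single level. The sketch below assumes the results behave like a nested dictionary of numbers, as the lookups above suggest; `flatten_metrics` is a hypothetical helper and the numbers are made up purely to mirror the key layout.

```python
def flatten_metrics(results: dict, prefix: str = "") -> dict[str, float]:
    """Flatten nested metric dicts into {'group.metric': value} pairs."""
    flat: dict[str, float] = {}
    for key, value in results.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_metrics(value, name))
        else:
            flat[name] = value
    return flat


# Made-up numbers that only mirror the key layout used above.
example_results = {
    "validity": {"valid_fraction": 0.9, "valid_and_unique_fraction": 0.8},
    "moses": {"IntDiv": 0.85, "snn_score": 0.4},
    "kl_score": 0.7,
    "fcd": {"fcd": 1.2},
}

flat = flatten_metrics(example_results)
for name, value in sorted(flat.items()):
    print(f"{name:40s} {value:.3f}")
```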

Complete Example

Here's a complete example using the built-in dummy model:

from molecule_benchmarks import Benchmarker, SmilesDataset
from molecule_benchmarks.model import DummyMoleculeGenerationModel

# Load dataset
print("Loading dataset...")
dataset = SmilesDataset.load_qm9_dataset(subset_size=1000)

# Create benchmarker
benchmarker = Benchmarker(
    dataset=dataset,
    num_samples_to_generate=100,
    device="cpu"
)

# Create a dummy model (replace with your model)
model = DummyMoleculeGenerationModel([
    "CCO",           # Ethanol
    "CC(=O)O",       # Acetic acid
    "c1ccccc1",      # Benzene
    "CC(C)O",        # Isopropanol
    "CCN",           # Ethylamine
    None,            # Invalid molecule
])

# Run benchmarks
print("Running benchmarks...")
results = benchmarker.benchmark(model)

# Print results
print("\n=== Validity Metrics ===")
print(f"Valid molecules: {results['validity']['valid_fraction']:.3f}")
print(f"Unique molecules: {results['validity']['unique_fraction']:.3f}")
print(f"Valid & unique: {results['validity']['valid_and_unique_fraction']:.3f}")
print(f"Valid & unique & novel: {results['validity']['valid_and_unique_and_novel_fraction']:.3f}")

print("\n=== Moses Metrics ===")
print(f"Passing Moses filters: {results['moses']['fraction_passing_moses_filters']:.3f}")
print(f"SNN score: {results['moses']['snn_score']:.3f}")
print(f"Internal diversity (p=1): {results['moses']['IntDiv']:.3f}")
print(f"Internal diversity (p=2): {results['moses']['IntDiv2']:.3f}")

print("\n=== Distribution Metrics ===")
print(f"KL divergence score: {results['kl_score']:.3f}")
print(f"FCD score: {results['fcd']['fcd']:.3f}")
print(f"FCD (valid only): {results['fcd']['fcd_valid']:.3f}")

Supported Datasets

The package includes several built-in datasets:

from molecule_benchmarks import SmilesDataset

# QM9 dataset (small molecules)
dataset = SmilesDataset.load_qm9_dataset(subset_size=10000)

# Moses dataset (larger, drug-like molecules)
dataset = SmilesDataset.load_moses_dataset(fraction=0.1)

# GuacaMol dataset
dataset = SmilesDataset.load_guacamol_dataset(fraction=0.1)

# Custom dataset from files
dataset = SmilesDataset(
    train_smiles="path/to/train.txt",
    validation_smiles="path/to/valid.txt"
)
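
If your molecules live in a Python list, a small helper can produce the two files. This sketch assumes a plain-text layout with one SMILES string per line, which is an assumption about the expected file format and should be checked against the package; `write_split` is a hypothetical helper.

```python
import random
import tempfile
from pathlib import Path


def write_split(smiles: list[str], out_dir: Path,
                valid_fraction: float = 0.2, seed: int = 0) -> tuple[Path, Path]:
    """Shuffle SMILES and write train.txt / valid.txt, one string per line."""
    shuffled = smiles[:]
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_valid = max(1, int(len(shuffled) * valid_fraction))
    valid, train = shuffled[:n_valid], shuffled[n_valid:]
    train_path = out_dir / "train.txt"
    valid_path = out_dir / "valid.txt"
    train_path.write_text("\n".join(train) + "\n")
    valid_path.write_text("\n".join(valid) + "\n")
    return train_path, valid_path


all_smiles = ["CCO", "CCN", "CC(=O)O", "c1ccccc1", "CC(C)O"]
with tempfile.TemporaryDirectory() as tmp:
    train_path, valid_path = write_split(all_smiles, Path(tmp))
    train_lines = train_path.read_text().splitlines()
    valid_lines = valid_path.read_text().splitlines()
print(len(train_lines), len(valid_lines))
```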

Metrics Explained

Validity Metrics

Valid fraction: Fraction of generated molecules that are chemically valid
Unique fraction: Fraction of generated molecules that are unique
Novel fraction: Fraction of generated molecules that do not appear in the training data
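
The nesting of these three quantities can be sketched in plain Python. The sketch makes several simplifying assumptions: a molecule counts as valid if it is not None, uniqueness uses exact string equality (a real implementation would typically canonicalize SMILES first), and every fraction is taken over all generated samples. `validity_fractions` is a hypothetical helper, not the package's API.

```python
from typing import Optional


def validity_fractions(generated: list[Optional[str]],
                       training: set[str]) -> dict[str, float]:
    """Compute nested validity fractions over all generated samples."""
    total = len(generated)
    if total == 0:
        raise ValueError("no generated molecules")
    valid = [s for s in generated if s is not None]  # simplification: None == invalid
    unique = set(valid)                              # simplification: string equality
    novel = unique - training                        # not seen in the training data
    return {
        "valid_fraction": len(valid) / total,
        "valid_and_unique_fraction": len(unique) / total,
        "valid_and_unique_and_novel_fraction": len(novel) / total,
    }


generated = ["CCO", "CCO", "CCN", "c1ccccc1", None]
scores = validity_fractions(generated, training={"CCO"})
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Each fraction can only be equal to or smaller than the previous one, since every step filters the same set further.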

Moses Metrics

Based on the Moses paper:

SNN score: Average Tanimoto similarity between each generated molecule and its nearest neighbor in the training set
Internal diversity: Average pairwise Tanimoto distance within generated set
Scaffold similarity: Similarity of molecular scaffolds to training set
Fragment similarity: Similarity of molecular fragments to training set
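
The SNN and internal diversity definitions can be illustrated on toy data. In this sketch, fingerprints are small sets of integers standing in for real molecular fingerprints (which would normally come from a cheminformatics toolkit), and internal diversity is averaged over distinct pairs only, a simplification of the Moses formulation. All function names are hypothetical.

```python
from itertools import combinations

Fingerprint = frozenset[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared bits divided by bits set in either."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def snn(generated: list[Fingerprint], reference: list[Fingerprint]) -> float:
    """Mean similarity of each generated item to its nearest reference item."""
    return sum(max(tanimoto(g, r) for r in reference) for g in generated) / len(generated)


def internal_diversity(generated: list[Fingerprint], p: int = 1) -> float:
    """One minus the power mean (exponent p) of pairwise similarities."""
    pairs = list(combinations(generated, 2))
    if not pairs:
        raise ValueError("need at least two molecules")
    mean_sim_p = sum(tanimoto(a, b) ** p for a, b in pairs) / len(pairs)
    return 1.0 - mean_sim_p ** (1.0 / p)


# Toy fingerprints, not derived from real molecules.
gen = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({7, 8})]
ref = [frozenset({1, 2, 3, 4}), frozenset({7, 8, 9})]

snn_value = snn(gen, ref)
int_div_1 = internal_diversity(gen, p=1)
int_div_2 = internal_diversity(gen, p=2)
print(f"SNN={snn_value:.3f} IntDiv={int_div_1:.3f} IntDiv2={int_div_2:.3f}")
```

A high SNN means the generated set stays close to the reference set, while a high internal diversity means the generated molecules differ from one another.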

Distribution Metrics

KL divergence score: Measures similarity of molecular property distributions
FCD score: Fréchet ChemNet Distance, measures distribution similarity in learned feature space
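
A KL-based score of this kind can be sketched as follows. The sketch follows the GuacaMol-style aggregation, where the divergence for each property is mapped through exp(-KL) and the results are averaged, so identical distributions score 1. Whether the package uses histograms, these bins, or this exact aggregation is an assumption; the property values are made up and all names are hypothetical.

```python
import math


def histogram(values: list[float], edges: list[float], eps: float = 1e-10) -> list[float]:
    """Normalized histogram over [edges[i], edges[i+1]) bins, smoothed by eps."""
    counts = [0.0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1.0
                break  # values outside all bins are ignored
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]


def kl_divergence(p: list[float], q: list[float]) -> float:
    """Discrete KL divergence D(p || q) for strictly positive distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def kl_score(reference: dict[str, list[float]], generated: dict[str, list[float]],
             edges: dict[str, list[float]]) -> float:
    """Average exp(-KL) over properties; 1.0 means identical histograms."""
    terms = []
    for name, ref_values in reference.items():
        p = histogram(ref_values, edges[name])
        q = histogram(generated[name], edges[name])
        terms.append(math.exp(-kl_divergence(p, q)))
    return sum(terms) / len(terms)


# Made-up molecular weights, for illustration only.
edges = {"mol_weight": [0.0, 50.0, 100.0]}
reference = {"mol_weight": [40.0, 50.0, 60.0, 70.0]}
generated = {"mol_weight": [40.0, 45.0, 60.0, 70.0]}

same = kl_score(reference, reference, edges)
shifted = kl_score(reference, generated, edges)
print(f"identical: {same:.3f}, shifted: {shifted:.3f}")
```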

Advanced Usage

Custom Evaluation

# Custom number of samples and device
benchmarker = Benchmarker(
    dataset=dataset,
    num_samples_to_generate=50000,
    device="cuda"  # Use GPU for faster computation
)

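# Run the full benchmark
results = benchmarker.benchmark(model)

# Run specific metric computations on a list of generated SMILES.
# Note: underscore-prefixed methods are private and may change between releases.
generated_smiles = model.generate_molecule_batch()
validity_scores = benchmarker._compute_validity_scores(generated_smiles)
fcd_scores = benchmarker._compute_fcd_scores(generated_smiles)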

Batch Processing

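from molecule_benchmarks.model import MoleculeGenerationModel

class BatchedModel(MoleculeGenerationModel):
    def generate_molecule_batch(self) -> list[str | None]:
        # Generate larger batches for efficiency; self.model stands for your own
        # sampler, set up in __init__ as in the Quick Start
        return self.model.sample(batch_size=1000)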
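
The effect of batch size on call overhead is simple arithmetic. The sketch below assumes the benchmarker keeps requesting batches until it has the requested number of samples, which is an assumption about the harness; `calls_needed` is a hypothetical helper.

```python
import math


def calls_needed(num_samples: int, batch_size: int) -> int:
    """Number of generate_molecule_batch calls needed to reach num_samples."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_samples / batch_size)


for batch_size in (100, 1000):
    print(f"batch_size={batch_size}: {calls_needed(50_000, batch_size)} calls")
```

Fewer, larger calls usually amortize per-call overhead better, at the cost of more memory per batch.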

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

This benchmark suite implements and builds upon metrics from the Moses, GuacaMol, and Fréchet ChemNet Distance (FCD) papers.

molecule-benchmarks 0.1.0
