
Active Learning framework for atomistic simulations with flexible workflows and HPC submission.


ALomancy 🔮

Modular Active Learning Workflows for Modern Computational Chemistry


Installation · Quick Start · Documentation · Examples · Contributing


🎯 Overview

ALomancy is a Python framework for running active learning (AL) workflows that train machine-learned interatomic potentials (MLIPs). The package focuses on customization and reproducibility, helping you build robust training datasets and train MLIPs.

Key Features

  • 🚀 Automated AL Workflows: End-to-end active learning with minimal manual intervention
  • 🔧 HPC Integration: Built-in support for remote job submission on HPC clusters
  • ⚡ Parallelization: Runs jobs concurrently wherever possible, shortening the time to results
  • 🔄 Extensible Design: Abstract base classes for easy customization and extension
  • 📊 Analysis Tools: Built-in utilities for monitoring and analyzing AL progress

Workflow Overview

flowchart TD
    A((Initial Dataset)) --> B[Train MLIP Committee]
    B --> C[Structure Generation]
    C --> D((Uncertainty-based Selection))
    D --> E[High-Accuracy Evaluation]
    E --> F((Update Training Dataset))
    F --> B

    style A fill:#e66027
    style B fill:#e87322
    style C fill:#eb861e
    style D fill:#efa119
    style E fill:#708e4c
    style F fill:#328566
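
The same cycle expressed as code, as a conceptual sketch only: the four steps map onto the abstract methods shown under "Custom Workflow Implementation" below, with the job dictionaries elided as placeholders. This is not the package's actual run() source.

# Conceptual sketch of the AL cycle above (not ALomancy's actual run() source).
# The empty job dicts stand in for the config-driven job settings.
def active_learning_cycle(workflow, train_data, n_loops=5):
    for i in range(n_loops):
        base_name = f"al_loop_{i}"
        workflow.train_mlip(base_name, mlip_committee_job_dict={})  # train committee
        candidates = workflow.generate_structures(base_name, {}, train_data)
        # uncertainty-based selection of candidates happens before labelling
        labeled = workflow.high_accuracy_evaluation(base_name, {}, candidates)
        train_data = train_data + labeled  # fold new reference data into the training set
    return train_data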

🚀 Installation

From PyPI (Recommended)

pip install alomancy

From Source

git clone https://github.com/yourusername/ALomancy.git
cd ALomancy
pip install -e ".[dev]"

Dependencies

  • Python 3.9+
  • ASE - Atomic Simulation Environment
  • WFL - Workflow for atomistic simulations
  • Expyre - Remote job execution

⚡ Quick Start

1. Basic Active Learning Workflow

from alomancy.core import StandardActiveLearningWorkflow

# Initialize workflow
workflow = StandardActiveLearningWorkflow(
    initial_train_file_path="train_set.xyz",
    initial_test_file_path="test_set.xyz",
    config_file_path="config.yaml",
    number_of_al_loops=5,
    verbose=1
)

# Run the active learning workflow
workflow.run()
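
A call to run() executes the cycle shown in the flowchart number_of_al_loops times: train the committee, generate candidate structures, select the most uncertain ones, label them with the high-accuracy method, and fold them back into the training set.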

2. Configuration File

Create a config.yaml file to specify your computational setup:

mlip_committee:
  name: "mace_training"
  size_of_committee: 4
  epochs: 1000
  max_time: "24:00:00"
  hpc:
    hpc_name: "gpu_cluster"
    partitions: ["gpu"]
    pre_cmds: ["module load cuda", "source activate mace"]

structure_generation:
  name: "md_generation"
  number_of_concurrent_jobs: 8
  desired_number_of_structures: 100
  max_time: "12:00:00"
  hpc:
    hpc_name: "gpu_cluster"
    partitions: ["gpu"]
    pre_cmds: ["module load cuda", "source activate mace"]

high_accuracy_evaluation:
  name: "dft_evaluation"
  max_time: "48:00:00"
  hpc:
    hpc_name: "cpu_cluster"
    partitions: ["cpu"]
    pre_cmds: ["module load quantum-espresso"]
    node_info:
      ranks_per_system: 32
      ranks_per_node: 32
      threads_per_rank: 1
      max_mem_per_node: "128GB"
    pwx_path: "/path/to/pw.x"
    pp_path: "/path/to/pseudopotentials"
    pseudo_dict:
      H: "H.pbe-rrkjus_psl.1.0.0.UPF"
      O: "O.pbe-n-kjpaw_psl.1.0.0.UPF"

3. Custom Workflow Implementation

Extend the base class for specialized workflows:

from alomancy.core import BaseActiveLearningWorkflow
from ase import Atoms
import pandas as pd

class CustomActiveLearningWorkflow(BaseActiveLearningWorkflow):

    def train_mlip(self, base_name: str, mlip_committee_job_dict: dict, **kwargs):
        """Custom MLIP training implementation."""
        # Your custom training logic here
        return "path/to/trained/model.pt"

    def evaluate_mlip(self, mlip_committee_job_dict: dict, **kwargs) -> pd.DataFrame:
        """Custom model evaluation."""
        # Your evaluation logic here
        return pd.DataFrame({"rmse": [0.1], "mae": [0.05]})

    def generate_structures(
        self, base_name: str, job_dict: dict, train_data: list[Atoms], **kwargs
    ) -> list[Atoms]:
        """Custom structure generation."""
        generated_structures: list[Atoms] = []
        # Your structure generation logic here
        return generated_structures

    def high_accuracy_evaluation(
        self,
        base_name: str,
        high_accuracy_eval_job_dict: dict,
        structures: list[Atoms],
        **kwargs,
    ) -> list[Atoms]:
        """Custom high-accuracy evaluation."""
        # Your high-accuracy calculation logic here; as a placeholder,
        # return the input structures unchanged
        return structures
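
A subclass is driven exactly like the standard workflow. Assuming it keeps the base-class constructor arguments shown in the Quick Start, running it looks the same:

workflow = CustomActiveLearningWorkflow(
    initial_train_file_path="train_set.xyz",
    initial_test_file_path="test_set.xyz",
    config_file_path="config.yaml",
    number_of_al_loops=5,
)
workflow.run()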

📚 Examples

Check out the examples/ directory for complete workflow examples:

  • Basic Usage: Simple active learning workflow setup
  • Custom HPC Configuration: Advanced cluster configuration
  • Analysis Scripts: Post-processing and visualization tools

🏗️ Project Structure

alomancy/
├── analysis/                  # Analysis and visualization tools
├── configs/                   # Configuration management
├── core/                      # Core active learning framework
├── high_accuracy_evaluation/  # DFT calculation modules
├── mlip/                      # Machine learning potential training
├── structure_generation/      # MD and structure generation
└── utils/                     # Utility functions and helpers

🔧 Key Components

Core Framework

  • BaseActiveLearningWorkflow: Abstract base class for AL workflows
  • StandardActiveLearningWorkflow: Ready-to-use implementation

MLIP Training

  • MACE Integration: Committee training with uncertainty quantification
  • Remote Submission: HPC job management for GPU-accelerated training

Structure Generation

  • Molecular Dynamics: ASE-based MD simulations with MACE potentials
  • Uncertainty Sampling: Structure selection driven by committee disagreement, as sketched below
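
To make the selection idea concrete, here is a minimal, illustrative committee-disagreement filter (not ALomancy's internal implementation): rank candidates by the spread of the committee's predictions and keep the top k.

import numpy as np

# Illustrative uncertainty-based selection (not ALomancy's internal code):
# committee_energies has shape (n_structures, committee_size), one prediction
# per committee member for each candidate structure.
def select_most_uncertain(committee_energies: np.ndarray, k: int = 10) -> np.ndarray:
    disagreement = committee_energies.std(axis=1)   # committee spread per structure
    return np.argsort(disagreement)[-k:]            # indices of the k most uncertain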

High-Accuracy Evaluation

  • Quantum Espresso: Automated DFT calculations for reference data (see the sketch after this list)
  • Job Management: Parallel submission and monitoring of DFT jobs
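
For orientation, a single reference evaluation can be reproduced interactively with ASE's Espresso calculator. This is an illustrative sketch, not the module's internal code; depending on your ASE version you may also need to point ASE at pw.x (e.g. via an EspressoProfile or the ASE_ESPRESSO_COMMAND environment variable) and at your pseudopotential directory:

from ase.build import molecule
from ase.calculators.espresso import Espresso

# Illustrative single-point DFT evaluation; ALomancy submits and monitors
# these jobs on the HPC queue for you. The pseudopotential names mirror
# pseudo_dict in config.yaml.
atoms = molecule("H2O")
atoms.center(vacuum=6.0)  # place the molecule in a periodic box
atoms.pbc = True

atoms.calc = Espresso(
    pseudopotentials={
        "H": "H.pbe-rrkjus_psl.1.0.0.UPF",
        "O": "O.pbe-n-kjpaw_psl.1.0.0.UPF",
    },
    input_data={"control": {"calculation": "scf"}},
    kpts=None,  # Gamma point only for an isolated molecule
)
energy = atoms.get_potential_energy()  # runs pw.x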

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Setup

# Clone the repository
git clone https://github.com/yourusername/ALomancy.git
cd ALomancy

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
ruff check .
ruff format .

Running Tests

# Run all tests
pytest

# Run specific test categories
pytest tests/core_tests/
pytest tests/mlip_train_tests/
pytest tests/high_acc_tests/

# Run with coverage
pytest --cov=alomancy

📝 Citation

If you use ALomancy in your research, please cite:

@software{alomancy2025,
  title={ALomancy: Modular Active Learning Workflows for Modern Computational Chemistry},
  author={Julian Holland},
  year={2025},
  url={https://github.com/yourusername/ALomancy},
  version={0.1.0}
}

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • The Fritz Haber Institute

📞 Support

