Skip to main content

Automatic Refinement of Conformational Ensemble

Project description

Ensemble Analyzer

Python License

Conformer Ensemble Pruning Software

Ensemble Analysis (EnAn) is a Python framework for automated conformational analysis and ensemble processing in computational chemistry workflows.


🎯 Key Features

Core Capabilities

  • Multi-Protocol Workflows: Sequential optimization/frequency calculations with automatic pruning
  • 🔬 Quantum Chemistry Integration: Support for ORCA and Gaussian
  • 📊 Advanced Clustering: PCA-based conformer clustering with multiple feature extraction methods
  • 🎨 Spectral Analysis: Generate weighted IR, VCD, UV-vis, and ECD spectra
  • 🔄 Checkpoint System: Automatic restart capability with atomic file operations

Analysis Tools

  • Thermochemistry: Grimme's qRRHO implementation for every integration
  • Clustering Method: Distance matrix eigenvalues (rotation/translation invariant).
  • Pruning: Intelligent energy-based filtering with Boltzmann weighting

📦 Installation

Requirements

  • Python ≥ 3.11
  • Core: NumPy, SciPy, Matplotlib, ASE
  • Clustering: scikit-learn
  • Acceleration: Numba
pip install ensemble-analyzer
  • Install ORCA from the ORCA Forum

  • Optional: Install Gaussian, if licensed

  • Export ORCA verion

export ORCAVERSION="x.y.z"

🚀 Quick Start

1. Define your protocol file

Create protocol.json

{
    "0": {"funcional": "r2SCAN-3c", "opt": true, "freq": true,"cluster": 5, "comment": "Initial Optimization cluster into 5 families"},
    "1": {"funcional": "wB97X-D4rev", "basis": "def2-QZVPPD", "comment": "Single Point energy evaluation"}
}

2. Basic Usage

ensemble_analyzer --ensemble conformers.xyz --protocol protocol.json --output calculation.out --cpu 8 --temperature 298.15

3. Restart from Checkpoint

# Automatically resumes from last completed protocol
ensemble_analyzer --restart

Protocol Parameters

Parameter Type Description Example
Calculation Settings
functional str DFT functional or method "B3LYP", "xtb", "HF-3c"
basis str Basis set (auto for composite methods) "def2-SVP", "def2-TZVP"
calculator str QM program "orca" (default), "gaussian"
opt bool Optimize geometry true, false
freq bool Calculate frequencies true, false
mult int Spin multiplicity 1 (singlet), 2 (doublet)
charge int Molecular charge
calculator str Set the calculator orca (default), gaussian
Pruning Thresholds
thrG float Energy similarity threshold [kcal/mol] 3.0, 5.0
thrB float Rotatory constant threshold [cm⁻¹] 30.0, 50.0
thrGMAX float Energy window cutoff [kcal/mol] 10.0
cluster bool/int Enable clustering true (auto), 5 (fixed)
no_prune bool Disable pruning false (default)
Advanced
solvent dict Implicit solvation {"solvent": "water", "model": "SMD"}
constrains list Geometry constraints (only on cartesians) [1,2] (fix cartesians)
monitor_internals list Track bond/angle/dihedral [[0,1], [0,1,2]]
skip_opt_fail bool Skip failed optimizations false (default)
block_on_retention_rate bool Block the calculation has a retention rate lower than the MIN_RETENTION_RATE (20%) false (default)

Standalone CLI applications

Protocol Wizard

If you don't want to create from scratch the protocol file, the enan_protocol_wizard is an automatic and interactive way to create this essential file. It is divided into three different level of configuration: i) basic, where all the essential parameters are asked, ii) intermediate, and iii) advance. Feel free to browse all the possible protocol options and parameters.

Regrapher

If you want to change the convolution of your graphs, you can edit the setting.json file. Here, all the global settings are present. When finished, the command enan_regraph come handy. It re-run the Graph workflow with these new setting and in few time, you'll have your new graphs.

usage: enan_regraph [-h] [-rb READ_BOLTZ] [-no-nm] [-w] [--disable-color] idx [idx ...]

positional arguments:
  idx                   Protocol's number to (re-)generate the graphs

options:
  -h, --help            show this help message and exit
  -rb READ_BOLTZ, --read-boltz READ_BOLTZ
                        Read Boltzmann population from a specific protocol
  -no-nm, --no-nm       Do not save the nm graphs
  -w, --weight          Show Weighting function
  --disable-color       Disable colored output

Graph Editor

All graphs are saved also as a pickle. This file can be reloaded and from there you can modify every single element of the Matplotlib Figure store in it. This requires some programming skills and, especially, time. Here is where enan_graph_editor comes to play. It is once again an interactive terminal interface (based both on rich or InquierPy library) where you can change and personalize every pickle. If you have to modify more files at once, a batch mode is implemented as well, so to by-pass the limitation of the manual selection of the interactive TUI.

usage: enan_graph_editor [-h] [--batch] [--list] [--rename OLD NEW] [--rename-file RENAME_FILE] [--color LABEL COLOR] [--linestyle LABEL STYLE] [--linewidth LABEL WIDTH] [--alpha LABEL ALPHA] [--output OUTPUT] [--format {pickle,png,pdf,svg}] [--preview] [--no-strict]
                         [--verbose]
                         pickle_file

Interactive/batch editor for matplotlib pickles

positional arguments:
  pickle_file           Matplotlib pickle file

options:
  -h, --help            show this help message and exit
  --batch, -b           Batch mode (non-interactive)
  --no-strict           Disable strict validation
  --verbose, -v         Verbose output

batch mode options:
  --list, -l            List labels and exit
  --rename OLD NEW, -r OLD NEW
                        Rename label
  --rename-file RENAME_FILE, -rf RENAME_FILE
                        Mapping file OLD=NEW
  --color LABEL COLOR, -c LABEL COLOR
                        Change colour
  --linestyle LABEL STYLE, -ls LABEL STYLE
                        Change line linestyle (e.g., -, --, :, -. )
  --linewidth LABEL WIDTH, -lw LABEL WIDTH
                        Change line linewidth (float)
  --alpha LABEL ALPHA, -a LABEL ALPHA
                        Change line transparency (0-1)
  --output OUTPUT, -o OUTPUT
                        Output file
  --format {pickle,png,pdf,svg}, -f {pickle,png,pdf,svg}
                        Output format
                        
DEFAULT MODE: Interactive TUI
  python enan_graph_editor plot.pkl

BATCH MODE (examples):
  python enan_graph_editor plot.pkl --batch --list
  python enan_graph_editor plot.pkl --batch --rename "Protocol 1" "Proto A"
  python enan_graph_editor plot.pkl --batch --color "Experimental" red --output new.pkl

🤝 Contributing

Development Workflow

# Fork and clone
git clone https://github.com/andre-cloud/ensemble_analyzer.git
cd ensemble_analyzer

# Create feature branch
git checkout -b feature/awesome-feature

# Install dev dependencies
pip install -e .

# Commit and push
git commit -m "Add awesome feature"
git push origin feature/awesome-feature

📄 License

This software is licensed under the MIT-3.0 License. See the LICENSE file for details.

📞 Support and contact

For any questions or support, please contact by email.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ensemble_analyzer-1a0.tar.gz (104.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ensemble_analyzer-1a0-py3-none-any.whl (107.9 kB view details)

Uploaded Python 3

File details

Details for the file ensemble_analyzer-1a0.tar.gz.

File metadata

  • Download URL: ensemble_analyzer-1a0.tar.gz
  • Upload date:
  • Size: 104.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for ensemble_analyzer-1a0.tar.gz
Algorithm Hash digest
SHA256 8b449f4e53b0001534fd647dd7b3c284f8afceed6a59e4baa6d88a65b573efbf
MD5 2904bb1bbe767efd856642e8f0c9c974
BLAKE2b-256 0d63cf1a0613e3ad29516deb7a63964978bd4a5d471a619658cd22cc4080772d

See more details on using hashes here.

File details

Details for the file ensemble_analyzer-1a0-py3-none-any.whl.

File metadata

  • Download URL: ensemble_analyzer-1a0-py3-none-any.whl
  • Upload date:
  • Size: 107.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for ensemble_analyzer-1a0-py3-none-any.whl
Algorithm Hash digest
SHA256 2b78792e8ca263ce740e4b1325996412283280d512c331bb2a08cbb4c0b2b06e
MD5 b603161caded4ab1d282773ca249495b
BLAKE2b-256 1b2a050bf7ab371152904d1aba432aa4de44ca550c15518c50486d8a284fcf4b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page