ORCA Descriptors

A Python library for automatic calculation of quantum chemical descriptors for QSAR analysis using ORCA quantum chemistry software.

Installation

Using pip

pip install orca-descriptors

Note: After installation, the orca_descriptors command-line tool will be available in your PATH. If you installed with pip install --user, you may need to add ~/.local/bin to your PATH:

# For bash/zsh (add to ~/.bashrc or ~/.zshrc)
export PATH="$HOME/.local/bin:$PATH"

# For fish shell (add to ~/.config/fish/config.fish)
set -gx PATH $HOME/.local/bin $PATH

After updating PATH, restart your terminal or reload the shell configuration with source ~/.bashrc (or source ~/.zshrc; for fish, source ~/.config/fish/config.fish).
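To confirm that the PATH change took effect, a quick check from Python can help. The sketch below uses only the standard library. The helper name find_tool is illustrative and is not part of orca-descriptors.

```python
import shutil
from typing import Optional


def find_tool(name: str = "orca_descriptors") -> Optional[str]:
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    # Check both the command-line tool and the ORCA executable it relies on.
    for tool in ("orca_descriptors", "orca"):
        location = find_tool(tool)
        print(f"{tool}: {location or 'not found on PATH'}")
```

If either tool is reported as not found, re-check the export line above and open a new terminal.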

Using Poetry (development)

poetry install

Usage

As a Python Library

from orca_descriptors import Orca
from rdkit.Chem import MolFromSmiles, AddHs

# Initialize ORCA calculator
orca = Orca(
    script_path="orca",
    functional="PBE0",
    basis_set="def2-SVP",
    method_type="Opt",
    dispersion_correction="D3BJ",
    solvation_model="COSMO(Water)",
    n_processors=8,
)

# Create molecule from SMILES using RDKit
mol = AddHs(MolFromSmiles("C1=CC=CC=C1"))

# Calculate descriptors
homo = orca.homo_energy(mol)
lumo = orca.lumo_energy(mol)
gap = orca.gap_energy(mol)

# Additional descriptors
homo_minus_1 = orca.mo_energy(mol, index=-2)  # HOMO-1 energy
min_h_charge = orca.get_min_h_charge(mol)  # Minimum H charge
xy_shadow = orca.xy_shadow(mol)  # XY projection area
meric = orca.meric(mol)  # Electrophilicity index
logp = orca.m_log_p(mol)  # logP (partition coefficient)
nrot = orca.num_rotatable_bonds(mol)  # Rotatable bonds
wiener = orca.wiener_index(mol)  # Wiener index
sasa = orca.solvent_accessible_surface_area(mol)  # SASA

As a Command-Line Utility

After installation, orca_descriptors is available as a command-line tool with the subcommands described below.

Run Benchmark

Calibrate time estimation by running a benchmark calculation. The benchmark uses benzene (C1=CC=CC=C1) as a standard test molecule for machine calibration:

orca_descriptors run_benchmark

Estimate Calculation Time

Estimate calculation time for a molecule without running the actual calculation:

orca_descriptors approximate_time --molecule C1=CC=CC=C1

Automatic Parameter Scaling: The time estimate is scaled from the benchmark data when the number of processors, the functional, or the basis set differs from the benchmark run. You do not need to re-run the benchmark after changing these parameters; the estimate is recalculated from the existing benchmark data.

For example:

  • If the benchmark was run with 1 processor, the estimate for 4 processors automatically accounts for parallel efficiency
  • If the benchmark used def2-SVP, the estimate for def2-TZVP is scaled by basis set size (O(N^3.5) scaling)
  • Different functionals are scaled according to their relative computational cost
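The scaling rules above can be illustrated with a short sketch. It is not the library's implementation. It assumes an Amdahl's-law model for parallel efficiency with a guessed parallel fraction of 0.9, uses the O(N^3.5) exponent quoted above with N taken as the number of basis functions, and omits the functional cost factor. All function and parameter names are hypothetical.

```python
def amdahl_speedup(n_procs: int, parallel_fraction: float) -> float:
    """Ideal speedup on n_procs cores under Amdahl's law."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_procs)


def scale_benchmark_time(
    bench_seconds: float,
    n_basis_bench: int,
    n_basis_target: int,
    procs_bench: int = 1,
    procs_target: int = 1,
    parallel_fraction: float = 0.9,  # assumed value, not taken from the library
    exponent: float = 3.5,           # O(N^3.5) scaling quoted in the text
) -> float:
    """Rescale a measured benchmark time to a new basis size and core count."""
    basis_factor = (n_basis_target / n_basis_bench) ** exponent
    parallel_factor = (
        amdahl_speedup(procs_bench, parallel_fraction)
        / amdahl_speedup(procs_target, parallel_fraction)
    )
    return bench_seconds * basis_factor * parallel_factor


if __name__ == "__main__":
    # Illustrative inputs only: a 60 s benchmark on 1 core with 100 basis
    # functions, rescaled to 200 basis functions on 4 cores.
    estimate = scale_benchmark_time(60.0, 100, 200, procs_bench=1, procs_target=4)
    print(f"estimated time: {estimate:.1f} s")
```

Because the basis-set factor grows as a power of 3.5, doubling the number of basis functions costs roughly an order of magnitude more time, which extra processors only partly offset.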

Available Parameters

All parameters from the Orca class are available as command-line arguments:

  • --script_path: Path to ORCA executable (default: 'orca')
  • --working_dir: Working directory for calculations (default: current directory)
  • --output_dir: Directory for output files (default: current directory)
  • --functional: DFT functional (default: PBE0)
  • --basis_set: Basis set (default: def2-SVP)
  • --method_type: Calculation type: Opt, SP, or Freq (default: Opt)
  • --dispersion_correction: Dispersion correction, e.g., D3BJ (default: D3BJ). Use 'None' to disable.
  • --solvation_model: Solvation model, e.g., 'COSMO(Water)' (default: None). Use 'None' to disable.
  • --n_processors: Number of processors (default: 1)
  • --max_scf_cycles: Maximum SCF cycles (default: 100)
  • --scf_convergence: SCF convergence threshold (default: 1e-6)
  • --charge: Molecular charge (default: 0)
  • --multiplicity: Spin multiplicity (default: 1)
  • --cache_dir: Directory for caching results (default: output_dir/.orca_cache)
  • --log_level: Logging level: DEBUG, INFO, WARNING, ERROR (default: INFO)
  • --max_wait: Maximum time to wait for output file creation in seconds (default: 300)
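When the same settings are shared between scripts, it can help to keep the documented defaults in one place and override only what changes. The sketch below copies the defaults from the list above (path-dependent ones are left out). The names CLI_DEFAULTS and make_settings are hypothetical, and passing the result to Orca assumes that the constructor accepts keyword names identical to the flag names, which the list above suggests but this sketch does not verify.

```python
from typing import Any, Dict

# Defaults copied from the parameter list above; None means "disabled".
CLI_DEFAULTS: Dict[str, Any] = {
    "script_path": "orca",
    "functional": "PBE0",
    "basis_set": "def2-SVP",
    "method_type": "Opt",
    "dispersion_correction": "D3BJ",
    "solvation_model": None,
    "n_processors": 1,
    "max_scf_cycles": 100,
    "scf_convergence": 1e-6,
    "charge": 0,
    "multiplicity": 1,
    "log_level": "INFO",
    "max_wait": 300,
}


def make_settings(**overrides: Any) -> Dict[str, Any]:
    """Return a copy of the defaults with the given overrides applied."""
    unknown = set(overrides) - set(CLI_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    return {**CLI_DEFAULTS, **overrides}


# Assumed usage, if the constructor keywords match the flag names:
#   from orca_descriptors import Orca
#   orca = Orca(**make_settings(basis_set="def2-TZVP", n_processors=8))
```

Rejecting unknown keys early catches misspelled parameter names before a long calculation is started.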

Example Commands

# Run benchmark with custom parameters (uses benzene as standard test molecule)
orca_descriptors run_benchmark \
    --functional PBE0 \
    --basis_set def2-SVP \
    --n_processors 4 \
    --working_dir ./calculations

# Estimate time for optimization calculation
orca_descriptors approximate_time \
    --molecule CCO \
    --method_type Opt \
    --n_opt_steps 20 \
    --functional PBE0 \
    --basis_set def2-TZVP \
    --n_processors 8
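The same commands can be launched from a Python script, for example to estimate times for many molecules in a loop. The sketch below only builds the argument list, using the subcommands and flags shown above; the helper build_cli_args is a hypothetical name, and the commented subprocess call assumes orca_descriptors is on PATH.

```python
from typing import Any, List


def build_cli_args(subcommand: str, **options: Any) -> List[str]:
    """Build an argument list such as
    ['orca_descriptors', 'approximate_time', '--molecule', 'CCO'].
    """
    args = ["orca_descriptors", subcommand]
    for flag, value in options.items():
        # Keyword names are used verbatim as flag names (e.g. n_processors).
        args += [f"--{flag}", str(value)]
    return args


# Assumed usage (requires the command-line tool to be installed):
#   import subprocess
#   cmd = build_cli_args("approximate_time", molecule="CCO",
#                        basis_set="def2-TZVP", n_processors=8)
#   subprocess.run(cmd, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with SMILES strings, which often contain characters such as parentheses and equals signs.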

Requirements

  • Python >= 3.10
  • ORCA 6.0.1 installed and available in PATH
  • RDKit >= 2023.0.0

License

See LICENSE.md
