Skip to main content

Add your description here

Project description

CalcFlow: Quantum Chemistry Calculation I/O Done Right

isChemist Protocol v1.0.0 Ruff ty coverage

CalcFlow provides a robust, Pythonic interface for preparing inputs and parsing outputs for quantum chemistry software like Q-Chem and ORCA. It has zero external dependencies and is built for clarity and reliability. Get your calculations set up and results processed without the usual boilerplate.

Key Features

what you care about:

  • Zero Dependencies: is true now, will be true forever. Integrate into any project without worrying about whether it matches your numpy or rdkit version.
  • Intuitive Pythonic Interface: specify calculation parameters how you think about them and calcflow will handle the translation into what QChem or ORCA expects.
  • Proactive Input Validation: Minimizes runtime errors by rigorously validating all calculation parameters against both general and program-specific constraints.
  • Comprehensive Parsing: stop wasting time on extracting data you need from a txt out file and start doing meaningful work (analysis).

what's nice under the hood:

  • Immutable & Fluent API: ensures data integrity and predictable state transitions via frozen dataclasses and an expressive, chainable API.
  • Fully Type Annotated: code is more robust and easier to understand.
  • Extensible Core Architecture: designed with clear abstract base classes, simplifying the integration of support for additional quantum chemistry programs.
  • Assured Reliability via Comprehensive Testing: so you don't have to worry if the parser is working correctly.

Quick Start

ORCA: Geometry Optimization

Set up an ORCA geometry optimization with frequency calculation for a water molecule:

from calcflow import CalculationInput, Geometry

# 1. Load molecular geometry
water = Geometry.from_xyz_file("tests/testing_data/geometries/1h2o.xyz")

# 2. Configure ORCA calculation - uses fluent API
calc = (
    CalculationInput(
        charge=0, spin_multiplicity=1, task="geometry",
        level_of_theory="wb97x-d3", basis_set="def2-svp", n_cores=16
    )
    .enable_ri_for_orca("RIJCOSX", "def2/j")
    .run_frequency_after_opt()
)

# 3. Export ORCA input file
with open("h2o.inp", "w") as f:
    f.write(calc.export("orca", water))

# 4. Save calculation spec as JSON for reproducibility
with open("h2o_calc_spec.json", "w") as f:
    f.write(calc.to_json())

# 5. Later, load the spec back from JSON
loaded_calc = CalculationInput.from_json(open("h2o_calc_spec.json").read())
assert loaded_calc == calc  # perfect roundtrip!

Q-Chem: TDDFT with SMD Solvation

Set up a Q-Chem TDDFT calculation with implicit solvation:

from calcflow import CalculationInput, Geometry

# 1. Load molecular geometry from XYZ file (recommended)
water = Geometry.from_xyz_file("tests/testing_data/geometries/1h2o.xyz")

# Or define manually:
# from calcflow.common.results import Atom
# atoms = (
#     Atom("O", 0.000000, 0.000000, 0.117300),
#     Atom("H", 0.000000, 0.757200, -0.469200),
#     Atom("H", 0.000000, -0.757200, -0.469200),
# )
# water = Geometry(atoms=atoms, comment="Water molecule")
# 2. Configure Q-Chem calculation
calc = (
    CalculationInput(
        charge=0, spin_multiplicity=1, task="energy",
        level_of_theory="wB97X-D3", basis_set="def2-tzvp", n_cores=16
    )
    .set_tddft(nroots=10, singlets=True, triplets=False)
    .set_solvation(model="smd", solvent="water")
)

# 3. Export Q-Chem input file
with open("h2o_tddft.in", "w") as f:
    f.write(calc.export("qchem", water))

Want to include triplets? Easy modification:

calc2 = calc.set_tddft(nroots=10, singlets=True, triplets=True)

Need an O K-Edge XAS spectrum with element-specific basis?

calc3 = (
    calc.set_tddft(nroots=10, singlets=True, triplets=False)
    .set_basis({"H": "pc-2", "O": "pcX-2"})
    .set_reduced_excitation_space(initial_orbitals=[1])
)

By the way, this pythonic method:

.set_reduced_excitation_space(initial_orbitals=[1])

translates into Q-Chem's cryptic syntax:

$rem
    TRNSS = TRUE
    TRTYPE = 3
    N_SOL = 1
$end

$solute
    1
$end

Honestly, a major reason why this package exists.

Advanced: XAS from Excited State (MOM)

Want XAS spectrum from S1 state? Chain the methods:

calc4 = (
    calc.set_tddft(nroots=10, singlets=True, triplets=False)
    .set_basis({"H": "pc-2", "O": "pcX-2"})
    .set_reduced_excitation_space(initial_orbitals=[1])
    .set_unrestricted()
    .set_mom(transition="HOMO->LUMO")
)

This creates a 2-job input file: initial SCF followed by MOM with HOMO→LUMO transition and TDDFT.

Fun fact: Geometry instances have a .total_nuclear_charge property used to calculate HOMO/LUMO indices. You can also specify transitions numerically (e.g., "5->6", "3->LUMO") or use "->vac" for ionization (e.g., "HOMO->vac").

SLURM Job Submission Scripts

Generate SLURM submission scripts for cluster computing:

from calcflow import CalculationInput, Geometry
from calcflow.slurm import SlurmJob

# 1. Create your calculation
water = Geometry.from_xyz_file("water.xyz")
calc = (
    CalculationInput(
        charge=0, spin_multiplicity=1, task="energy",
        level_of_theory="wB97X-D3", basis_set="def2-tzvp", n_cores=16
    )
    .set_tddft(nroots=10)
    .set_solvation("smd", "water")
)

# 2. Create SLURM job configuration
job = (
    SlurmJob(
        job_name="h2o_tddft",
        time="04:00:00",
        n_cores=16,
        memory_mb=64000,
        partition="gpu",
        account="my_research_group"
    )
    .add_modules(["orca/5.0", "openmpi/4.1.1"])
)

# 3. Generate complete SLURM script
slurm_script = job.export(calc, program="orca", 
                          input_filename="h2o.inp", 
                          output_filename="h2o.out")

# 4. Save script for cluster submission
with open("submit_h2o.sh", "w") as f:
    f.write(slurm_script)

# Now submit: sbatch submit_h2o.sh

The generated SLURM script automatically includes:

  • Proper #SBATCH directives for your HPC scheduler
  • Module loading commands
  • Program-specific parallelism directives (OpenMP/MPI)
  • Correct launch commands for ORCA or Q-Chem

For Q-Chem with MPI parallelism:

calc_mpi = calc.set_options(parallelism="mpi")

job_mpi = (
    SlurmJob(
        job_name="h2o_mpi",
        time="04:00:00",
        n_cores=32,
        memory_mb=128000,
        partition="normal"
    )
    .add_modules(["qchem/6.2", "intel/2021.3"])
)

slurm_script = job_mpi.export(calc_mpi, program="qchem",
                              input_filename="h2o.in",
                              output_filename="h2o.out")

Parsing Output Files

ORCA

from pathlib import Path
from calcflow import parse_orca_output

# Parse ORCA output
result = parse_orca_output(Path("h2o.out").read_text())

# Access results
print(result.final_energy)  # -76.1234567
print(result.scf.n_iterations)  # 12

# Save to JSON (excludes raw_output automatically)
with open("orca_result.json", "w") as f:
    f.write(result.to_json())

# Later, load from JSON (much faster than re-parsing)
from calcflow.common.results import CalculationResult
loaded = CalculationResult.from_json(Path("orca_result.json").read_text())

Q-Chem

from pathlib import Path
from calcflow import parse_qchem_output, parse_qchem_multi_job_output

# Single-job output
result = parse_qchem_output(Path("h2o_sp.out").read_text())

# Multi-job output (e.g., MOM, XAS)
jobs = parse_qchem_multi_job_output(Path("h2o_mom_xas.out").read_text())
job1, job2 = jobs

# Access TDDFT results
print(job2.scf.energy)  # -76.53682225
print(job2.tddft.tda_states[0].excitation_energy_ev)  # 7.42

# Get excitation energies and oscillator strengths
energies = [s.excitation_energy_ev for s in job2.tddft.tda_states]
intensities = [s.oscillator_strength for s in job2.tddft.tda_states]

# Mulliken populations for state 3
mulliken = job2.tddft.transition_dm_analyses[2].mulliken
for pop in mulliken.populations:
    print(f"{pop.symbol}: {pop.transition_charge_e}")

LLM & Programmatic Usage

CalcFlow provides built-in API documentation for easy exploration and LLM-assisted workflows:

# Get comprehensive API documentation for input generation
from calcflow.common.input import CalculationInput
print(CalculationInput.get_api_docs())

# Get comprehensive API documentation for results parsing
from calcflow.common.results import CalculationResult
print(CalculationResult.get_api_docs())

These methods return complete documentation of all available fields, methods, and usage examples - perfect for LLMs or programmatic exploration without needing to read source code.

Quick Testing with uv

Test CalcFlow without installation using one-liners:

# Get input API docs
uv run --with calcflow python -c "from calcflow.common.input import CalculationInput; print(CalculationInput.get_api_docs())"

# Get results API docs
uv run --with calcflow python -c "from calcflow.common.results import CalculationResult; print(CalculationResult.get_api_docs())"

# Quick parse and inspect
uv run --with calcflow python -c "from calcflow.io.qchem import parse_qchem_output; from pathlib import Path; result = parse_qchem_output(Path('calc.out').read_text()); print(f'Energy: {result.final_energy}, Status: {result.termination_status}')"

# Save API docs to file
uv run --with calcflow python -c "from calcflow.common.input import CalculationInput; open('api.txt', 'w').write(CalculationInput.get_api_docs())"

More Examples

For more examples, see scripts/input-orca.py and scripts/parse-orca.py. Test data is in tests/testing_data/.

Which versions of QChem do you support?

It's tested on 6.2 and 5.4. Good news is that I'm intending to make it easy to adjust for version-specific printing. For example,

QChem 5.4. prints final SCF energy as
SCF   energy in the final basis set =      -75.3184602363
while 6.2. prints it as
SCF   energy =   -75.32080770

p.s. nevermind the difference in energy, 5.4. mistakenly prints SCF energy same as Total energy, which includes solvation terms

which is why the calcflow/io/qchem/blocks/scf.py block defines different patterns for different versions:

PatternDefinition(field_name="scf_energy", required=True, description="Final SCF energy value",
versioned_patterns=[
        (re.compile(r"^\s*SCF\s+energy\s*=\s*(-?\d+\.\d+)"), "6.2", lambda m: float(m.group(1))),
        (re.compile(r"^\s*SCF\s+energy in the final basis set\s*=\s*(-?\d+\.\d+)"), "5.4", lambda m: float(m.group(1))),
    ],
),

Contributing

Direct, effective contributions are welcome. Fork, modify, test, and pull request. Adhere to existing quality standards.

Related Works

How is it different from QCEngine?

  • QCEngine requires QCElemental, a dep that is supposed to give you constants and periodic table, but for some reason requires numpy
  • afaict, it also tries to solve a much more difficult and much less relevant problem - automating execution of QC calculations. I don't see how one could anticipate all possible variations in configs of internal clusters. Most importantly, I don't see it as a main problem.

How is it different from qc suite (qcio, qccodec, qcop) by coltonbh?

honestly, qc suite is great and as of now has greater variety of supported QC software and tasks. unfortunately, as is, qcio depends on numpy and qcio/qccodec (which is a direct alternative to calcflow) seem to be too geared for automation with BigChem, which might be a plus for some, but might be too much for people who don't want/need to change their routine and just need easy input prep and output parsing

License

MIT. Pure and simple.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

calcflow-0.4.0.tar.gz (448.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

calcflow-0.4.0-py3-none-any.whl (111.5 kB view details)

Uploaded Python 3

File details

Details for the file calcflow-0.4.0.tar.gz.

File metadata

  • Download URL: calcflow-0.4.0.tar.gz
  • Upload date:
  • Size: 448.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.10.9 {"installer":{"name":"uv","version":"0.10.9","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for calcflow-0.4.0.tar.gz
Algorithm Hash digest
SHA256 db684113c6c1408a225305bcb9bf6db9022c739f78d331e9284dd477285927a3
MD5 66c103ede16b52ef46229500e235d60c
BLAKE2b-256 a85cf8833d20e04a3c04f6c0484e80bd7e27ed67e119f3392e73005465d7a1ad

See more details on using hashes here.

File details

Details for the file calcflow-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: calcflow-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 111.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.10.9 {"installer":{"name":"uv","version":"0.10.9","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for calcflow-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5782dc15b4744375f15d2465b70a705624282fd22b5867f0f50f105b2c937e96
MD5 7adf558020a01c2c95c4d5f97d924499
BLAKE2b-256 7d1688ea042744fd9c94039765d679b219c69cf0b2efa0c584756a352d95c8c1

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page