# FastLogfileParser

Parse logfiles from computational chemistry software, but fast-ly.

Install with `pip install fastlogfileparser` or via conda (forthcoming!).
- ~10x faster than `cclib`
- zero dependencies, supports all modern Python versions
- supports linked jobs, returning a separate result for each job
- retrieves values at every step, not just at convergence
## Usage

The best way to see how `fastlogfileparser` works is to check out the tests!
They show the syntax for importing, calling, and then accessing the values.
A brief summary of the overall workflow and usage is provided below.
## Design

Each supported package exposes a single function `fast_{software}_logfile_parser` inside `fastlogfileparser.{software}` (where `{software}` is the name of the corresponding package, e.g. `gaussian` or `orca`), which reads log files and returns the result as a namedtuple. The namedtuple prevents accidentally changing the values and allows accessing them with `.` syntax.
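The effect of the namedtuple return type can be sketched with plain `collections.namedtuple` (the field names below are illustrative stand-ins, not the parser's actual output):

```python
from collections import namedtuple

# Illustrative stand-in for a parsed job result; field names are made up.
JobResult = namedtuple("JobResult", ["scf", "gibbs"])
job = JobResult(scf=[-76.01, -76.02], gibbs=-75.99)

print(job.gibbs)    # dot-syntax access: -75.99
print(job._fields)  # ('scf', 'gibbs')

# Reassigning a field raises AttributeError, which is what protects
# parsed results from accidental modification.
try:
    job.gibbs = 0.0
except AttributeError:
    print("fields are read-only")
```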
## Usage Example

```python
from fastlogfileparser.gaussian import fast_gaussian_logfile_parser as fglp

# read all jobs from the logfile
job_1, job_2, ..., job_n = fglp("data/result.log")

# access results
print(job_1.frequency_modes)

# show all available values retrieved from the file
print(job_1._fields)

# the field names can also be accessed via
from fastlogfileparser.gaussian import ALL_FIELDS
```
`fastlogfileparser` is fastest when you ask it to retrieve only the fields you want, i.e.:

```python
job_1, job_2, job_3 = fglp(FNAME, get=("gibbs", "scf"))
```
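The speedup from `get=` follows from the design: patterns for fields you did not request are never run against the file. A toy sketch of that dispatch, using hypothetical patterns that are not the library's own:

```python
import re

# Hypothetical field -> pattern table (not the library's real patterns)
PATTERNS = {
    "scf": re.compile(r"SCF Done:.*?=\s+(-?\d+\.\d+)"),
    "gibbs": re.compile(r"Free Energies=\s+(-?\d+\.\d+)"),
}

def parse(text, get=("scf", "gibbs")):
    # only the requested patterns are executed against the text
    return {field: PATTERNS[field].findall(text) for field in get}

text = " SCF Done:  E(RB3LYP) =  -76.40895   A.U. after 10 cycles"
print(parse(text, get=("scf",)))  # {'scf': ['-76.40895']}
```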
## Retrieved Values

### Gaussian

| Quantity | Key | Type | Frequency |
|---|---|---|---|
| Route Section | `route_section` | string | 1/job |
| Normal Termination | `normal_termination` | boolean | 1/job |
| Error | `error_string` | str | 1/job |
| Maximum Allowed Steps | `max_steps` | int | 1/job |
| CPU Time | `cpu_time` | float | 1/job |
| Wall Time | `wall_time` | float | 1/job |
| Gibbs Free Energy at 298 K | `gibbs` | float | 1/job |
| Gibbs Free Energy at 0 K | `e0_zpe` | float | 1/job |
| Enthalpy at 298 K | `e0_h` | float | 1/job |
| HF $^1$ | `hf` | float | 1/job |
| Per-atom Zero Point Energy | `zpe_per_atom` | float | 1/job |
| Wavefunction Energy $^3$ | `wavefunction_energy` | float | 1/job |
| SCF Energy | `scf` | list[float] | 1/job |
| Vibrational Frequencies | `frequencies` | list[float] | 1/job |
| Frequency Modes | `frequency_modes` | list[list[float]] | 1/job |
| Standardized xyz Coordinates | `std_xyz` | list[list[float]] | 1/step/job |
| Input xyz Coordinates | `xyz` | list[list[float]] | 1/step/job |
| Standardized Forces | `std_forces` | list[list[float]] | 1/step/job |
| Mulliken Charges (Summed into Heavy) | `mulliken_charges_summed` | list[list[float]] | 2/job |
| Charge and Multiplicity | `charge_and_multiplicity` | list[int] | 1/job |
| Number of Atoms $^2$ | `number_of_atoms` | int | 1/job |
| Number of Optimization Steps $^2$ | `number_of_optimization_steps` | int | 1/job |

$^1$ equals E0 only for non-wavefunction methods
$^2$ requires `std_xyz` to be parsed to find these values
$^3$ E0 for wavefunction methods
### Orca

| Quantity | Key | Type | Frequency |
|---|---|---|---|
| Route Section | `route_section` | string | 1/job |
| Total Run Time $^1$ | `run_time` | float | 1/job |
| Charge and Multiplicity | `charge_and_multiplicity` | list[int] | 1/job |
| Final Single Point Energy | `energy` | float | 1/job |
| Input xyz Coordinates | `input_coordinates` | list[list[float]] | 1/job |

$^1$ ignores milliseconds
## How much fast-ly-er?

`FastLogfileParser` uses REGEX and only REGEX to retrieve data from logfiles, spending as much time as possible in Python's excellent C-based REGEX library.
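To illustrate the approach (with a simplified snippet and pattern, not the library's actual internals), extracting the SCF energy from every step of a Gaussian-style log reduces to a single `findall` over the whole file:

```python
import re

# Simplified Gaussian-style output with one "SCF Done" line per step
log_text = """
 SCF Done:  E(RB3LYP) =  -76.4089533   A.U. after   10 cycles
 ... geometry step ...
 SCF Done:  E(RB3LYP) =  -76.4089540   A.U. after    4 cycles
"""

# One compiled pattern, one pass through the text in the C regex engine
SCF_RE = re.compile(r"SCF Done:\s+E\([^)]+\)\s+=\s+(-?\d+\.\d+)")
scf_energies = [float(m) for m in SCF_RE.findall(log_text)]
print(scf_energies)  # [-76.4089533, -76.408954]
```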
See `comparison.py` to run the comparison yourself (install with `pip install .[demos]`), but in short:
- compared to `cclib`, `fastlogfileparser` is ~10x as fast and returns all values for intermediate steps in the simulation (though `cclib` supports retrieving a different set of values)
- compared to `ase`, `fastlogfileparser` is ~2x slower, but returns far more values and in a more readily accessible format
## Development Notes

`FastLogfileParser` is written in a purely functional style.
### Running Tests

Install `FastLogfileParser` with the optional `[dev]` dependencies, i.e. from a local clone run `pip install -e ".[dev]"`.

Rather than keeping the gigantic log files in the git repo directly, they are compressed to make cloning easier.
Before running tests, navigate to `test` and run `python data_loader.py decompress` to prepare the needed logfiles.
To add new test data to the repo, perform the previous step and then run `python data_loader.py compress`.
This may take some time to finish executing (a minute or so).