Knowledge management app to interface with quantum chemical calculations

Project description

Quantum Chemistry Database

Python app to build a knowledge database from a directory structure of ORCA calculations.

The database takes the form of a directory structure of markdown files, representing structure and information on the calculations. This allows establishing relationships between calculations and forms a lab notebook type structure.

Installation

There is nothing to install, provided you use uv. The QCDB tool is on PyPI, so you can run it directly with uv tool run (uvx for short):

uvx qcdb --root /path/to/project/root --vault-dir /path/to/vault

To inspect the vault, download Obsidian and open the vault directory in it.

Architecture

The scraper can be updated in the future to

  1. support more QC codes
  2. extract more metadata
  3. automatically prepare dashboards for e.g. SCF or optimization convergence, IR spectra, ...
  4. run in parallel

The idea is that we can scrape arbitrary directory structures and "load them into the database" - except here we use a managed directory structure (the "vault") with plain Markdown.
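As a rough illustration of the scraping step, the sketch below walks a directory tree and pulls charge, multiplicity and keyword lines out of ORCA input files. It is not QCDB's actual scraper: the function names `extract_metadata` and `scrape` and the `*.inp` file pattern are assumptions made for this example, and only the `* xyz` / `* xyzfile` coordinate headers are handled.

```python
import re
from pathlib import Path

# ORCA coordinate header such as "* xyz 0 1" or "* xyzfile 0 1 geom.xyz".
# Other ways of specifying coordinates are deliberately not handled here.
COORD_HEADER = re.compile(
    r"^\s*\*\s*(?:xyzfile|xyz)\s+(-?\d+)\s+(\d+)",
    re.IGNORECASE | re.MULTILINE,
)


def extract_metadata(input_text: str) -> dict:
    """Extract charge, multiplicity and '!' keyword lines from ORCA input text."""
    meta = {}
    match = COORD_HEADER.search(input_text)
    if match:
        meta["charge"] = int(match.group(1))
        meta["mult"] = int(match.group(2))
    keywords = [
        line.strip()
        for line in input_text.splitlines()
        if line.lstrip().startswith("!")
    ]
    if keywords:
        meta["input"] = " ".join(keywords)
    return meta


def scrape(root: Path, pattern: str = "*.inp") -> dict:
    """Map every matching file below root to its extracted metadata."""
    return {
        path: extract_metadata(path.read_text(errors="replace"))
        for path in sorted(root.rglob(pattern))
    }
```

Calling `scrape(Path("/path/to/project/root"))` would return one metadata dictionary per input file found, which could then be written out as one note per calculation.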

This "vault" is the database.

We don't store the actual calculation's data (too big), but represent it in our knowledge graph as a Markdown (.md) note with the automatically extracted metadata and links to other notes that e.g. use the same geometry.
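One way to find notes that "use the same geometry" is to hash each geometry and group notes by that hash. The sketch below assumes geometries are available as plain strings keyed by note name; `geometry_key` and `related_links` are hypothetical helpers, not part of QCDB.

```python
import hashlib
from collections import defaultdict


def geometry_key(geometry: str) -> str:
    """Hash a geometry string after collapsing runs of whitespace."""
    normalised = "\n".join(
        " ".join(line.split()) for line in geometry.strip().splitlines()
    )
    return hashlib.sha256(normalised.encode()).hexdigest()


def related_links(geometries: dict[str, str]) -> dict[str, list[str]]:
    """Map each note name to wikilinks of the other notes sharing its geometry."""
    groups = defaultdict(list)
    for name, geometry in geometries.items():
        groups[geometry_key(geometry)].append(name)
    links = {}
    for names in groups.values():
        for name in names:
            links[name] = [f"[[{other}]]" for other in names if other != name]
    return links
```

This matches geometries textually, so coordinates that are numerically equal but printed differently (0.74 versus 0.740) would not be grouped; a more robust version would parse and round the coordinates first.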

All the metadata is stored in the YAML frontmatter of each note:

---
charge: 0
mult: 1
date: '2024-06-05'
geometry: '47 ...'
input: '! D4 TPSSh ..'
..
---

# Notes

Experimenting with increased grid size to converge [[link_to_previous, failed calc]]
...
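The note layout above can be written and read back with very little code. The following is a minimal sketch using only the standard library and handling flat `key: value` frontmatter; a real implementation would more likely use a YAML library. `render_note` and `split_frontmatter` are names invented for this example.

```python
import json


def render_note(metadata: dict, body: str) -> str:
    """Render metadata as frontmatter followed by the Markdown body."""
    lines = ["---"]
    for key, value in metadata.items():
        # JSON-encoded scalars are also valid YAML, which gives safe quoting.
        lines.append(f"{key}: {json.dumps(value)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body


def split_frontmatter(note: str) -> tuple[dict, str]:
    """Split a note into (metadata, body); only flat key: value pairs are parsed."""
    if not note.startswith("---\n"):
        return {}, note
    header, separator, body = note[4:].partition("\n---\n")
    if not separator:
        return {}, note
    metadata = {}
    for line in header.splitlines():
        key, colon, raw = line.partition(":")
        if not colon:
            continue
        raw = raw.strip()
        try:
            metadata[key.strip()] = json.loads(raw)
        except json.JSONDecodeError:
            # Fall back to the raw text, dropping YAML-style single quotes.
            metadata[key.strip()] = raw.strip("'")
    return metadata, body.lstrip("\n")
```

Keeping the metadata in the frontmatter and free-form text in the body means the scraper can regenerate the former without touching hand-written notes.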

The beauty is that we retain the information from failed experiments as well, without needing to hold onto the raw data itself. We only keep the knowledge we gained from the experiment.

By (automatically) creating provenance trees (i.e. calculation "timelines", what led to what etc) we can, in the future, pick a specific result and export its provenance tree only. This way we can fully reproduce a given result with minimal needed experiments.
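A provenance tree can be approximated by following links between notes. The sketch below assumes that each note links to the calculations it builds on using Obsidian-style `[[wikilinks]]`, and that notes are available as a dictionary from name to text; `linked_notes` and `provenance` are hypothetical helpers.

```python
import re

# Capture the link target, stopping before an alias ("|") or heading ("#").
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def linked_notes(text: str) -> list[str]:
    """Return the targets of all wikilinks in a note."""
    return [target.strip() for target in WIKILINK.findall(text)]


def provenance(start: str, notes: dict[str, str]) -> list[str]:
    """Return start plus every note reachable from it by following links."""
    seen = []
    stack = [start]
    while stack:
        name = stack.pop()
        if name in seen or name not in notes:
            continue  # skip notes already visited and dangling links
        seen.append(name)
        stack.extend(linked_notes(notes[name]))
    return seen
```

Exporting only the notes returned by `provenance` would then give the minimal set of experiments behind a chosen result, under the linking assumption stated above.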

Because all the notes are plain .md files, the whole vault can be versioned (and shared?) with git.

The links are also important for e.g. selecting a bunch of calculations, and then exporting them as a fully self-contained vault, or building a citation bibliography from them, or compiling geometry tables for the Supporting Information, or archiving them for long-term storage, ...
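Exporting a selection could be as simple as copying the chosen notes into a fresh directory. This sketch assumes each note lives at the top level of the vault as `<name>.md`, which may not match the real vault layout; `export_notes` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def export_notes(vault: Path, names: list[str], destination: Path) -> list[Path]:
    """Copy the named notes from the vault into destination, skipping missing ones."""
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        source = vault / f"{name}.md"
        if source.is_file():
            target = destination / source.name
            shutil.copy2(source, target)
            copied.append(target)
    return copied
```

For the export to be self-contained, the list of names should include every note the selection links to.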

By defining custom code fences, one can create new experiments directly in the vault and then have a runner run them. Imagine a code block like

! D4 TPSSh OPT FREQ

and linking a geometry. Then run e.g. uvx qcdb --run /path/to/vault to detect and run such experiments automatically.
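Detecting such experiments amounts to scanning notes for fenced blocks with a particular tag. The sketch below assumes the tag is `orca`, which is a made-up choice for this example, and only extracts the block contents; actually running ORCA is left out. `find_experiments` is a hypothetical helper.

```python
import re

TICKS = "`" * 3  # fence delimiter, built this way to keep the example readable

# A fenced block whose info string is exactly "orca" (hypothetical tag).
FENCE = re.compile(
    rf"^{TICKS}orca[ \t]*\n(.*?)^{TICKS}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def find_experiments(note_text: str) -> list[str]:
    """Return the contents of every orca-tagged fenced block in a note."""
    return [body.strip() for body in FENCE.findall(note_text)]
```

A runner could combine each extracted block with the geometry from the linked note to assemble a complete input file.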

One also gets to use the whole Obsidian ecosystem to directly work with the knowledge graph:

  1. use regular markdown notes to link e.g. meeting notes or papers with experiments
  2. write short notes on insights gained, and link them - create a network of knowledge, and how it came to be (reproducibly!)
  3. use Obsidian Bases or the Dataview plugin to create tables straight from the notes
  4. use the Excalidraw plugin for visual note taking and experiment design

Download files

Source Distribution

qcdb-0.2.0.tar.gz (54.6 kB)

Uploaded Source

Built Distribution

qcdb-0.2.0-py3-none-any.whl (6.5 kB)

Uploaded Python 3

File details

Details for the file qcdb-0.2.0.tar.gz.

File metadata

  • Download URL: qcdb-0.2.0.tar.gz
  • Upload date:
  • Size: 54.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.9

File hashes

Hashes for qcdb-0.2.0.tar.gz
Algorithm Hash digest
SHA256 13d6dfdd3eddbc2737e3a99b79f3b55a1102c36e3e483d0f608c37d1ff0b49c1
MD5 a9e6df6856af1f147ed3bb52865d7a19
BLAKE2b-256 3f36a84086efc8e478cf5b15fb92eb7e756afacba8861a857d6c5c036b8b1a27

File details

Details for the file qcdb-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: qcdb-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 6.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.9

File hashes

Hashes for qcdb-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 7867b723b0c0b251dbad7709be85c04af27b7e80505fbba341f95b356c7d63d3
MD5 b50a7a1f1826ba3b8b9d02ef52da9fae
BLAKE2b-256 4b3624547afde5bd85e18295854875f8d27b5b9a00f333e60e6ead9273138906
