Reproducible, tailorable archives for computational chemistry
reptar
Reproducible, tailorable archive
Motivation
Computational chemistry is falling behind in providing the open raw and processed data used to draw scientific conclusions. Often it is a lack of time, expertise, and options that pushes researchers to overlook the importance of reproducible findings. Projects such as QCArchive, Materials Project, Pitt Quantum Repository, ioChem-BD, and many others provide a rigid data framework for a specific purpose (e.g., quantum chemistry and material properties). In other words, data that does not fit directly into their paradigm is incompatible, and for good reason.
Alternatively, you could use file formats such as JSON, XML, YAML, npz, etc. for a specific project. These are great options for customizable data storage with their own advantages and disadvantages. However, you often must choose between (1) a standardized parser that might not support your workflow or (2) writing your own.
Reptar provides customizable parsers and data storage frameworks for whatever an individual project demands. Data is stored in one of the supported file types, and generalized routines are used to access and store it. All data is stored as key-value pairs, where users can use predetermined definitions or include their own. Whether you are running nudged elastic band calculations in VASP, free energy perturbation simulations in GROMACS, or gradient calculations in Psi4, you can store data easily with reptar by selecting a parser and specifying the desired file type. The result is a user-specified data file streamlined for analysis in Python and optimized for archival on places such as GitHub and Zenodo.
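The key-value idea can be illustrated with a minimal sketch that uses only the Python standard library and JSON. This is not reptar's API: the names PREDEFINED_KEYS, store, save, and load, the key names, and the stored values are all hypothetical placeholders chosen for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical predetermined definitions (NOT reptar's actual definitions).
PREDEFINED_KEYS = {
    "atomic_numbers": "Atomic number of each atom",
    "energy_scf": "SCF energy in Hartree",
}


def store(archive, key, value, description=None):
    """Add a key-value pair; keys without a predefined meaning need a description."""
    if key not in PREDEFINED_KEYS and description is None:
        raise KeyError(f"'{key}' is not predefined; pass a description")
    archive["data"][key] = value
    archive["definitions"][key] = description or PREDEFINED_KEYS[key]


def save(archive, path):
    """Write the whole archive to a JSON file."""
    Path(path).write_text(json.dumps(archive, indent=2))


def load(path):
    """Read an archive back from a JSON file."""
    return json.loads(Path(path).read_text())


archive = {"data": {}, "definitions": {}}
store(archive, "atomic_numbers", [8, 1, 1])
store(archive, "energy_scf", -76.0)  # placeholder value, not a real result
store(archive, "solvent_model", "none", description="User-defined key")

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "water.json"
    save(archive, path)
    restored = load(path)
```

Keeping the definitions next to the data means a custom key such as solvent_model remains self-describing when the file is archived.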
About
Reptar is essentially a collection of tools for managing computational chemistry data using a variety of formats such as JSON, npz, and exdir.
Exdir is a simple yet powerful open file format that mimics the HDF5 format, with metadata and data stored in directories of YAML and npy files instead of a single binary file. This provides several advantages, such as mixing human-readable YAML with binary NumPy files, being easier to version control, and loading only the requested portions of datasets into memory. For more detailed information, please read the Front. Neuroinform. article about exdir.
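The directory-based layout and partial loading described above can be sketched with NumPy alone. This is a simplified illustration, not the exdir library or a complete exdir file: the helpers write_dataset and read_rows, the file names, and the data are assumptions made for this example.

```python
import tempfile
from pathlib import Path

import numpy as np


def write_dataset(root, name, array, attrs):
    """Store one dataset as a directory: text metadata plus a binary .npy file."""
    dset_dir = Path(root) / name
    dset_dir.mkdir(parents=True)
    # Flat "key: value" lines are valid YAML; a real tool would use a YAML library.
    lines = [f"{key}: {value}" for key, value in attrs.items()]
    (dset_dir / "attributes.yaml").write_text("\n".join(lines) + "\n")
    np.save(dset_dir / "data.npy", array)
    return dset_dir


def read_rows(dset_dir, start, stop):
    """Memory-map the array and copy only the requested rows into memory."""
    mapped = np.load(Path(dset_dir) / "data.npy", mmap_mode="r")
    return np.array(mapped[start:stop])


with tempfile.TemporaryDirectory() as tmp:
    coords = np.arange(30, dtype=float).reshape(10, 3)  # placeholder data
    dset = write_dataset(
        Path(tmp) / "demo.exdir", "geometry", coords, {"units": "Angstrom"}
    )
    files = sorted(p.name for p in dset.iterdir())
    attrs_text = (dset / "attributes.yaml").read_text()
    subset = read_rows(dset, 2, 4)  # only rows 2 and 3 are read into memory
```

Because the metadata is plain text and each array is its own file, a version control system can diff the metadata and track each array separately.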
License
Distributed under the MIT License. See LICENSE for more information.
Source Distribution
File details
Details for the file reptar-0.0.1.tar.gz.
File metadata
- Download URL: reptar-0.0.1.tar.gz
- Upload date:
- Size: 44.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.9.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 5fbd838f3b6ea8bdecbe3054d500736d6d997253240e40e4bd75836e3e7670ee
MD5 | 9f5388f5b84890885dcea62d01f68025
BLAKE2b-256 | 3701ecb281928b5be5e8ee45c49d150a384c4935c998a15e349a47d5e7d5e431