A solution for research data management

The CADET-Research Data Management toolbox

Getting started

Installation

CADET-RDM can be installed using pip:

pip install cadet-rdm

Initialize Project Repository

Create a new project repository or convert an existing repository into a CADET-RDM repo:

cadet-rdm initialize-repo <path-to-repo>

or from Python:

from cadetrdm import initialize_repo

initialize_repo(path_to_repo)

The optional output_folder_name argument sets the name of the output folder. It defaults to output.

Use CADET-RDM in Python

Tracking Results

from cadetrdm import ProjectRepo

"""
Your imports and function declarations
e.g. generate_data(), write_data_to_file(), analyse_data() and plot_analysis_results()
"""

if __name__ == '__main__':
    # Instantiate CADET-RDM ProjectRepo handler
    repo = ProjectRepo()

    # If you've made changes to the code, commit the changes
    repo.commit("Add code to generate and analyse example data")

    # Everything written to the output_folder within this context manager gets tracked
    # The method repo.output_data() generates full paths to within your output_folder
    with repo.track_results(results_commit_message="Generate and analyse example data"):
        data = generate_data()
        output_filepath = repo.output_data(sub_path="raw_data/data.csv")
        write_data_to_file(data, output_filepath)

        analysis_results = analyse_data(data)
        figure_path = repo.output_data("analysis/regression.png")
        plot_analysis_results(analysis_results, figure_path)
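The helper functions named in the example are placeholders for your own code. As a minimal sketch of what three of them might look like, the block below uses only the standard library. The function bodies, the CSV layout, and the linear model are all assumptions made for illustration and are not part of CADET-RDM. Plotting is left out because it would need a third-party library.

```python
import csv
import random
from pathlib import Path


def generate_data(n_points=20, seed=0):
    """Return (x, y) pairs scattered around the line y = 2x + 1 (hypothetical data)."""
    rng = random.Random(seed)
    return [(float(x), 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)) for x in range(n_points)]


def write_data_to_file(data, filepath):
    """Write the (x, y) pairs to a CSV file, creating parent folders if needed."""
    filepath = Path(filepath)
    filepath.parent.mkdir(parents=True, exist_ok=True)
    with filepath.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["x", "y"])
        writer.writerows(data)


def analyse_data(data):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    s_xx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = s_xy / s_xx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}
```

Inside the tracking context, the path returned by repo.output_data() would be passed as filepath, so the file lands in the tracked output folder.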

Sharing Results

To share your project code and results with others, create remote repositories on a hosting service such as GitHub or GitLab. You need one remote for the project repo and one for the results repo.

Once created, the remotes need to be added to the local repositories.

cadet-rdm add-remote-to-repo git@<my_git_server.foo>:<project>.git
cadet-rdm --path_to_repo output add-remote-to-repo git@<my_git_server.foo>:<project>_output.git

or in Python:

repo = ProjectRepo()
repo.add_remote("git@<my_git_server.foo>:<project>.git")
repo.output_repo.add_remote("git@<my_git_server.foo>:<project>_output.git")

Once the remotes are configured, you can push all changes to both the project repo and the results repo with a single command:

# push all changes to the Project and Output repositories with one command:
repo.push()

Re-using results from previous iterations

Each result stored with CADET-RDM is given a unique branch name, formatted as: <timestamp>_<output_folder>_"from"_<active_project_branch>_<project_repo_hash[:7]>
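To make the naming pattern concrete, here is a small sketch that assembles a name of this shape. The function build_results_branch_name is hypothetical and not part of CADET-RDM. The timestamp layout (YYYY-MM-DD_HH-MM-SS) is also an assumption, and the hash is a made-up placeholder.

```python
from datetime import datetime


def build_results_branch_name(timestamp, output_folder, project_branch, project_hash):
    """Compose a branch name following the pattern described above (illustrative only)."""
    # Assumption: the timestamp is rendered as YYYY-MM-DD_HH-MM-SS.
    stamp = timestamp.strftime("%Y-%m-%d_%H-%M-%S")
    # Only the first seven characters of the project commit hash are used.
    return f"{stamp}_{output_folder}_from_{project_branch}_{project_hash[:7]}"


name = build_results_branch_name(
    timestamp=datetime(2024, 1, 2, 3, 4, 5),
    output_folder="output",
    project_branch="main",
    project_hash="0123456789abcdef0123456789abcdef01234567",  # placeholder hash
)
print(name)  # 2024-01-02_03-04-05_output_from_main_0123456
```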

With this branch name, previously generated data can be loaded in as input data for further calculations.

cached_array_path = repo.input_data(branch_name=branch_name, source_file_path="raw_data/data.csv")

Alternatively, using the auto-generated cache of previous results, CADET-RDM can infer the correct branch name from the path to the file within the cache:

cached_array_path = repo.input_data(source_file_path="output_cached/<branch_name>/raw_data/data.csv")
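The idea of reading the branch name from a cache path can be sketched with pathlib. This is not CADET-RDM's own implementation. The helper split_cached_path is hypothetical, and it assumes the layout <cache folder>/<branch_name>/<path within the results> shown in the call above.

```python
from pathlib import PurePosixPath


def split_cached_path(source_file_path, cache_folder="output_cached"):
    """Split a cache path into (branch_name, path relative to the results root)."""
    parts = PurePosixPath(source_file_path).parts
    # Expect at least: cache folder, branch name, and one path component.
    if len(parts) < 3 or parts[0] != cache_folder:
        raise ValueError(f"Not a path inside '{cache_folder}': {source_file_path}")
    branch_name = parts[1]
    relative_path = PurePosixPath(*parts[2:]).as_posix()
    return branch_name, relative_path


branch, rel_path = split_cached_path("output_cached/my_branch/raw_data/data.csv")
print(branch)    # my_branch
print(rel_path)  # raw_data/data.csv
```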

Use CADET-RDM from the CLI

Executing scripts

You can execute Python files or arbitrary commands using the CLI:

cd path/to/your/project
cadet-rdm run-python-file <path/to/file> "commit message for the results"
cadet-rdm run-command "command as it would be run" "commit message for the results"

For the run-command option, the command must be given in quotes, for example:

cadet-rdm run-command "python example_file.py" "commit message for the results"
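The reason for the quotes is shell word splitting: without them, every word becomes its own argument. The sketch below shows the difference using shlex from the standard library, which follows POSIX shell rules. Other shells, such as the Windows command prompt, may split differently.

```python
import shlex

quoted = 'cadet-rdm run-command "python example_file.py" "commit message for the results"'
unquoted = "cadet-rdm run-command python example_file.py commit message for the results"

# With quotes, the command and the commit message each arrive as one argument.
quoted_args = shlex.split(quoted)
print(quoted_args)
# ['cadet-rdm', 'run-command', 'python example_file.py', 'commit message for the results']

# Without quotes, every word is split into a separate argument.
unquoted_args = shlex.split(unquoted)
print(len(unquoted_args))  # 9
```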

Using results from another repository

You can import results from another repository into your project using the CLI:

cd path/to/your/project
cadet-rdm import-remote-repo <URL> <branch_name>
cadet-rdm import-remote-repo <URL> <branch_name> --target_repo_location <path/to/where/you/want/it>

This will store the URL, branch_name and location in the .cadet-rdm-cache.json file, like this:

{
  "__example/path/to/repo__": {
    "source_repo_location": "git@jugit.fz-juelich.de:IBG-1/ModSim/cadet/agile_cadet_rdm_presentation_output.git",
    "branch_name": "output_from_master_3910c84_2023-10-25_00-17-23",
    "commit_hash": "6e3c26527999036e9490d2d86251258fe81d46dc"
  }
}

You can later re-load all remote repositories listed in .cadet-rdm-cache.json with:

cadet-rdm fill-data-from-cadet-rdm-json
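Because the cache file is plain JSON, it can also be inspected with the standard library, for example to check which results a project depends on. The helper list_cached_imports below is a hypothetical sketch and not part of CADET-RDM. It assumes every entry has the three keys shown in the example file above.

```python
import json
from pathlib import Path


def list_cached_imports(cache_file=".cadet-rdm-cache.json"):
    """Return (location, source URL, branch name, commit hash) for each cache entry."""
    cache = json.loads(Path(cache_file).read_text())
    return [
        (
            location,
            entry["source_repo_location"],
            entry["branch_name"],
            entry["commit_hash"],
        )
        for location, entry in cache.items()
        if isinstance(entry, dict)  # skip anything that is not an entry mapping
    ]
```

Calling list_cached_imports() from the project root would return one tuple per imported repository.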

Cloning from remote

To clone a CADET-RDM repo to a new location, use cadet-rdm clone instead of git clone:

cadet-rdm clone <URL> <path/to/repo>
