Reproducibility simplified.

Calkit

Calkit makes it easy to create "single button" reproducible research projects.

Instead of a loosely related collection of files and manual instructions, turn your project into a version-controlled, self-contained "calculation kit," tying together all phases or stages of the project: data collection, analysis, visualization, and writing, each of which can make use of the latest and greatest computational tools and languages. In other words, you, your collaborators, and readers will be able to go from raw data to research article with a single command, improving efficiency via faster iteration cycle time, reducing the likelihood of mistakes, and allowing others to more effectively build upon your work.

Calkit makes this level of automation possible without extensive software engineering expertise by providing a project framework and toolset that unifies and simplifies the use of powerful enabling technologies like Git, DVC, Conda, Docker, and more, while guiding users away from common reproducibility pitfalls.

Features

  • A declarative pipeline that guides users to define an environment for every stage, so long lists of instructions in a README and "but it works on my machine" are things of the past.
  • A command line interface (CLI) to run the project's pipeline and verify that it's reproducible, regenerating outputs as needed and ensuring all computational environments (e.g., Conda, Docker, uv, Julia) match their specification.
  • A schema to store structured metadata describing the project's important outputs (in its calkit.yaml file) and how they are created (its computational environments and pipeline).
  • CLI commands that simplify keeping code, text, and larger data files backed up in the same project repo using both Git and DVC.
  • A complementary self-hostable and GitHub-integrated cloud system to facilitate backup, collaboration, and sharing throughout the entire research lifecycle.
  • Overleaf integration, so code, data, and LaTeX documents can all live in the same repo and be part of a single pipeline (no more manual uploads!).

Installation

On Linux, macOS, or Windows Git Bash, install Calkit and uv (if not already installed) with:

curl -LsSf install.calkit.org | sh

Or with Windows Command Prompt or PowerShell:

powershell -ExecutionPolicy ByPass -c "irm install-ps1.calkit.org | iex"

If you already have uv installed, install Calkit with:

uv tool install calkit-python

You can also install with your system Python:

pip install calkit-python

To use Calkit effectively, ensure Git is installed and properly configured. You may also want to install Docker, since it is the default method by which LaTeX environments are created. If you want to use the Calkit Cloud as a DVC remote for collaboration and backup, you can set up cloud integration.
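The prerequisites above can be checked before a first run. The following is a minimal sketch, not part of Calkit: the function names are hypothetical, and treating "properly configured" as "user.name and user.email are set" is an assumption.

```shell
#!/bin/sh
# Hypothetical helpers; not part of Calkit.

# Print the name of each given tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Succeed only if Git has both a user name and an email configured.
# Assumption: this is what "properly configured" means here.
git_identity_configured() {
    [ -n "$(git config user.name)" ] && [ -n "$(git config user.email)" ]
}

# Example usage:
#   missing_tools git docker uv
#   git_identity_configured || echo "Set user.name and user.email with git config"
```

Docker appears in the example only because it is the default for LaTeX environments; drop it from the list if your project does not need it.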

Use without installing

If you want to use Calkit without installing it, you can use uv's uvx command to run it directly:

uvx calk9 --help

Calkit Assistant

For Windows users, the Calkit Assistant app is the easiest way to get everything set up and ready to work in VS Code, which can then be used as the primary app for working on all scientific or analytical computing projects.

Quickstart

From an existing project

If you want to use Calkit with an existing project, navigate into its working directory and use the xr command to start executing and recording your scripts, notebooks, LaTeX files, etc., as reproducible pipeline stages. For example:

calkit xr scripts/analyze.py

calkit xr notebooks/plot.ipynb

calkit xr paper/main.tex

Calkit will attempt to detect environments, inputs, and outputs and save them in calkit.yaml. If successful, you'll be able to run the full pipeline with:

calkit run

Next, change a file, such as a script, and look at the output of calkit status. You'll see that the pipeline has an out-of-date stage:

---------------------------- Pipeline ----------------------------
analyze:
        changed deps:
                modified:           scripts/analyze.py

This can be fixed with another call to calkit run.
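The out-of-date report rests on a general idea: record a fingerprint of each dependency when a stage runs, then compare it against the file's current content. Below is a minimal sketch of that idea only. It is not Calkit's or DVC's actual implementation; the function names and lock-file layout are hypothetical, and POSIX cksum stands in for a cryptographic hash.

```shell
#!/bin/sh
# Illustrative only: hypothetical names and lock-file format.

# Print a content checksum for a file (cksum prints a CRC and a byte count).
checksum() {
    cksum < "$1"
}

# Record the current checksum of each dependency, one "path<TAB>checksum" per line.
record_deps() {
    lock_file=$1
    shift
    : > "$lock_file"
    for dep in "$@"; do
        printf '%s\t%s\n' "$dep" "$(checksum "$dep")" >> "$lock_file"
    done
}

# Print every dependency whose content no longer matches its recorded checksum.
changed_deps() {
    lock_file=$1
    while IFS="$(printf '\t')" read -r dep recorded; do
        [ "$(checksum "$dep")" = "$recorded" ] || echo "modified: $dep"
    done < "$lock_file"
}
```

Calling record_deps after a successful run and changed_deps later reports the files that were edited in between, which is the kind of information the status output summarizes per stage.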

You can save (add and commit) all changes with:

calkit save -am "Add to pipeline"

Fresh from a Calkit project template

Create a new project from the calkit/example-basic template with:

calkit new project my-research \
    --title "My research" \
    --template calkit/example-basic \
    --cloud

Note that the --cloud flag requires cloud integration to be set up. Omit it if the project doesn't need to be backed up to the cloud or shared with collaborators; cloud integration can also be set up later.

Next, move into the project folder and run the pipeline, which consists of several stages defined in calkit.yaml:

cd my-research
calkit run

Next, edit a script or LaTeX file and run calkit status to see which stages are out-of-date. For example:

---------------------------- Pipeline ----------------------------
build-paper:
        changed deps:
                modified:           paper/paper.tex

Execute calkit run again to bring everything up-to-date.

To back up or save the project, call:

calkit save -am "Run pipeline"
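The run and save steps can be chained so that changes are committed only after the pipeline succeeds. This is a minimal sketch that uses only the two commands shown above. The function name and the CALKIT override variable are hypothetical, and it assumes calkit run exits with a nonzero status when a stage fails.

```shell
#!/bin/sh
# CALKIT is a hypothetical override so the function can be dry-run with a stub.
CALKIT=${CALKIT:-calkit}

# Run the pipeline, then save with the given commit message only on success.
# Assumption: "calkit run" returns a nonzero exit status on failure.
run_and_save() {
    message=$1
    if "$CALKIT" run; then
        "$CALKIT" save -am "$message"
    else
        echo "Pipeline failed; nothing saved." >&2
        return 1
    fi
}

# Example usage:
#   run_and_save "Run pipeline"
```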

Get involved

We welcome all kinds of contributions! See CONTRIBUTING.md to learn how to get involved.
