Easy management of source data, intermediate data, and results for data science projects
Project description
Data Workspaces is an open source framework for maintaining the state of a data science project, including data sets, intermediate data, results, and code. It supports reproducibility through snapshotting and lineage models, and collaboration through a push/pull model inspired by source control systems like Git.
Data Workspaces is installed as a Python 3 package and provides a Git-like command line interface and programming APIs. Specific data science tools and workflows are supported through extensions called kits. Currently, this includes Scikit-learn, TensorFlow, and Jupyter Notebooks. The goal is to provide the reproducibility and collaboration benefits with minimal changes to your current projects and processes.
Data Workspaces runs on Unix-like systems, including Linux and macOS, and on Windows via the Windows Subsystem for Linux.
Quick Start
Please see the Quickstart Section of the documentation.
Documentation
The documentation is available here: https://data-workspaces-core.readthedocs.io/en/latest/. The source for the documentation is under docs. To build it locally, install Sphinx and run the following:
    cd docs
    pip install -r requirements.txt  # extras needed to build the docs
    make html
To view the local documentation, open the file docs/_build/html/index.html in your browser.
License
This code is copyright 2018 - 2020 by the Max Planck Institute for Software Systems and Data-ken Research. It is licensed under the Apache 2.0 license. See the file LICENSE.txt for details.
Hashes for dataworkspaces-1.2.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 9255a9d1062519f4991d8ec15b41c4b82753fb0b0793554bb5a80b5c7c4a01d5
MD5 | adabdaf9ca90f83cfee68bfb36809166
BLAKE2b-256 | 2865304778415d6993087edcd72dcb8fa0491a8530ccb3bbb454f409f0cf5f31
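The digests listed above can be used to check that a downloaded wheel is intact. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `verify_wheel` are hypothetical, and the script assumes the wheel has been saved in the current directory under its published file name.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for dataworkspaces-1.2.0-py3-none-any.whl
EXPECTED_SHA256 = "9255a9d1062519f4991d8ec15b41c4b82753fb0b0793554bb5a80b5c7c4a01d5"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest matches the expected one."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the file was saved.
    wheel = Path("dataworkspaces-1.2.0-py3-none-any.whl")
    if wheel.exists():
        print("OK" if verify_wheel(wheel) else "MISMATCH")
    else:
        print(f"{wheel} not found; download it first")
```

Reading the file in chunks keeps memory use constant regardless of file size. Of the three listed algorithms, SHA256 is the one to prefer for this check, since MD5 is not collision resistant.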