Easy management of source data, intermediate data, and results for data science projects
Project description
Data Workspaces is an open source framework for maintaining the state of a data science project, including data sets, intermediate data, results, and code. It supports reproducibility through snapshotting and lineage models, and collaboration through a push/pull model inspired by source control systems like Git.
Data Workspaces is installed as a Python 3 package and provides a Git-like command line interface and programming APIs. Specific data science tools and workflows are supported through extensions called kits. Currently, kits are available for scikit-learn, TensorFlow, and Jupyter Notebooks. The goal is to provide the reproducibility and collaboration benefits with minimal changes to your current projects and processes.
Data Workspaces runs on Unix-like systems, including Linux and macOS, and on Windows via the Windows Subsystem for Linux.
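To make the snapshotting idea concrete, here is a minimal sketch that records a content hash for every file in a directory and compares two such records. It is an illustration only and does not use or reproduce the Data Workspaces API; the function names (take_snapshot, snapshot_id, changed_files, demo) are hypothetical and chosen for this example.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def take_snapshot(root: Path) -> dict:
    """Map each file under root (relative POSIX path) to its content digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def snapshot_id(manifest: dict) -> str:
    """Derive a stable identifier for a snapshot from its manifest."""
    payload = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def changed_files(old: dict, new: dict) -> list:
    """List paths that were added, removed, or modified between two snapshots."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


def demo() -> tuple:
    """Snapshot a throwaway directory before and after editing it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "data.csv").write_text("x,y\n1,2\n")
        before = take_snapshot(root)
        (root / "data.csv").write_text("x,y\n1,3\n")  # modify an input
        (root / "results.json").write_text("{}")      # add a result file
        after = take_snapshot(root)
        return before, after, changed_files(before, after)


if __name__ == "__main__":
    before, after, changed = demo()
    print(snapshot_id(before)[:12], "->", snapshot_id(after)[:12])
    print("changed:", changed)
```

Because the identifier is derived from file contents alone, two workspaces with identical files get the same identifier, which is the property that makes a snapshot useful for reproducing a result. The real framework also tracks lineage and remote synchronization, which this sketch leaves out.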
Quick Start
Please see the Quickstart Section of the documentation.
Documentation
The documentation is available here: https://data-workspaces-core.readthedocs.io/en/latest/. The source for the documentation is under docs. To build it locally, install Sphinx and run the following:
    cd docs
    pip install -r requirements.txt # extras needed to build the docs
    make html
To view the local documentation, open the file docs/_build/html/index.html in your browser.
License
This code is copyright 2018 - 2021 by the Max Planck Institute for Software Systems and Benedat LLC. It is licensed under the Apache 2.0 license. See the file LICENSE.txt for details.
File details
Details for the file dataworkspaces-1.6.0.tar.gz.
File metadata
- Download URL: dataworkspaces-1.6.0.tar.gz
- Upload date:
- Size: 154.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/24.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.31.1 importlib-metadata/4.8.2 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.7.11
File hashes
Algorithm | Hash digest
---|---
SHA256 | e6057958784abaab0dacfbd6314a8484e8a7e2f678a6fe08e577052f343da76c
MD5 | 258e675da6ede21084fc9b1053ab6a6b
BLAKE2b-256 | 74ab5eefe538b100ccebe97838d2c8eeda120332c4efbd30bd31a56b96fb64af
File details
Details for the file dataworkspaces-1.6.0-py3-none-any.whl.
File metadata
- Download URL: dataworkspaces-1.6.0-py3-none-any.whl
- Upload date:
- Size: 184.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/24.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.31.1 importlib-metadata/4.8.2 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.7.11
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1e9266388725313a135b4ed9bdcea93c7e4e1e9910d812799b18303ae62e6a9f
MD5 | d2363a4515694225dfd701f3ccb9e8db
BLAKE2b-256 | d359929cb77053d7f38fef9fa29d26b3bde491227377adc48ec133ef92fe028e
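A downloaded file can be checked against the SHA256 values listed above before installing it. The following is a minimal sketch using only the Python standard library. The helper names (sha256_of, matches_published_digest, demo) are hypothetical, and the demo hashes a small stand-in file because the real archive is not available here.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 hex digest of a file without loading it whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Return True if the file's SHA-256 equals the published hex digest."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)


def demo() -> tuple:
    """Check a stand-in file against a correct and an incorrect digest."""
    payload = b"stand-in bytes, not the real archive"
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "example-download.bin"
        path.write_bytes(payload)
        good = hashlib.sha256(payload).hexdigest()
        bad = "0" * 64
        return (
            matches_published_digest(path, good),
            matches_published_digest(path, bad),
        )


if __name__ == "__main__":
    print(demo())
```

For the real files, pass the path of the downloaded dataworkspaces-1.6.0.tar.gz or dataworkspaces-1.6.0-py3-none-any.whl together with the matching SHA256 value from its table. A False result means the file differs from the one that was published and should not be installed.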