
Easy management of source data, intermediate data, and results for data science projects

Project description

Data Workspaces is an open source framework for maintaining the state of a data science project, including data sets, intermediate data, results, and code. It supports reproducibility through snapshotting and lineage models, and collaboration through a push/pull model inspired by source control systems like Git.

Data Workspaces is installed as a Python 3 package and provides a Git-like command line interface and programming APIs. Specific data science tools and workflows are supported through extensions called kits. Kits are currently available for Scikit-learn, TensorFlow, and Jupyter Notebooks. The goal is to provide the benefits of reproducibility and collaboration with minimal changes to your current projects and processes.

Data Workspaces runs on Unix-like systems, including Linux and macOS, and on Windows via the Windows Subsystem for Linux.


Quick Start

Please see the Quickstart Section of the documentation.

Documentation

The documentation is available here: https://data-workspaces-core.readthedocs.io/en/latest/. The source for the documentation is under docs. To build it locally, install Sphinx and run the following:

cd docs
pip install -r requirements.txt # extras needed to build the docs
make html

To view the local documentation, open the file docs/_build/html/index.html in your browser.

License

This code is copyright 2018 - 2020 by the Max Planck Institute for Software Systems and Data-ken Research. It is licensed under the Apache 2.0 license. See the file LICENSE.txt for details.



Download files


Source Distribution

dataworkspaces-1.2.2.tar.gz (127.0 kB)

Built Distribution

dataworkspaces-1.2.2-py3-none-any.whl (157.1 kB)

File details

Details for the file dataworkspaces-1.2.2.tar.gz.

File metadata

  • Download URL: dataworkspaces-1.2.2.tar.gz
  • Size: 127.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.7.3 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.2

File hashes

Hashes for dataworkspaces-1.2.2.tar.gz

Algorithm     Hash digest
SHA256        3f0f2e5f276b4aa9a9a69bbae42963b1b1c2b19a0eace73d4d01046af547767a
MD5           548fbb39c3eac66d7b3a9e697d5d37fd
BLAKE2b-256   9ed795b563ffff7072597a91b390fda0b33066c45256471485209ce61df583dd
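
As a minimal sketch of how the digests above might be used, the following Python snippet recomputes the SHA256 digest of a downloaded file with the standard-library hashlib module and compares it to the published value. The helper names (sha256_of_file, matches_published_hash) and the assumption that the archive sits in the current directory are illustrative only and are not part of Data Workspaces.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for dataworkspaces-1.2.2.tar.gz.
EXPECTED_SHA256 = "3f0f2e5f276b4aa9a9a69bbae42963b1b1c2b19a0eace73d4d01046af547767a"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives are never loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed download location (hypothetical): the current working directory.
    archive = Path("dataworkspaces-1.2.2.tar.gz")
    if archive.exists():
        print("OK" if matches_published_hash(archive) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch would suggest a corrupted or altered download. The same check applies to the wheel by passing its file name and its own SHA256 digest as the expected value.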


File details

Details for the file dataworkspaces-1.2.2-py3-none-any.whl.

File metadata

  • Download URL: dataworkspaces-1.2.2-py3-none-any.whl
  • Size: 157.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.7.3 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.2

File hashes

Hashes for dataworkspaces-1.2.2-py3-none-any.whl

Algorithm     Hash digest
SHA256        b1dc863ba8c40cac84c46f25eb2720818e0015134acd469bbb249b353a1400a6
MD5           b6644eeb6c734891b8b45d9d7dc4d76b
BLAKE2b-256   39ac0e6f9fae9c3db2131918a68ab6f551eef34cec157e9b7949821d68a9a1f8

