
Data science collaboration tool based on iPython notebooks.

Project description

DAGpy is a data science collaboration tool based on iPython notebooks enabling data science teams to:

  • easily collaborate by branching out of others’ notebooks

  • minimize code duplication

  • give a clean overview of the project

  • cache intermediate outputs so team members can use them without re-evaluation

  • automate the process of code execution upon data changes or on schedule

  • provide a clean interface to the data visualization dashboard designers and developers
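The caching item in the list above can be pictured with a small sketch. This is not DAGpy's implementation: the function name `cached_run`, the on-disk layout, and keying the cache on a hash of the block's source text are all assumptions made for illustration, and the standard-library `pickle` stands in for `dill` so the example has no extra dependencies.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cached_run(cache_dir, source, compute):
    """Run `compute` only if no result is cached for this exact `source` text.

    Hypothetical helper: returns (result, was_cache_hit).
    """
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    path = Path(cache_dir) / f"{key}.pkl"
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh), True  # cache hit: skip re-evaluation
    result = compute()
    with path.open("wb") as fh:
        pickle.dump(result, fh)
    return result, False  # cache miss: computed and stored


calls = []


def expensive():
    """Stand-in for a slow block; records how often it really runs."""
    calls.append(1)
    return sum(range(1000))


with tempfile.TemporaryDirectory() as tmp:
    first, hit1 = cached_run(tmp, "sum(range(1000))", expensive)
    second, hit2 = cached_run(tmp, "sum(range(1000))", expensive)

print(first, hit1, second, hit2, len(calls))
```

The second call returns the stored value without running `expensive` again, which is the behaviour a teammate relies on when reusing someone else's intermediate output.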

DAGpy manages a DAG (directed acyclic graph) of blocks of code, with each block being a sequence of iPython notebook cells, together with their outputs. It is designed to work seamlessly with popular version control systems such as git, and can be run locally or as a server application.
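To make the DAG idea concrete, here is a minimal sketch of how a graph of blocks determines execution order and which blocks become stale after a change. It does not use DAGpy itself: the block names, the `parents` mapping, and both helper functions are hypothetical, and the standard-library `graphlib` module (Python 3.9+) is used instead of whatever DAGpy does internally.

```python
from graphlib import TopologicalSorter

# Hypothetical project: each block maps to the set of blocks it depends on.
parents = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}


def execution_order(parents):
    """Return the blocks in an order where every parent precedes its children."""
    return list(TopologicalSorter(parents).static_order())


def downstream(parents, changed):
    """Return every block that depends, directly or indirectly, on `changed`."""
    # Invert the parent mapping so we can walk from a block to its children.
    children = {name: set() for name in parents}
    for name, deps in parents.items():
        for dep in deps:
            children[dep].add(name)
    seen, stack = set(), [changed]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


order = execution_order(parents)
stale = downstream(parents, "B")
print("run order:", order)
print("re-run after B changes:", sorted(stale))
```

Because the graph is acyclic, a valid run order always exists, and a change to one block only invalidates the blocks downstream of it.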

GitHub: github.com/ibestvina/dagpy/.

Author: Ivan Bestvina

Example project

To play around with the example project, you can:

  • view the project DAG: dagpy view

  • run all the blocks: dagpy execute -a

  • add blocks through flows (with block B as a parent) and run them automatically: dagpy makeflow B -r

  • commit the changes: dagpy submitflow dagpy_flow.ipynb

  • explore other DAGpy options with dagpy -h
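If you would rather script the walkthrough than type each command, the steps above can be collected in a small Python wrapper. This is only a sketch: the commands are copied verbatim from the list, while the `run_steps` helper and its `dry_run` flag are hypothetical. Some steps may be interactive, so the sketch defaults to a dry run that only prints what it would execute.

```python
import shlex
import subprocess

# Commands copied from the list above; block "B" and dagpy_flow.ipynb
# belong to the example project.
EXAMPLE_STEPS = [
    "dagpy view",
    "dagpy execute -a",
    "dagpy makeflow B -r",
    "dagpy submitflow dagpy_flow.ipynb",
]


def run_steps(steps, dry_run=True):
    """Return the argv list for each step; execute them only if dry_run is False."""
    argv_list = [shlex.split(step) for step in steps]
    if not dry_run:
        for argv in argv_list:
            # check=True stops at the first command that exits with an error.
            subprocess.run(argv, check=True)
    return argv_list


planned = run_steps(EXAMPLE_STEPS)
for argv in planned:
    print(" ".join(argv))
```

Calling `run_steps(EXAMPLE_STEPS, dry_run=False)` from the example project directory would run the commands for real, assuming `dagpy` is installed and on your `PATH`.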

Please note that executing each notebook carries a significant overhead of over a second, because a kernel must be started for each one. In the future, we plan to add support for plain Python (non-notebook) blocks. These would also be edited through a flow notebook view, but would be saved as .py scripts and executed without noticeable overhead.
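The source of this overhead can be illustrated by analogy, without Jupyter at all. The sketch below times the same snippet run inside the current interpreter and in a freshly started Python process. A bare interpreter is only a stand-in for a kernel here (a kernel typically does more work at startup), and the snippet and function names are made up for the example.

```python
import subprocess
import sys
import time

# Hypothetical block body; any short piece of code would do.
SNIPPET = "x = sum(range(1000))"


def time_in_process(code):
    """Time running `code` inside the already running interpreter."""
    start = time.perf_counter()
    exec(code, {})
    return time.perf_counter() - start


def time_fresh_process(code):
    """Time running `code` in a newly started Python process."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", code], check=True)
    return time.perf_counter() - start


in_process_seconds = time_in_process(SNIPPET)
fresh_process_seconds = time_fresh_process(SNIPPET)
print(f"in-process:    {in_process_seconds:.4f} s")
print(f"fresh process: {fresh_process_seconds:.4f} s")
```

The exact numbers depend on the machine, but for a block this small the startup cost accounts for most of the time in the second measurement.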

Dependencies:

  • python 3

  • jupyter

  • dill

  • networkx, matplotlib (for DAG view)


Source Distribution

dagpy-0.3.2.zip (17.5 kB)

File hashes

Hashes for dagpy-0.3.2.zip
Algorithm Hash digest
SHA256 f22aadafb3bfb9ff4f296b3a483106e9d41458ab641b4837a8f0a42e24ee9925
MD5 28371365fdba2fcc3e6125706fd4f1cc
BLAKE2b-256 af6d8f574629b84bddc4b8ca993fed71fa157efcb1c95e3716fd259a32667d87

