
An extensible ML workflow framework built for data scientists and ML engineers.

Graphbook

The ML workflow framework

Overview · Current Features · Getting Started · Collaboration

Overview

Graphbook is a framework for building efficient, visual, DAG-structured ML workflows composed of nodes written in Python. It provides common ML processing features such as multiprocessing I/O and automatic batching, along with a web-based UI for assembling, monitoring, and executing data processing workflows. It can be used to prepare training data for custom ML models, to experiment with custom-trained or off-the-shelf models, and to build ML-based ETL applications. Custom nodes are written in Python, and, as a framework, Graphbook calls the lifecycle methods defined on those nodes.
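To make the "framework calls lifecycle methods on nodes" idea concrete, here is a minimal, generic sketch of that pattern. It is not Graphbook's API: the `Node` base class, the `on_start`/`on_item`/`on_end` hooks, and `run_workflow` are all hypothetical names chosen for illustration. See the docs for the real node interface.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Node:
    """Hypothetical base class for illustration; NOT Graphbook's actual API."""

    def __init__(self, name):
        self.name = name

    def on_start(self):
        """Called once before the node receives any items."""

    def on_item(self, item):
        """Called for every incoming item; returns the transformed item."""
        return item

    def on_end(self):
        """Called once after the node has processed all items."""


class Double(Node):
    def on_item(self, item):
        return item * 2


class AddOne(Node):
    def on_item(self, item):
        return item + 1


def run_workflow(nodes, edges, source_items):
    """Run nodes in dependency order, invoking their lifecycle hooks.

    nodes: dict mapping node name -> Node instance
    edges: dict mapping node name -> set of upstream node names
    source_items: items fed to nodes that have no upstream nodes
    """
    outputs = {}
    for name in TopologicalSorter(edges).static_order():
        node = nodes[name]
        upstream = edges.get(name, set())
        if upstream:
            # Concatenate the outputs of all upstream nodes.
            inputs = [x for u in sorted(upstream) for x in outputs[u]]
        else:
            inputs = source_items
        node.on_start()
        outputs[name] = [node.on_item(x) for x in inputs]
        node.on_end()
    return outputs
```

The node author only overrides the hooks; the runner decides when each hook is called and in what order the nodes execute.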

Current Features

  • Graph-based visual editor for experimenting with and creating complex ML workflows
  • Caches outputs and only re-executes the parts of the workflow that change between executions
  • UI monitoring components for logs and outputs per node
  • Custom nodes buildable in Python
  • Automatic batching for PyTorch tensors
  • Multiprocessing I/O to and from disk and network
  • Customizable multiprocessing functions
  • Ability to execute entire graphs, or individual subgraphs/nodes
  • Ability to execute singular batches of data
  • Ability to pause graph execution
  • Basic nodes for filtering, loading, and saving outputs
  • Node grouping and subflows
  • Autosaving and shareable serialized workflow files
  • Registers node code changes without needing a restart
  • Monitorable CPU and GPU resource usage
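As an illustration of the output caching listed above, the following is a minimal sketch of one way a workflow runner can skip steps whose inputs and parameters have not changed. It is an assumption-laden toy, not Graphbook's implementation: `cache_key`, `CachedRunner`, and the linear `steps` format are hypothetical, and the sketch does not detect changes to a step's code.

```python
import hashlib


def cache_key(step_name, params, upstream_keys):
    """Derive a key from the step's name, its parameters, and its inputs' keys."""
    payload = repr((step_name, sorted(params.items()), sorted(upstream_keys)))
    return hashlib.sha256(payload.encode()).hexdigest()


class CachedRunner:
    """Hypothetical runner for a linear chain of steps (the simplest DAG)."""

    def __init__(self):
        self.cache = {}     # cache key -> cached output
        self.executed = []  # names of steps actually run in the latest call

    def run(self, steps, data):
        """steps: list of (name, fn, params) tuples; fn is called as fn(value, **params)."""
        self.executed = []
        key = cache_key("input", {}, [repr(data)])
        value = data
        for name, fn, params in steps:
            # A step's key depends on its own params and on everything upstream,
            # so changing a step invalidates it and all downstream steps only.
            key = cache_key(name, params, [key])
            if key not in self.cache:
                self.cache[key] = fn(value, **params)
                self.executed.append(name)
            value = self.cache[key]
        return value


def scale(xs, factor):
    return [x * factor for x in xs]


def shift(xs, offset):
    return [x + offset for x in xs]
```

Running the same steps twice executes nothing the second time, and changing only the parameters of `shift` re-runs `shift` while reusing the cached output of `scale`.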

Getting Started

Install from PyPI

  1. pip install graphbook
  2. graphbook
  3. Visit http://localhost:8007
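If the page does not load, a quick script can confirm whether anything is answering on that address. This is a small sketch using only the Python standard library. The helper name `is_up` is hypothetical, and the only detail taken from the steps above is the URL http://localhost:8007.

```python
import urllib.error
import urllib.request


def is_up(url, timeout=2.0):
    """Return True if an HTTP server answers at url, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status code.
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar errors.
        return False


if __name__ == "__main__":
    print("Graphbook UI reachable:", is_up("http://localhost:8007"))
```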

Install with Docker

  1. Pull and run the image
    docker run --rm -p 8005:8005 -p 8006:8006 -p 8007:8007 -v $PWD/workflows:/app/workflows rsamf/graphbook:latest
    
  2. Visit http://localhost:8007

Visit the docs to learn more about creating custom nodes and workflows with Graphbook.

Collaboration

This is a guide on how to get started developing Graphbook. If you are simply using Graphbook, view the Getting Started section.

Run Graphbook in Development Mode

The steps below use Poetry, but any other virtual environment tool works as well.

  1. Clone the repo and cd graphbook
  2. poetry install --with dev
  3. poetry shell
  4. python graphbook/server.py
  5. cd web
  6. npm install
  7. npm run dev
