Persist is a JupyterLab extension to enable persistent interactive visualizations in JupyterLab notebooks.

Persist

Persistent and Reusable Interactions in Computational Notebooks

This repository contains the source code for the Persist (PyPI) extension.

Persist is a JupyterLab extension to enable persistent interactive outputs in JupyterLab notebooks. Check out the introductory video below.

https://github.com/visdesignlab/persist/assets/14944083/c6a9347b-7c93-4d0d-9e60-e10707578327

Watch on YouTube with CC

Getting Started

Requirements

- JupyterLab >= 4.0.0 or Jupyter Notebook >= 7.0.0
- pandas >= 0.25
- altair >= 5
- ipywidgets
- anywidget
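
A quick way to confirm that an environment meets these minimums is to query the installed package metadata. The sketch below is illustrative and not part of Persist: `REQUIREMENTS`, `parse_version`, and `check_requirements` are hypothetical names, the version parsing is deliberately simplified (it keeps only the leading integers of each segment), and the either/or JupyterLab or Jupyter Notebook requirement is left out for brevity.

```python
import re
from importlib import metadata

# Minimum versions copied from the requirements list; None means "any version".
# (Hypothetical table for illustration; the JupyterLab/Notebook front end is omitted.)
REQUIREMENTS = {
    "pandas": (0, 25),
    "altair": (5,),
    "ipywidgets": None,
    "anywidget": None,
}


def parse_version(text):
    """Turn '5.0.1' or '2.2.0rc1' into a tuple of leading integers, e.g. (5, 0, 1)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_requirements(requirements=REQUIREMENTS):
    """Map each package name to 'ok', 'missing', or 'too old (found X)'."""
    report = {}
    for name, minimum in requirements.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum is not None and parse_version(found) < minimum:
            report[name] = f"too old (found {found})"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for package, status in check_requirements().items():
        print(f"{package}: {status}")
```

Running the script in the environment that backs the Jupyter kernel prints one status line per package, which helps distinguish a missing dependency from an outdated one.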

Install

To install the extension, execute:

pip install persist_ext

If the Jupyter server was already running, you might have to reload the browser page and restart the kernel.

Uninstall

To remove the extension, execute:

pip uninstall persist_ext

Example

After installing the extension, you can use the following code snippet to create a Persist-enabled interactive data table.


TODO:

  • describe a simple example to use persist.
  • link to a notebook that introduces persist and altair
  • link to the documentation

Persist and Vega-Altair charts

Persist works with Vega-Altair charts directly for the most part. Vega-Altair and Vega-Lite offer multiple ways to write a specification; however, Persist has certain requirements that must be met:

  • The selection parameters in the chart should be named. By default, Vega-Altair generates a selection parameter name with an auto-incremented numeric suffix, and that suffix keeps incrementing each time the cell is re-executed. Persist relies on consistent names to replay the interactions, so passing the name parameter explicitly keeps the name stable and allows Persist to work reliably.

  • The point selections should have at least the fields attribute specified. Vega-Altair supports selections without fields by using the auto-generated indices to define selections. The indices are generated with the default order of rows in the source dataset. Using the indices directly for selection can cause Persist to operate on incorrect rows if the source dataset order changes.

  • Dealing with datetime in Pandas is challenging. To standardize how datetime conversion takes place within Vega-Lite and within Pandas when using Vega-Altair, the TimeUnit transforms and encodings must be specified in UTC. For example, month(Date) should be utcmonth(Date).
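
The three requirements above can be checked mechanically on the Vega-Lite dictionary that `chart.to_dict()` produces. The following is a minimal sketch, not part of Persist: `check_spec` and `find_time_units` are hypothetical helpers, the `param_<number>` pattern for auto-generated names is an assumption about Altair 5's naming, and only top-level `params` and string-valued `timeUnit` entries are inspected. In Altair itself, the fix is typically to pass `name=` and `fields=` to `alt.selection_point` and to use the `utc`-prefixed time units.

```python
import re

# Assumption: Altair 5 auto-generates parameter names such as "param_1", "param_2", ...
AUTO_NAME = re.compile(r"^param_\d+$")


def find_time_units(node):
    """Yield every string-valued timeUnit found anywhere in a nested spec."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "timeUnit" and isinstance(value, str):
                yield value
            else:
                yield from find_time_units(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_time_units(item)


def check_spec(spec):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for param in spec.get("params", []):
        name = param.get("name", "")
        if AUTO_NAME.match(name):
            problems.append(f"selection '{name}' looks auto-named; pass name=...")
        # "select" may be a plain string ("point") or a dict with a "type" key.
        select = param.get("select")
        kind = select.get("type") if isinstance(select, dict) else select
        fields = select.get("fields") if isinstance(select, dict) else None
        if kind == "point" and not fields:
            problems.append(f"point selection '{name}' has no fields")
    for unit in find_time_units(spec):
        if not unit.startswith("utc"):
            problems.append(f"timeUnit '{unit}' is not UTC; use 'utc{unit}'")
    return problems


# A spec that satisfies all three requirements.
good = {
    "params": [
        {"name": "origin_select", "select": {"type": "point", "fields": ["Origin"]}}
    ],
    "encoding": {"x": {"field": "Date", "timeUnit": "utcmonth", "type": "temporal"}},
}

# A spec that violates all three: auto-style name, no fields, local-time unit.
bad = {
    "params": [{"name": "param_1", "select": {"type": "point"}}],
    "encoding": {"x": {"field": "Date", "timeUnit": "month", "type": "temporal"}},
}

if __name__ == "__main__":
    for problem in check_spec(bad):
        print(problem)
```

Here `check_spec(good)` returns an empty list, while `check_spec(bad)` reports one problem per violated requirement.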

Publication

Persist is developed as part of a publication and will appear in EuroVis 2024.

Teaser image from the pre-print. The figure shows the high-level workflow of the Persist technique.

Supplementary Material

Supplementary material including example notebooks, walkthrough notebooks, notebooks used in the study (including participant notebooks) and the analysis notebooks can be accessed here.

Abstract

Computational notebooks, such as Jupyter, support rich data visualization. However, even when visualizations in notebooks are interactive, they still are a dead end: Interactive data manipulations, such as selections, applying labels, filters, categorizations, or fixes to column or cell values, could be efficiently applied in interactive visual components, but interactive components typically cannot manipulate Python data structures. Furthermore, actions performed in interactive plots are volatile, i.e., they are lost as soon as the cell is re-run, prohibiting reusability and reproducibility. To remedy this, we introduce Persist, a family of techniques to capture and apply interaction provenance to enable persistence of interactions. When interactions manipulate data, we make the transformed data available in dataframes that can be accessed in downstream code cells. We implement our approach as a JupyterLab extension that supports tracking interactions in Vega-Altair plots and in a data table view. Persist can re-execute the interaction provenance when a notebook or a cell is re-executed, enabling reproducibility and re-use.

We evaluated Persist in a user study targeting data manipulations with 11 participants skilled in Python and Pandas, comparing it to traditional code-based approaches. Participants were consistently faster with Persist, were able to correctly complete more tasks, and expressed a strong preference for Persist.

Contributing

Persist uses hatch to manage the development, build, and publish workflows. You can install hatch using pipx, pip, or Homebrew (on macOS or Linux).

pipx

Install hatch globally in an isolated environment. This is the recommended method.

pipx install hatch

pip

Install hatch in the current Python environment.

WARNING: This may change the system Python installation.

pip install hatch

Homebrew

brew install hatch

Jupyter extensions use a custom version of the yarn package manager called jlpm. When any relevant command is run, hatch should automatically install and set up jlpm. After installing hatch with your preferred method, follow the instructions below for the workflow you want. All commands are prefixed with hatch run to ensure they run in the proper environment.

Development

Run the setup script from package.json:

hatch run jlpm setup

When setup is complete, open three terminal windows and run one of the following commands in each.

Widgets

Start the Vite dev server to build the widgets:

hatch run watch_widgets

Extension

Start the dev server to watch and build the extension:

hatch run watch_extension

Lab

Run the JupyterLab server with the minimize flag set to false, which gives better stack traces and a better debugging experience:

hatch run run_lab

Build

To build the extension as a standalone Python package, run:

hatch run build_extension

Publish

To publish the extension, first set the new version. Run any of the following to bump the corresponding segment:

hatch version patch # x.x.1
hatch version minor # x.1.x
hatch version major # 1.x.x

You can also append a release candidate label:

hatch version rc

Finally, you can specify the exact version directly:

hatch version "1.3.0"
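
For readers unfamiliar with version segments, the arithmetic behind the patch, minor, and major bumps can be sketched in a few lines. This illustrates the usual major.minor.patch convention and is not hatch's implementation; `bump` is a hypothetical helper that handles only plain `X.Y.Z` versions (no release candidate suffix).

```python
def bump(version, segment):
    """Return the next version after bumping one segment of a plain 'X.Y.Z' string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if segment == "major":
        # A major bump resets the minor and patch segments.
        return f"{major + 1}.0.0"
    if segment == "minor":
        # A minor bump resets only the patch segment.
        return f"{major}.{minor + 1}.0"
    if segment == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown segment: {segment!r}")


if __name__ == "__main__":
    for segment in ("patch", "minor", "major"):
        print(segment, "->", bump("1.2.3", segment))
```

Starting from 1.2.3, a patch bump gives 1.2.4, a minor bump gives 1.3.0, and a major bump gives 2.0.0.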

Once the proper version is set, build the extension using the build workflow.

When the build is successful, you can publish the extension if you have proper authorization:

hatch publish

Acknowledgements

The widget architecture of Persist is built using the anywidget project.

The interactive visualizations used by Persist are based on the excellent Vega-Lite and Vega-Altair projects. Specifically, the implementation of the JupyterChart class in Vega-Altair was of great help in understanding how a Vega-Altair chart can be turned into a widget. We gratefully acknowledge funding from the National Science Foundation (IIS 1751238 and CNS 213756).
