openclean Notebook UI Package

About

This package provides a graphical user interface for openclean that can be used to visualize and manipulate datasets in notebook environments like Jupyter Notebooks.

Installation

The package can be installed using pip.

pip install openclean-notebook

You can use the additional [jupyter] option to install the Python Jupyter package if you want to use the UI within a Jupyter Notebook.

pip install openclean-notebook[jupyter]

The notebook UI is a JavaScript bundle that is included in the installed package.
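To confirm that the installation succeeded before opening a notebook, a check along the following lines can help. This is a minimal sketch that uses only the Python standard library; the helper name installed_version is hypothetical and not part of openclean. The distribution name is taken from the pip command above and the module name from the import used in the Usage section.

```python
import importlib.metadata
import importlib.util


def installed_version(dist_name, module_name):
    """Return the installed version string, or None if the package is missing.

    Hypothetical helper for illustration; not part of openclean.
    """
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        return None
    try:
        return importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        return None


version = installed_version("openclean-notebook", "openclean_notebook")
print("openclean-notebook:", version or "not installed")
```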

Usage

To use the notebook UI, an instance of openclean_notebook.engine.OpencleanAPI is required. The API engine provides a namespace that manages a set of datasets, each identified by a unique name. The engine is associated with an object repository that provides additional functionality for registering objects such as functions and lookup tables. The engine also coordinates the communication with the JavaScript UI.
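The idea of a namespace of uniquely named datasets paired with an object repository can be pictured with a toy model. The sketch below is purely conceptual: the class DatasetNamespace and its methods are hypothetical and do not reflect how OpencleanAPI is implemented. It only shows the two responsibilities described above: datasets keyed by unique names, and registered objects (here, a function) that can be looked up by name and run on a dataset.

```python
class DatasetNamespace:
    """Toy stand-in for a namespace of uniquely named datasets.

    Hypothetical illustration only; this is not the openclean implementation.
    """

    def __init__(self):
        self._datasets = {}  # dataset name -> list of row dictionaries
        self._objects = {}   # object name -> registered object (e.g., function)

    def add_dataset(self, name, rows):
        # Enforce that every dataset name is unique within the namespace.
        if name in self._datasets:
            raise ValueError(f"dataset '{name}' already exists")
        self._datasets[name] = [dict(row) for row in rows]

    def get_dataset(self, name):
        return self._datasets[name]

    def register(self, name, obj):
        # The "object repository": store functions, lookup tables, etc.
        self._objects[name] = obj

    def apply(self, dataset, func_name, column):
        # Look up a registered function and run it on one column in place.
        func = self._objects[func_name]
        for row in self._datasets[dataset]:
            row[column] = func(row[column])


ns = DatasetNamespace()
ns.add_dataset("covid-cases", [{"state": " ny "}, {"state": "nj"}])
ns.register("strip_upper", lambda value: value.strip().upper())
ns.apply("covid-cases", "strip_upper", "state")
print(ns.get_dataset("covid-cases"))
```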

A helper function to create an instance of the openclean API is included in the openclean_notebook package. For example:

from openclean_notebook import DB
db = DB(basedir='.openclean', create=True)

In this example a new instance of the API engine is created that stores all dataset files in a local folder .openclean. The create=True flag ensures that a fresh instance is created every time the code (cell) is run.

The next step is to create a new dataset in the API, e.g., from a given data frame or data file. Each dataset has to have a unique name.

db.load_dataset(source='./data/bre9-aqqr.tsv.gz', name='covid-cases')

You can then view and edit either the full dataset or, e.g., for performance reasons, a sample of it in the notebook UI. The recipe that is created from the interactions in the notebook UI can later be applied to the full dataset. In the example below we use a sample of 100 rows for display in the notebook UI.

db.edit('covid-cases', n=100)
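The sample-then-apply workflow can be illustrated without openclean at all. In the following sketch a recipe is modeled as a list of (column, function) steps, which is an assumption made for illustration; the helper names sample_rows and apply_recipe are hypothetical, and the source does not state how openclean draws its sample or stores its recipes. The point is only that steps worked out on 100 rows can be replayed unchanged on every row.

```python
import random


def sample_rows(rows, n, seed=42):
    """Return up to n randomly chosen rows (hypothetical helper)."""
    if n >= len(rows):
        return list(rows)
    return random.Random(seed).sample(rows, n)


def apply_recipe(rows, recipe):
    """Apply each (column, function) step to copies of the given rows."""
    result = [dict(row) for row in rows]  # leave the input rows untouched
    for column, func in recipe:
        for row in result:
            row[column] = func(row[column])
    return result


# A made-up dataset with untrimmed strings and numbers stored as text.
full = [{"state": f" s{i} ", "cases": str(i)} for i in range(1000)]

# Develop the recipe on a small sample ...
sample = sample_rows(full, n=100)
recipe = [("state", str.strip), ("cases", int)]
preview = apply_recipe(sample, recipe)

# ... then replay the same steps on the full dataset.
cleaned = apply_recipe(full, recipe)
print(len(preview), len(cleaned))
```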

For a full example please have a look at the example notebook that also shows how to register and run commands on the dataset.
