draw a dataset from inside Jupyter

drawdata

"Just draw some data and get on with your day."

This small Python library contains Jupyter widgets that allow you to draw a dataset in a Jupyter notebook. This should be very useful when teaching machine learning algorithms.

The project uses anywidget under the hood, so our tools should work in Jupyter, VS Code and Colab. That also means that you get a proper widget that can interact with ipywidgets natively. For example, updating a drawing can trigger a new scikit-learn model to train.
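A minimal sketch of how such a hook could look, assuming the widget exposes its drawing as an observable `data` trait in the usual traitlets style. The trait name, the `latest` dictionary and the `"color"` key are assumptions made for illustration and are not confirmed by this page. A traitlets observer receives a change dictionary whose `"new"` entry holds the updated value.

```python
# Holds a small summary of the most recent drawing (illustrative name).
latest = {"n_points": 0, "colors": []}


def on_change(change):
    """Observer callback: traitlets passes a dict with the new value under "new"."""
    records = change["new"]
    latest["n_points"] = len(records)
    # Assumption: each record is a dict that carries a "color" key.
    latest["colors"] = sorted({rec["color"] for rec in records})


# In a notebook you would register the callback on the widget, for example:
#     widget.observe(on_change, names=["data"])   # assumes a `data` trait
# Here the callback is driven by hand so the sketch runs without a notebook.
on_change({"new": [{"x": 1.0, "y": 2.0, "color": "a"},
                   {"x": 3.0, "y": 1.5, "color": "b"}]})
print(latest)
```

Replacing the body of `on_change` with a call that refits a scikit-learn model would give the retrain-on-draw behaviour described above.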

You can really get creative with this in a notebook, so feel free to give it a spin!

Installation

Install the library via pip:

python -m pip install drawdata

To read the data, polars is useful, but this library also supports pandas:

python -m pip install pandas polars

Usage

You can load the scatter widget to start drawing immediately.

from drawdata import ScatterWidget

widget = ScatterWidget()
widget

If you want to use the dataset that you've just drawn you can do so via:

# Get the drawn data as a list of dictionaries
widget.data

# Get the drawn data as a dataframe
widget.data_as_pandas
widget.data_as_polars
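Because `widget.data` is a plain list of dictionaries, ordinary Python tooling is enough for quick checks. The sketch below uses a hand-written stand-in for `widget.data`; the `"x"`, `"y"` and `"color"` keys are an assumption about the record layout, so inspect your own `widget.data` before relying on them.

```python
from collections import defaultdict

# Hand-written stand-in for `widget.data`; the keys are an assumption.
records = [
    {"x": 0.5, "y": 1.0, "color": "a"},
    {"x": 1.5, "y": 2.0, "color": "a"},
    {"x": 4.0, "y": 0.5, "color": "b"},
]

# Group the (x, y) points by the color they were drawn with.
points_by_color = defaultdict(list)
for rec in records:
    points_by_color[rec["color"]].append((rec["x"], rec["y"]))

# Compute the mean x position per color as a quick sanity check.
mean_x = {
    color: sum(x for x, _ in points) / len(points)
    for color, points in points_by_color.items()
}
print(mean_x)
```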

If you're eager to do scikit-learn stuff with your drawn data you may appreciate this property instead:

X, y = widget.data_as_X_y

This property assumes that if you've used multiple colors you're interested in classification, and that if you've only drawn one color you're interested in regression. In the case of regression, y will refer to the y-axis.
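The rule above can be sketched in plain Python. This is an illustration of the stated behaviour, not the library's implementation: the function name `split_X_y`, the record keys, and the choice of features in each case (both coordinates for classification, only the x-axis for regression) are assumptions.

```python
def split_X_y(records):
    """Split drawn points into features X and target y (illustrative helper)."""
    colors = {rec["color"] for rec in records}
    if len(colors) > 1:
        # Several colors: treat it as classification.
        # Both coordinates are features, the color is the target.
        X = [[rec["x"], rec["y"]] for rec in records]
        y = [rec["color"] for rec in records]
    else:
        # One color (or no points): treat it as regression.
        # The x-axis is the single feature, the y-axis is the target.
        X = [[rec["x"]] for rec in records]
        y = [rec["y"] for rec in records]
    return X, y


two_colors = [
    {"x": 0.0, "y": 1.0, "color": "a"},
    {"x": 2.0, "y": 3.0, "color": "b"},
]
one_color = [
    {"x": 0.0, "y": 1.0, "color": "a"},
    {"x": 2.0, "y": 3.0, "color": "a"},
]

X_clf, y_clf = split_X_y(two_colors)
X_reg, y_reg = split_X_y(one_color)
print(X_clf, y_clf)
print(X_reg, y_reg)
```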

Shoutout

This project was originally part of my work over at calmcode labs but my employer probabl has been very supportive and has allowed me to work on this project during my working hours. This was super cool and I wanted to make sure I recognise them for it.





Old Features

The original implementation of our widget used an iframe to load a site in order to draw from a Jupyter notebook. This works, but it requires more manual effort, only works with pandas via the clipboard feature, and needs an internet connection.

It will be kept around, but the way forward for this library is to build on top of anywidget.

Old Feature Usage

When you run this from Jupyter, the drawing tool should load in an iframe.

from drawdata import draw_scatter

draw_scatter()

Once you're done drawing you can copy the data to the clipboard. After this you can use pandas to read the clipboard to get your drawn data into a dataframe.

import pandas as pd

# Parse the comma-separated data that was copied to the clipboard
pd.read_clipboard(sep=",")
