Label data at scale. Fun and precision included.
Explore and label on a map of raw data.
Get enough to feed your model in no time.
`hover` speeds up data labeling through embedding + visualization + callbacks.
- You just need raw data and an embedding to start.
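As a rough illustration of what "raw data and an embedding" can look like, here is a minimal, standard-library-only sketch. It does not use the `hover` API: `toy_vectorizer` and `project_2d` are hypothetical stand-ins, and a real setup would more likely pair a proper vectorizer with a dimensionality reduction method such as UMAP.

```python
import hashlib
import math


def toy_vectorizer(text, dim=16):
    """Hash each token into one of `dim` buckets (hypothetical stand-in for a real vectorizer)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    # Normalize to unit length; fall back to 1.0 to avoid dividing by zero on empty text.
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def project_2d(vec):
    """Fixed linear projection to 2D (an assumption; real embeddings come from dimensionality reduction)."""
    x = sum(v * math.cos(i) for i, v in enumerate(vec))
    y = sum(v * math.sin(i) for i, v in enumerate(vec))
    return x, y


# Raw, unlabeled data: each record gets 2D coordinates and an empty label slot.
raw_texts = [
    "the movie was great",
    "the movie was terrible",
    "stock prices fell today",
]
points = []
for text in raw_texts:
    x, y = project_2d(toy_vectorizer(text))
    points.append({"text": text, "x": x, "y": y, "label": None})
```

Each record ends up with a position on the map and nothing else, which is the starting state for bulk labeling.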
:sparkles: Features
It's fast because it labels in bulk.
:telescope: A 2D-embedded view of your dataset for labeling, equipped with
- Tooltip for each point and table view for groups of points.
- Search widgets for ad-hoc highlighting of data matching search criteria.
- Toggle buttons that clearly distinguish data subsets ("raw"/"train"/"dev"/"test").
It's accurate because you can filter and extend.
:microscope: Supplementary views to provide further labeling precision, such as
- Advanced search view which can filter points by search criteria and provides stronger highlighting.
- Active learning view which puts a model in the loop and can filter by confidence score.
- Function-based view which can leverage custom functions for labeling and filtering.
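To make the idea of filtering by confidence score concrete, here is a small sketch in plain Python. It is not the `hover` implementation: the `filter_by_confidence` helper, the record layout, and the scores are all hypothetical, chosen only to show how a model in the loop can narrow attention to uncertain points.

```python
def filter_by_confidence(points, low=0.0, high=1.0):
    """Keep points whose model confidence lies within [low, high] (hypothetical helper)."""
    return [p for p in points if low <= p["confidence"] <= high]


# Made-up model predictions for illustration only.
points = [
    {"text": "great film", "predicted": "positive", "confidence": 0.97},
    {"text": "not bad, not good", "predicted": "positive", "confidence": 0.52},
    {"text": "awful plot", "predicted": "negative", "confidence": 0.91},
]

# Low-confidence points are often the most valuable ones to label by hand.
uncertain = filter_by_confidence(points, high=0.6)
```

Labeling the uncertain points first and retraining is the usual active learning loop; the thresholds here are arbitrary.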
It's fun because the process never gets old.
- Explore the map to find out which "zones" are easy and which ones are tricky.
- Join the conquest of your data by coloring all of those zones through wisdom!
Check out @phurwicz/hover-binder for a list of demo apps.
:rocket: Quickstart
Code + Walkthrough -> Labeling App
- edit & run code right in your browser, with guides along the way.
Jump to Labeling App
- interactive plot for labeling data, pre-built and hosted on Binder.
:package: Install
Python: 3.7+
OS: Linux & Mac & Windows
PyPI (for all releases): `pip install hover`
Conda-forge (for 0.6.0 and above): `conda install -c conda-forge hover`
For Windows users, we recommend Windows Subsystem for Linux.
- On Windows itself you will need C++ build tools for dependencies.
:book: Resources
:flags: Project News
- Feb 25, 2022: version 0.7.0 is now available. Check out the changelog for details :partying_face:. Some tl;dr for the impatient:
  - audio and image support: supply audio/image files through URLs to label with `hover`! Any type supported by HTML (and your browser) will be supported here.
  - high-dimensional support: you can now use higher-than-2D embeddings. `hover` still plots in 2D, but you can dynamically choose which two dimensions to use.
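The idea of plotting a higher-dimensional embedding two dimensions at a time can be sketched as follows. This is an illustration in plain Python, not the `hover` API: `select_dimensions` and the sample vectors are hypothetical.

```python
def select_dimensions(embeddings, dim_x=0, dim_y=1):
    """Pick two coordinates from each embedding vector to use as plot axes (hypothetical helper)."""
    coords = []
    for vec in embeddings:
        if min(dim_x, dim_y) < 0 or max(dim_x, dim_y) >= len(vec):
            raise IndexError("requested dimension is out of range for this embedding")
        coords.append((vec[dim_x], vec[dim_y]))
    return coords


# Made-up 3D embeddings for two data points.
embeddings = [(0.1, 0.5, -0.3), (0.7, -0.2, 0.4)]

view_a = select_dimensions(embeddings, 0, 1)  # plot dimension 0 against dimension 1
view_b = select_dimensions(embeddings, 0, 2)  # switch the vertical axis to dimension 2
```

Switching the pair of dimensions changes the 2D view without recomputing the embedding itself.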
:bell: Remarks
Shoutouts
- Thanks to Bokeh, because `hover` would not exist without linked plots and callbacks, or be nearly as good without embeddable server apps.
- Thanks to Philip Vollet for sharing `hover` with the community even when it was really green.
Contributing
- All feedback is welcome, especially about what you find lacking and want fixed!
- `./requirements-dev.txt` lists required packages for development.
- Pull requests are advised to use a superset of the pre-commit hooks listed in `.pre-commit-config.yaml`.
Citation
If you have found `hover` useful to your work, please let us know :hugs:
@misc{hover,
title={{hover}: label data at scale},
url={https://github.com/phurwicz/hover},
note={Open software from https://github.com/phurwicz/hover},
author={
Pavel Hurwicz and
Haochuan Wei},
year={2021},
}