Crowdom

Crowdom is a tool for simplifying data labeling.

Write plain Python code and launch data labeling without prior knowledge of crowdsourcing or the underlying platform (Crowdom uses Toloka as the platform for publishing tasks to workers). Define the task you are solving and load the source data in a few lines of code, choose a quality-cost-speed tradeoff in an interactive UI form, launch data labeling, and study the labeling results in Pandas dataframes.

Crowdom uses ʎzy, a cloud workflow runtime, to run the data labeling workflow. This provides reliability (automatic retries on errors, the ability to relaunch data labeling without losing progress) and out-of-the-box data persistence.

Quickstart

We recommend starting with the image classification example, since it demonstrates the full data labeling workflow proposed in Crowdom, with detailed explanations of each step.

The other examples show what data labeling with Crowdom looks like for different types of tasks.

To get the benefits of running on ʎzy, see the ʎzy setup example.

Join our Telegram chat if you want to learn more about Crowdom or discuss your task with us.

Types of tasks

Tasks in Crowdom are divided into two types:

  • Classification tasks, which have a fixed set of labels as output.
  • Annotation tasks, whose output space is effectively "unlimited".

In a typical classification task, a worker is asked to choose one of the predetermined options. Side-by-side (SbS) comparison is a special case of a classification task.

In an annotation task, there may be many potential solutions, and more than one of them may be correct. Speech transcription and image annotation are examples of annotation tasks.
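The difference between the two task types can be sketched in plain Python. This is only an illustration using the standard library, not Crowdom's API: the names `Animal`, `SbSChoice`, `Transcript`, and `is_valid_classification` are hypothetical and do not come from the package.

```python
from dataclasses import dataclass
from enum import Enum


# Classification: the output is one label from a fixed, predetermined set.
# (Hypothetical label set, for illustration only.)
class Animal(Enum):
    DOG = "dog"
    CAT = "cat"
    OTHER = "other"


# Side-by-side comparison is a classification task whose label set
# is "which of the two shown options is better".
class SbSChoice(Enum):
    LEFT = "left"
    RIGHT = "right"


# Annotation: the output is free-form, so the set of possible answers
# is not enumerable and several different answers may all be correct.
@dataclass(frozen=True)
class Transcript:
    text: str


def is_valid_classification(answer: str, label_set) -> bool:
    """Check that an answer belongs to the fixed label set of a classification task."""
    return answer in {label.value for label in label_set}


print(is_valid_classification("dog", Animal))      # answer from the fixed set
print(is_valid_classification("bird", Animal))     # answer outside the fixed set
print(Transcript("hello world"))                   # any text is a possible annotation
```

A classification answer can be validated against the label set up front, while an annotation such as a transcript can only be judged for correctness after it has been produced.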

Examples

The following table lists examples that demonstrate data labeling for different types of tasks, as well as other aspects of the data labeling workflow.

Examples are provided as .ipynb files located in this repository, but are displayed with nbviewer, which renders them more accurately than GitHub.

The image classification and audio transcript examples also have .html versions. These examples present the full labeling workflow for the classification and annotation types of tasks, respectively. The .html version lets you collapse optional sections of the notebook to make the main steps of the workflow easier to follow, and it displays the contents of interactive widgets (for example, the quality-cost-speed tradeoff interactive form).

Example | Full workflow | Function | Data types | Additionally
--- | --- | --- | --- | ---
Image classification (HTML) | Yes | Classification | Image |
Audio transcript (HTML) | Yes | Annotation | Audio, Text |
Audio transcripts SbS | | SbS | Audio, Text |
Voice recording | | Annotation | Text, Audio | Media output, checking annotations by the ML model
Audio transcript, extended | | Annotation | Text, Audio | Custom task UI, custom task duration calculation, first annotation attempts by the ML model
MOS | | Classification | Audio | MOS algorithm usage example
Audio questions | | Classification | Audio | Output label set depending on the input data
Experts registration | | | | Registration of your private expert workforce
Task update | | | | Task update (instructions, UI, etc.)
ʎzy usage | | | | ʎzy setup, parallel labelings
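For readers unfamiliar with the MOS (mean opinion score) task listed in the table, the sketch below shows the textbook definition: workers rate each item on a 1 to 5 scale, and the score is the mean rating. It does not reproduce the MOS algorithm implemented in Crowdom; the function name `mean_opinion_score`, the 1 to 5 scale, and the normal-approximation confidence interval are assumptions made for illustration.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_opinion_score(ratings: Sequence[int]) -> Tuple[float, float]:
    """Return (MOS, half-width of an approximate 95% confidence interval).

    Assumes ratings are integers on a 1..5 scale (an illustrative assumption).
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to estimate the interval")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be on the 1..5 scale")
    mos = float(statistics.mean(ratings))
    # Normal approximation: 1.96 standard errors of the mean.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings given by five workers to one audio sample.
ratings = [4, 5, 3, 4, 4]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With only a handful of ratings per item the interval is wide, which is one reason a MOS task is usually labeled by several workers per item.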

Communication

Join our communities if you have questions about Crowdom or want to discuss your data labeling task.
