A Toloka provider for Apache Airflow

Airflow Toloka Provider

This library lets you run Toloka crowdsourcing processes in Apache Airflow, a widely used workflow management system.

Here you can find a collection of ready-made Airflow tasks for the most frequently used actions in Toloka-Kit.

Getting started

$ pip install airflow-provider-toloka

A good way to start is to follow the example in this repo.

TolokaHook

TolokaHook retrieves your Toloka OAuth token and creates a TolokaClient with it. To get the TolokaClient, call the hook's get_conn() method.

To set up the Airflow Connection, create it in the Airflow Connections UI with the following parameters:

  • Conn ID: toloka_default
  • Conn Type: Toloka
  • Token: enter your OAuth token for Toloka. You can learn more about how to get it here.
  • Environment: enter production or sandbox

Tasks use the toloka_default connection ID by default. If needed, you can create additional Airflow Connections and select one by passing its ID as the task's toloka_conn_id argument.
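Before entering values in the Connections UI, it can help to sanity-check them. The sketch below is only an illustration: `TolokaConnectionSettings` is a hypothetical helper and is not part of this provider. It mirrors the fields listed above and rejects any environment other than production or sandbox.

```python
from dataclasses import dataclass

# Values taken from the connection parameters described above.
ALLOWED_ENVIRONMENTS = ("production", "sandbox")
DEFAULT_CONN_ID = "toloka_default"


@dataclass(frozen=True)
class TolokaConnectionSettings:
    """Hypothetical container for the Connections UI fields (not a provider class)."""

    token: str
    environment: str
    conn_id: str = DEFAULT_CONN_ID

    def __post_init__(self) -> None:
        # An empty token cannot authenticate, so fail early.
        if not self.token.strip():
            raise ValueError("Token must not be empty")
        # Only the two environments named in the connection parameters are accepted.
        if self.environment not in ALLOWED_ENVIRONMENTS:
            raise ValueError(
                f"Environment must be one of {ALLOWED_ENVIRONMENTS}, got {self.environment!r}"
            )


# Example: a second connection for sandbox experiments (the ID is made up).
sandbox_settings = TolokaConnectionSettings(
    token="<your OAuth token>",
    environment="sandbox",
    conn_id="toloka_sandbox",
)
```

The conn_id chosen here is the value you would later pass as toloka_conn_id.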

Tasks and Sensors

Several tasks and sensors provide an easy way to interact with Toloka from Airflow DAGs, including creating a project and a pool, adding tasks, and getting assignments. If you need something beyond the implemented tasks, you can write your own task using TolokaHook. Pull requests with such additions are welcome.

Check out our example to see tasks and sensors in action.
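As a rough sketch of what a custom task body might look like, the function below accepts any object with a get_conn() method (such as a TolokaHook) and checks the requester's balance through the returned client. The function name `has_enough_balance` and the `HookLike` protocol are hypothetical; the sketch assumes the client exposes Toloka-Kit's `get_requester()` call, whose result carries a `balance` field. How you import and construct TolokaHook depends on the installed provider version and is not shown here.

```python
from decimal import Decimal
from typing import Any, Protocol


class HookLike(Protocol):
    """Anything that hands out a client via get_conn(), e.g. a TolokaHook."""

    def get_conn(self) -> Any: ...


def has_enough_balance(hook: HookLike, minimum: Decimal) -> bool:
    """Return True if the requester's balance is at least `minimum`.

    Assumption: the object returned by hook.get_conn() behaves like
    Toloka-Kit's TolokaClient and provides get_requester().
    """
    client = hook.get_conn()
    requester = client.get_requester()
    return requester.balance >= minimum
```

Taking the hook as a parameter keeps the logic testable without an Airflow installation; inside a DAG you would wrap the function in an Airflow task and pass it a TolokaHook built with the desired toloka_conn_id.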

Useful Links

Questions and bug reports

License

© YANDEX LLC, 2022. Licensed under the Apache License, Version 2.0. See LICENSE file for more details.
