
Apache Airflow provider for Yandex Realty Partner API — collect call statistics


airflow-provider-yandex-realty

Apache Airflow provider for the Yandex Realty Partner API. It collects call statistics from Yandex Realty (Яндекс Недвижимость).




Installation

pip install airflow-provider-yandex-realty

Requires Python 3.10+ and apache-airflow>=2.9.1.

Connection

Create an Airflow connection of type HTTP (conn_id = yandex_realty_default by default).

  • Password — the OAuth token, stored without the OAuth prefix (the code adds it when building the Authorization: OAuth {token} header).
  • Extra — partner key, agency id, and account(s).

Multiple accounts

{
  "x_authorization": "Vertis public-partner-...",
  "agency_id": "417938",
  "accounts": [
    {"client_id": "103575674"},
    {"client_id": "67890"}
  ]
}

Single account

{
  "x_authorization": "Vertis public-partner-...",
  "agency_id": "417938",
  "client_id": "103575674"
}
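
As a rough illustration of how the Password and Extra fields could combine into request headers, here is a minimal sketch. The function name build_headers is hypothetical, and sending the partner key under an X-Authorization header is an assumption inferred from the x_authorization key name; only the Authorization: OAuth {token} format is stated above.

```python
import json


def build_headers(password: str, extra_json: str) -> dict[str, str]:
    """Hypothetical helper: turn connection fields into HTTP headers."""
    extra = json.loads(extra_json)
    return {
        # Stated in the README: the token is stored bare, the prefix is added in code.
        "Authorization": f"OAuth {password}",
        # Assumption: the partner key travels in an X-Authorization header.
        "X-Authorization": extra["x_authorization"],
    }


headers = build_headers(
    "my-token",
    '{"x_authorization": "Vertis public-partner-...", '
    '"agency_id": "417938", "client_id": "103575674"}',
)
print(headers["Authorization"])
```

Storing the token without its prefix keeps the secret itself in the Password field, which Airflow masks in logs and the UI.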

Which account a task processes is decided by the operator's account_id parameter — it is a selector into the connection, not a credential:

  • single-account — omit account_id (leave it None); the client_id is read from the connection's top-level client_id.
  • multi-account — you do not hardcode it. The DAG enumerates accounts from the connection with list_accounts() and passes each account_id automatically (it equals the sanitized client_id).

Usage

YandexRealtyCallsOperator collects calls for a date range, groups them by day, and writes one file per day. One task processes one account.
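
The "group by day, one file per day" step can be pictured with a small sketch. It assumes each call record is a dict carrying the date field listed under Output, and that date is a string such as "2024-05-01"; group_by_day is a hypothetical helper, not the operator's code.

```python
from collections import defaultdict


def group_by_day(calls: list[dict]) -> dict[str, list[dict]]:
    """Bucket call records by their `date` field, preserving input order."""
    by_day: dict[str, list[dict]] = defaultdict(list)
    for call in calls:
        by_day[call["date"]].append(call)
    return dict(by_day)


calls = [
    {"date": "2024-05-01", "call_duration": 42},
    {"date": "2024-05-02", "call_duration": 7},
    {"date": "2024-05-01", "call_duration": 13},
]
grouped = group_by_day(calls)
for day, day_calls in grouped.items():
    # In the operator, each of these buckets would become one output file.
    print(day, len(day_calls))
```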

Single account

Point the operator at the connection — nothing else is needed; the account is read from the connection's top-level client_id.

from airflow_provider_yandex_realty.operators.calls import YandexRealtyCallsOperator

collect = YandexRealtyCallsOperator(
    task_id="collect_calls",
    conn_id="yandex_realty_default",
    date_from="{{ ds }}",
    date_to="{{ ds }}",
    base_dir="/tmp/yandex_realty",
    output_format="json",      # "json" (NDJSON) or "csv"
    add_snapshot_ts=True,      # inject the DAG-run start timestamp (JSON only)
)

Multiple accounts

Don't hardcode account_id. Read the accounts from the connection with list_accounts() and fan out one task per account — account_id comes from the connection, not by hand:

from airflow_provider_yandex_realty.accounts import list_accounts
from airflow_provider_yandex_realty.operators.calls import YandexRealtyCallsOperator

for account in list_accounts("yandex_realty_default"):
    YandexRealtyCallsOperator(
        task_id=f"collect_{account.id}",
        conn_id="yandex_realty_default",
        account_id=account.id,     # selector into extra.accounts, taken from the connection
        date_from="{{ ds }}",
        date_to="{{ ds }}",
    )

date_from, date_to and conn_id are templated. execute returns a list of {date, path, snapshot_ts} entries (one per day written).
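
Because Airflow pushes an operator's return value to XCom by default, a downstream task can consume those entries. The sketch below assumes JSON (NDJSON) output and entries shaped as {date, path, snapshot_ts}; count_records is a hypothetical helper written for illustration.

```python
import json
from pathlib import Path


def count_records(entries: list[dict]) -> dict[str, int]:
    """Return the number of NDJSON records per day for the given entries."""
    counts: dict[str, int] = {}
    for entry in entries:
        with Path(entry["path"]).open(encoding="utf-8") as fh:
            # Each non-empty line is one JSON object; parsing validates it.
            records = [json.loads(line) for line in fh if line.strip()]
        counts[entry["date"]] = len(records)
    return counts
```

In a DAG this function could be the callable of a downstream task that pulls the collect task's return value from XCom.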

For a full multi-account example (collect → GCS → BigQuery + S3), see examples/bq_and_s3_multi_account_dag.py.

Output

  • Layout: {base_dir}/{account_id}/{safe_run_id}/{date}.{ext}. The account_id segment is omitted when account_id is None (single-account). Each path component is sanitized (characters matching [^\w-] are replaced with _) so untrusted values cannot escape base_dir. ext is json or csv.
  • JSON — newline-delimited JSON (NDJSON): one JSON object per line.
  • CSV — a header row plus one row per call, all fields quoted.
  • Columns — the 12 canonical CALL_FIELDS: call_datetime, date, object_name, incoming_phone, internal_phone, wait_duration, call_duration, revenue, object_type, campaign_tariff, client_tariff, is_targeted.
  • snapshot_ts — added to each record only when add_snapshot_ts=True, and only in JSON output; the CSV schema never gains a snapshot_ts column.
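
The layout and format rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the package's code: build_path, to_ndjson and to_csv are hypothetical helpers, sanitizing is assumed to replace each character matching [^\w-] with "_", and CALL_FIELDS is copied from the Columns bullet.

```python
import csv
import io
import json
import re
from pathlib import PurePosixPath

CALL_FIELDS = [
    "call_datetime", "date", "object_name", "incoming_phone", "internal_phone",
    "wait_duration", "call_duration", "revenue", "object_type",
    "campaign_tariff", "client_tariff", "is_targeted",
]


def sanitize(part: str) -> str:
    # Dots and slashes become "_", so "../" cannot climb out of base_dir.
    return re.sub(r"[^\w-]", "_", part)


def build_path(base_dir, account_id, run_id, date, ext) -> PurePosixPath:
    # account_id is skipped entirely in single-account mode (None).
    parts = [sanitize(p) for p in (account_id, run_id, date) if p is not None]
    *dirs, stem = parts
    return PurePosixPath(base_dir, *dirs, f"{stem}.{ext}")


def to_ndjson(records, snapshot_ts=None) -> str:
    lines = []
    for record in records:
        row = dict(record)
        if snapshot_ts is not None:  # mirrors add_snapshot_ts=True
            row["snapshot_ts"] = snapshot_ts
        lines.append(json.dumps(row, ensure_ascii=False))
    return "\n".join(lines) + "\n"


def to_csv(records) -> str:
    buf = io.StringIO()
    # QUOTE_ALL quotes every field; keys outside CALL_FIELDS are ignored,
    # so the CSV schema never gains a snapshot_ts column.
    writer = csv.DictWriter(
        buf, fieldnames=CALL_FIELDS, quoting=csv.QUOTE_ALL, extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


path = build_path("/tmp/yandex_realty", "../../etc", "manual__run", "2024-05-01", "json")
print(path)
```

Sanitizing each component separately, rather than the joined path, is what keeps a hostile account_id or run_id from introducing extra path separators.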

License

MIT
