A library used within Acryl Agents to execute tasks

Acryl Executor

Remote execution agent used for running DataHub tasks, such as ingestion runs triggered through the UI.

python3 -m venv --upgrade-deps venv
source venv/bin/activate
pip3 install .

Notes

This library ships with a set of default task implementations:

RUN_INGEST Task

  • SubprocessProcessIngestionTask - Executes a metadata ingestion run by spinning off a subprocess. Supports ingesting from a particular version and with a specific plugin (based on the platform type requested).

  • InMemoryIngestionTask - Executes a metadata ingestion run using the datahub library in the same process. Not recommended for production, where dependency conflicts can arise when certain packages are loaded together. Use it for testing only, since it cannot check out a specific DataHub Metadata Ingestion plugin.
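The isolation idea behind the subprocess approach can be sketched as follows. This is a minimal illustration, not the executor's actual implementation: the helper names build_ingest_command and run_in_subprocess are hypothetical, and the sketch assumes the datahub CLI is on the PATH of whatever environment the child process runs in.

```python
import subprocess
from pathlib import Path


def build_ingest_command(recipe_path: Path) -> list[str]:
    """Build the CLI invocation for one ingestion run (hypothetical helper).

    Assumes the `datahub` CLI is installed in the environment that the
    child process will run in.
    """
    return ["datahub", "ingest", "-c", str(recipe_path)]


def run_in_subprocess(args: list[str], timeout: float = 3600.0) -> tuple[int, str]:
    """Run a command in a child process and return (exit code, combined output).

    Because the command runs in its own process, its dependencies never
    share an interpreter with the caller.
    """
    proc = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout for a single log stream
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout
```

A caller would pass the result of build_ingest_command to run_in_subprocess and treat a non-zero exit code as a failed run.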

Cloud Logging (S3)

The library includes one cloud logging client implementation, which the Acryl Executor uses to write logs to S3. To enable it, set the following environment variables:

  • DATAHUB_CLOUD_LOG_BUCKET - The S3 bucket to write logs to

  • DATAHUB_CLOUD_LOG_PATH - The S3 path to write logs to

The logs are archived with tar and compressed with gzip before being uploaded to S3 at the following path: s3://CLOUD_LOG_BUCKET/CLOUD_LOG_PATH/<pipeline_id>/year=/month=/day=/<run_id>/
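As a rough sketch of the two steps described above (archiving and building the destination prefix), the following uses only the Python standard library. The function names are hypothetical. The source does not say which values fill the year=, month= and day= segments, so this sketch simply takes them as caller-supplied arguments.

```python
import tarfile
from pathlib import Path


def build_log_prefix(bucket: str, path: str, pipeline_id: str, run_id: str,
                     year: str, month: str, day: str) -> str:
    """Assemble the S3 prefix for one run (hypothetical helper).

    The partition values are an assumption: the caller decides what goes
    after year=, month= and day=.
    """
    return (
        f"s3://{bucket}/{path.strip('/')}/{pipeline_id}/"
        f"year={year}/month={month}/day={day}/{run_id}/"
    )


def make_tgz(source_dir: Path, archive_path: Path) -> Path:
    """Pack a directory into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the directory name
        tar.add(source_dir, arcname=source_dir.name)
    return archive_path
```

The archive should be written outside source_dir so that it is not picked up while the directory is being packed.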

Local file cleanup after upload

When S3 upload is enabled, local log and artifact files can optionally be removed after a successful upload and replaced by a <filename>.s3 sentinel file containing:

{
  "s3_uri": "s3://bucket/path/executor-logs/executor-logs.tgz",
  "uploaded_at": "2024-01-15T10:30:00.000000+00:00",
  "original_size_bytes": 1234567
}

To enable cleanup, set DATAHUB_CLOUD_LOG_CLEANUP=true. Cleanup only takes effect when DATAHUB_CLOUD_LOG_BUCKET is set (no bucket → no upload → no cleanup) and only after each individual archive upload succeeds. A failed upload leaves the original file untouched.

Note: DATAHUB_CLOUD_LOG_CLEANUP requires DATAHUB_CLOUD_LOG_BUCKET to be set.
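The cleanup rules above can be sketched as follows. This is an illustrative outline under stated assumptions, not the library's code: cleanup_enabled and upload_then_cleanup are hypothetical names, the upload step is passed in as a callable so that no S3 client API is assumed, and comparing the flag case-insensitively against "true" is a guess at how the value is parsed.

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, Mapping


def cleanup_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Cleanup applies only when a bucket is configured and the flag is set."""
    bucket = env.get("DATAHUB_CLOUD_LOG_BUCKET")
    # Assumption: the flag is compared case-insensitively against "true".
    flag = env.get("DATAHUB_CLOUD_LOG_CLEANUP", "").strip().lower()
    return bool(bucket) and flag == "true"


def upload_then_cleanup(local_file: Path, s3_uri: str,
                        upload: Callable[[Path, str], None],
                        cleanup: bool) -> bool:
    """Upload one file; on success optionally replace it with a .s3 sentinel.

    `upload` is a caller-supplied function that raises on failure.
    Returns True if the upload succeeded.
    """
    size = local_file.stat().st_size
    try:
        upload(local_file, s3_uri)
    except Exception:
        return False  # failed upload: the original file is left untouched
    if cleanup:
        sentinel = local_file.with_name(local_file.name + ".s3")
        sentinel.write_text(json.dumps({
            "s3_uri": s3_uri,
            "uploaded_at": datetime.now(timezone.utc).isoformat(),
            "original_size_bytes": size,
        }, indent=2))
        local_file.unlink()  # remove the original only after the sentinel exists
    return True
```

Writing the sentinel before deleting the original means an interruption between the two steps leaves both files in place rather than neither.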
