A library used within Acryl Agents to execute tasks

Acryl Executor

Remote execution agent used for running DataHub tasks, such as ingestion runs triggered from the UI.

python3 -m venv --upgrade-deps venv
source venv/bin/activate
pip3 install .

Notes

This library ships with the following default task implementations:

RUN_INGEST Task

  • SubprocessProcessIngestionTask - Executes a metadata ingestion run in a separate subprocess. Supports ingesting with a particular version and with a specific plugin (based on the platform type requested).

  • InMemoryIngestionTask - Executes a metadata ingestion run using the datahub library in the same process. Not recommended for production, where running certain packages together can cause dependency conflicts. Use this for testing, as it cannot check out a specific DataHub Metadata Ingestion plugin.
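The subprocess approach can be pictured as a short sequence of commands run in an isolated virtual environment. The sketch below is illustrative only and is not the executor's actual implementation: the function name, the one-venv-per-run layout, and the use of the `acryl-datahub[<plugin>]` requirement and the `datahub ingest run -c` command are assumptions made for the example.

```python
import sys
from pathlib import Path
from typing import List, Optional


def build_ingestion_commands(
    venv_dir: Path, recipe_path: Path, plugin: str, version: Optional[str] = None
) -> List[List[str]]:
    """Return the commands a subprocess-based run might execute, in order.

    Hypothetical helper: it only builds the argument lists, it runs nothing.
    """
    bin_dir = venv_dir / ("Scripts" if sys.platform == "win32" else "bin")
    # Assumption: the plugin is installed as an extra of the acryl-datahub package.
    requirement = f"acryl-datahub[{plugin}]"
    if version:
        requirement += f"=={version}"
    return [
        # 1. Create an isolated environment so dependencies cannot clash.
        [sys.executable, "-m", "venv", str(venv_dir)],
        # 2. Install the requested plugin (and version) into that environment.
        [str(bin_dir / "pip"), "install", requirement],
        # 3. Run the ingestion recipe with the environment's own CLI (assumed command).
        [str(bin_dir / "datahub"), "ingest", "run", "-c", str(recipe_path)],
    ]
```

Each list could be passed to `subprocess.run(cmd, check=True)`. Because every run gets its own interpreter and packages, a conflict in one plugin's dependencies cannot affect the executor process, which is the isolation the in-process task lacks.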

Cloud Logging (S3)

A single cloud logging client is included, which the Acryl Executor uses to write logs to S3. To enable it, set the following environment variables:

  • DATAHUB_CLOUD_LOG_BUCKET - The S3 bucket to write logs to

  • DATAHUB_CLOUD_LOG_PATH - The S3 path to write logs to

The logs are archived with tar, compressed with gzip, and then uploaded to S3 at the following path: s3://CLOUD_LOG_BUCKET/CLOUD_LOG_PATH/<pipeline_id>/year=/month=/day=/<run_id>/
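As a rough illustration of the archive step and the destination layout, here is a minimal sketch using only the Python standard library. The function names are hypothetical, the upload itself is left out, and the example assumes that the year=, month= and day= segments carry the zero-padded run date, which the path above does not spell out.

```python
import tarfile
from datetime import date
from pathlib import Path


def build_log_uri(bucket: str, path: str, pipeline_id: str, run_id: str, day: date) -> str:
    """Build the destination prefix for one run (hypothetical helper)."""
    # Assumption: the partition segments hold the zero-padded date of the run.
    return (
        f"s3://{bucket}/{path.strip('/')}/{pipeline_id}/"
        f"year={day.year}/month={day.month:02d}/day={day.day:02d}/{run_id}/"
    )


def make_tgz(source_dir: Path, archive_path: Path) -> Path:
    """Pack source_dir into a gzip-compressed tar archive."""
    # Mode "w:gz" writes the tar stream through gzip in a single pass.
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source_dir, arcname=source_dir.name)
    return archive_path
```

The resulting archive would then be uploaded under the prefix returned by `build_log_uri`, for example with an S3 client of your choice.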

Local file cleanup after upload

When S3 upload is enabled, local log and artifact files can optionally be removed after a successful upload and replaced by a <filename>.s3 sentinel file containing:

{
  "s3_uri": "s3://bucket/path/executor-logs/executor-logs.tgz",
  "uploaded_at": "2024-01-15T10:30:00.000000+00:00",
  "original_size_bytes": 1234567
}
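A sentinel with these three fields could be produced along the following lines. This is a sketch under stated assumptions rather than the library's code: `replace_with_sentinel` is a hypothetical name, and the caller is assumed to invoke it only after the upload of `local_file` has succeeded.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def replace_with_sentinel(local_file: Path, s3_uri: str) -> Path:
    """Delete local_file and leave a <filename>.s3 sentinel next to it."""
    sentinel = local_file.with_name(local_file.name + ".s3")
    payload = {
        "s3_uri": s3_uri,
        # UTC timestamp in ISO 8601 form with microseconds and offset.
        "uploaded_at": datetime.now(timezone.utc).isoformat(timespec="microseconds"),
        "original_size_bytes": local_file.stat().st_size,
    }
    # Write the sentinel before deleting, so a crash in between
    # leaves the original file in place rather than nothing at all.
    sentinel.write_text(json.dumps(payload, indent=2))
    local_file.unlink()
    return sentinel
```

Reading the sentinel back with `json.loads(sentinel.read_text())` gives a tool enough information to locate the uploaded archive and to report how much local disk space was freed.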

To enable cleanup, set DATAHUB_CLOUD_LOG_CLEANUP=true. Cleanup only takes effect when DATAHUB_CLOUD_LOG_BUCKET is set (no bucket → no upload → no cleanup) and only after each individual archive upload succeeds. A failed upload leaves the original file untouched.

Note: DATAHUB_CLOUD_LOG_CLEANUP requires DATAHUB_CLOUD_LOG_BUCKET to be set.
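The conditions above reduce to a small decision function. The sketch below is only one way to express them: the function names are hypothetical, and treating the flag as enabled only for the case-insensitive value "true" is an assumption, since the accepted spellings are not listed here.

```python
from typing import Mapping


def cleanup_enabled(env: Mapping[str, str]) -> bool:
    """Return True when cleanup is requested and an upload bucket is configured."""
    bucket = env.get("DATAHUB_CLOUD_LOG_BUCKET", "").strip()
    flag = env.get("DATAHUB_CLOUD_LOG_CLEANUP", "").strip().lower()
    # Assumption: only the literal value "true" switches cleanup on.
    return bool(bucket) and flag == "true"


def should_remove_local_file(env: Mapping[str, str], upload_succeeded: bool) -> bool:
    """Decide, per archive, whether the local copy may be deleted."""
    # A failed upload always leaves the original file untouched.
    return cleanup_enabled(env) and upload_succeeded
```

In practice `env` would be `os.environ`. Passing it in as a mapping keeps the check easy to test without modifying the process environment.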

Download files

Source Distribution

acryl_executor-0.3.16.tar.gz (54.4 kB view details)

Uploaded Source

Built Distribution

acryl_executor-0.3.16-py3-none-any.whl (76.2 kB view details)

Uploaded Python 3

File details

Details for the file acryl_executor-0.3.16.tar.gz.

File metadata

  • Download URL: acryl_executor-0.3.16.tar.gz
  • Upload date:
  • Size: 54.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.20

File hashes

Hashes for acryl_executor-0.3.16.tar.gz
Algorithm Hash digest
SHA256 cb91fb9a5bd283a588c66498a7bf5274564b76dc70733b6cc439aefadc86261f
MD5 4e1d0fe0d69f9c34ee072d20789ac35b
BLAKE2b-256 ec69927146c17b52b2b02c279d559c3baa42840ee0136ee3948a6ef0969d6878

File details

Details for the file acryl_executor-0.3.16-py3-none-any.whl.

File hashes

Hashes for acryl_executor-0.3.16-py3-none-any.whl
Algorithm Hash digest
SHA256 f3bca68d8dd94ea026e2ba83ebcf1e5e2497ce2fb1cc4ebfe53b07a926330b4a
MD5 56dde3dd115d0979df2a33c517da1921
BLAKE2b-256 f48e3989601e93595f3672c6a50d5deffe006159e065fc58277a41ae57bf8857
