A library used within Acryl Agents to execute tasks

Acryl Executor

Remote execution agent used for running DataHub tasks, such as ingestion powered through the UI.

python3 -m venv --upgrade-deps venv
source venv/bin/activate
pip3 install .

Notes

This library ships with the following default task implementations:

RUN_INGEST Task

  • SubprocessProcessIngestionTask - Executes a metadata ingestion run in a separate subprocess. Supports ingesting from a particular version and with a specific plugin (based on the platform type requested).

  • InMemoryIngestionTask - Executes a metadata ingestion run using the datahub library in the same process. Not recommended for production, where packages loaded together in one process can cause hard-to-diagnose dependency conflicts. Use it for testing only; it cannot check out a specific DataHub Metadata Ingestion plugin.
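
The difference between the two task styles can be illustrated with a minimal sketch. It does not use the executor's classes: run_in_subprocess and run_in_process are hypothetical helpers, and the marker attribute is invented for the demonstration. The sketch only shows that a child interpreter does not inherit state from the parent process, which is the property that lets a subprocess-based task avoid conflicts with whatever the parent has loaded.

```python
import contextlib
import io
import subprocess
import sys

# Hypothetical probe: reports whether a marker attribute is visible on the sys module.
PROBE = "import sys; print(hasattr(sys, 'demo_marker'))"


def run_in_subprocess(snippet: str) -> str:
    """Run a Python snippet in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def run_in_process(snippet: str) -> str:
    """Run the same snippet inside the current interpreter, capturing stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(snippet, {})
    return buffer.getvalue().strip()


# State set in the parent process (illustrative only).
sys.demo_marker = True

in_process_result = run_in_process(PROBE)     # shares the parent's modules and state
subprocess_result = run_in_subprocess(PROBE)  # starts from a clean interpreter
```

The in-process run sees the marker, while the subprocess run does not. The same separation applies to imported package versions, at the cost of starting a new interpreter for each run.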

Cloud Logging (S3)

The library includes one cloud logging client, which the Acryl Executor uses to write logs to S3. To enable it, set the following environment variables:

  • DATAHUB_CLOUD_LOG_BUCKET - The S3 bucket to write logs to.
  • DATAHUB_CLOUD_LOG_PATH - The S3 path to write logs to.
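
As a rough illustration of how a caller might check this configuration, the sketch below reads the two variables from a mapping. The helper name read_cloud_log_settings is hypothetical and not part of the library, and treating a missing path as an empty prefix is an assumption.

```python
import os
from typing import Mapping, Optional, Tuple


def read_cloud_log_settings(
    env: Mapping[str, str] = os.environ,
) -> Optional[Tuple[str, str]]:
    """Return (bucket, path) when S3 logging is configured, else None."""
    bucket = env.get("DATAHUB_CLOUD_LOG_BUCKET", "").strip()
    if not bucket:
        return None  # no bucket configured: treat S3 logging as disabled
    # Assumption: a missing path is treated as an empty prefix,
    # and surrounding slashes are dropped so keys join cleanly.
    path = env.get("DATAHUB_CLOUD_LOG_PATH", "").strip().strip("/")
    return bucket, path
```

Passing the environment in as a mapping keeps the helper easy to exercise in tests without modifying os.environ.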

The logs are archived with tar and compressed with gzip before being uploaded to S3 at the following path: s3://CLOUD_LOG_BUCKET/CLOUD_LOG_PATH/<pipeline_id>/year=/month=/day=/<run_id>/
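
The archive step and the path layout can be sketched with the standard library alone. This is not the executor's code: make_tgz and build_log_prefix are hypothetical helpers, and the sketch assumes that the year=, month= and day= partitions hold the zero-padded components of the run date, which the layout above does not spell out. No upload is performed.

```python
import tarfile
import tempfile
from datetime import date
from pathlib import Path


def make_tgz(source_dir: Path, archive_path: Path) -> Path:
    """Bundle a directory into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source_dir, arcname=source_dir.name)
    return archive_path


def build_log_prefix(
    bucket: str, path: str, pipeline_id: str, run_id: str, run_date: date
) -> str:
    """Assemble the destination prefix in the documented layout.

    Assumption: partition values are the zero-padded run date components.
    """
    return (
        f"s3://{bucket}/{path}/{pipeline_id}/"
        f"year={run_date.year}/month={run_date.month:02d}/day={run_date.day:02d}/"
        f"{run_id}/"
    )


# Usage: archive a throwaway log directory and list what went into it.
with tempfile.TemporaryDirectory() as tmp:
    logs = Path(tmp) / "executor-logs"
    logs.mkdir()
    (logs / "run.log").write_text("ingestion finished\n")
    archive = make_tgz(logs, Path(tmp) / "executor-logs.tgz")
    with tarfile.open(archive, "r:gz") as tar:
        archived_names = sorted(tar.getnames())

example_prefix = build_log_prefix("my-bucket", "logs", "pipeline-1", "run-1", date(2024, 1, 15))
```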

Local file cleanup after upload

When S3 upload is enabled, local log and artifact files can optionally be removed after a successful upload and replaced by a <filename>.s3 sentinel file containing:

{
  "s3_uri": "s3://bucket/path/executor-logs/executor-logs.tgz",
  "uploaded_at": "2024-01-15T10:30:00.000000+00:00",
  "original_size_bytes": 1234567
}

To enable cleanup, set DATAHUB_CLOUD_LOG_CLEANUP=true. Cleanup only takes effect when DATAHUB_CLOUD_LOG_BUCKET is set (no bucket → no upload → no cleanup) and only after each individual archive upload succeeds. A failed upload leaves the original file untouched.

Note: DATAHUB_CLOUD_LOG_CLEANUP requires DATAHUB_CLOUD_LOG_BUCKET to be set.
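
The cleanup rules above can be sketched as follows. The helper names cleanup_enabled and replace_with_sentinel are hypothetical, the case-insensitive comparison of the flag against "true" is an assumption, and the upload itself is represented only by a boolean result.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Mapping, Optional


def cleanup_enabled(env: Mapping[str, str]) -> bool:
    """Cleanup needs both the flag and a bucket (no bucket, no upload, no cleanup)."""
    # Assumption: the flag is compared case-insensitively against "true".
    flag = env.get("DATAHUB_CLOUD_LOG_CLEANUP", "").strip().lower() == "true"
    return flag and bool(env.get("DATAHUB_CLOUD_LOG_BUCKET", "").strip())


def replace_with_sentinel(
    local_file: Path, s3_uri: str, upload_succeeded: bool
) -> Optional[Path]:
    """Swap an uploaded file for a <filename>.s3 sentinel; leave it alone on failure."""
    if not upload_succeeded:
        return None  # a failed upload leaves the original file untouched
    sentinel = local_file.with_name(local_file.name + ".s3")
    payload = {
        "s3_uri": s3_uri,
        "uploaded_at": datetime.now(timezone.utc).isoformat(timespec="microseconds"),
        "original_size_bytes": local_file.stat().st_size,
    }
    sentinel.write_text(json.dumps(payload, indent=2))
    local_file.unlink()  # remove the original only after the sentinel is written
    return sentinel


# Usage: a failed upload keeps the file, a successful one replaces it.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "executor-logs.tgz"
    archive.write_bytes(b"x" * 10)
    uri = "s3://bucket/path/executor-logs/executor-logs.tgz"

    replace_with_sentinel(archive, uri, upload_succeeded=False)
    kept_after_failure = archive.exists()

    sentinel = replace_with_sentinel(archive, uri, upload_succeeded=True)
    removed_after_success = not archive.exists()
    sentinel_name = sentinel.name
    sentinel_payload = json.loads(sentinel.read_text())
```

Writing the sentinel before deleting the original means an interruption between the two steps leaves both files in place rather than neither.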
