Taurus Data Jobs
Reason this release was yanked:
accidentally uploaded
Project description
The Taurus Data Jobs internal and Control Plane API. Data Jobs allows Data Engineers to implement automated pull ingestion (E in ELT) and batch data transformation into a Data Warehouse (T in ELT). See also https://confluence.eng.vmware.com/display/SUPCR/Data+Pipelines+User+Guide

The API has resource-oriented URLs, JSON-encoded responses, and uses standard HTTP response codes, authentication, and verbs. The API enables creating, deploying, managing, and executing Data Jobs in a runtime environment.

![](https://confluence.eng.vmware.com/rest/gliffy/1.0/embeddedDiagrams/c0a4fed2-0229-4baa-b192-c8ae6ad6ad28.png)

The API reflects the usual Data Job development lifecycle:

- Create a new data job (a webhook can further configure the job, e.g. authorize its creation, set up permissions, etc.).
- Download the keytab. Develop and run the data job locally.
- Deploy the data job in the cloud runtime environment to run on a scheduled basis.

If authentication is required, pass an OAuth2 access token in the HTTP header 'Authorization: Bearer [access-token-here]'.

The API promotes some best practices (inspired by https://12factor.net):

- Explicitly declare and isolate dependencies.
- Strict separation of config from code. Config varies substantially across deploys; code does not.
- Separation between the build, release/deploy, and run stages.
- Data Jobs are stateless, share-nothing processes. Any data that needs to persist must be stored in a stateful backing service (e.g. IProperties).
- The implementation is assumed to be atomic and idempotent: it should be OK for a job to fail somewhere in the middle, and a subsequent restart should not cause data corruption.
- Keep development, staging, and production as similar as possible.

**API Evolution**

In the following sections, some terms have a special meaning in the context of the APIs. Their definitions are re-used from https://confluence.eng.vmware.com/display/SUPCR/API+Evolution+Cycle

- *Stable* - The implementation of the API has been battle-tested (it has been in production for some time, probably at least a month). The API is subject to the semantic versioning model and follows the deprecation policy: https://confluence.eng.vmware.com/display/Standards/API+Deprecation+Policy
- *Experimental* - May disappear without notice and is not subject to semantic versioning. The implementation is not considered stable nor well tested. Generally this is given to clients to experiment with in a testing environment. Must not be used in production.
- *Deprecated* - The API is expected to be removed within the next one or two major version upgrades. The deprecation notice/comment will say when the API will be removed and what alternatives should be used instead.
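The Bearer-token convention described above can be sketched as follows. This is a minimal illustration using the Python standard library; the base URL, team name, and endpoint path are hypothetical assumptions for demonstration, not documented routes of this API.

```python
import urllib.request


def build_list_jobs_request(base_url: str, team: str, access_token: str) -> urllib.request.Request:
    """Build an authenticated request for a (hypothetical) job-listing endpoint."""
    # The path below is an illustrative assumption; consult the API spec for real routes.
    url = f"{base_url}/data-jobs/for-team/{team}/jobs"
    req = urllib.request.Request(url)
    # Per the description above: pass the OAuth2 access token as a Bearer header.
    req.add_header("Authorization", f"Bearer {access_token}")
    req.add_header("Accept", "application/json")
    return req


# Usage: construct the request, then send it with urllib.request.urlopen(req).
req = build_list_jobs_request("https://example.com", "my-team", "my-token")
```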
Project details
Release history
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
File details
Details for the file taurus-datajob-api-3.1.3.tar.gz
File metadata
- Download URL: taurus-datajob-api-3.1.3.tar.gz
- Upload date:
- Size: 52.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/57.0.0 requests-toolbelt/0.9.1 tqdm/4.56.2 CPython/3.9.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c2c24ad1a8705f8785f0455291840811c8518c276545d53007323727f3110370 |
| MD5 | a089b0219e6320434acc6c955c678fb8 |
| BLAKE2b-256 | 889241e2064973372f10f9e5d6a11f4bc4dfbba02243deb8d930250f034325ff |
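After downloading the sdist, the SHA256 digest above can be checked locally. A minimal sketch using Python's standard `hashlib`; the filename matches the download URL listed above.

```python
import hashlib

# SHA256 digest published for taurus-datajob-api-3.1.3.tar.gz (from the table above).
EXPECTED_SHA256 = "c2c24ad1a8705f8785f0455291840811c8518c276545d53007323727f3110370"


def sha256_of(path: str) -> str:
    """Compute the SHA256 hex digest of a file, streaming it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 8 KiB chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


# Usage, after downloading the file:
# assert sha256_of("taurus-datajob-api-3.1.3.tar.gz") == EXPECTED_SHA256
```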