Skip to main content

A Python Package for interacting with Cloudera Data Engineering Clusters

Project description

cdepy Package

cdepy is a package for interacting with Cludera Data Engineering Virtual Clusters.

You can find out more about Cloudera Data Engineering in the Cloudera Documentation.

Usage

You can install this package using

pip install cdepy

Features

  • CDE Resources: create resources of type Files and Python-Environment
  • CDE Jobs: create jobs of type Airflow and Spark
  • Job Observability: monitor job status

Examples

from cdepy import cdeconnection
from cdepy import cdejob
from cdepy import cdemanager
from cdepy import cderesource

Establish Connection to CDE Virtual Cluster

JOBS_API_URL = "https://<YOUR-CLUSTER>.cloudera.site/dex/api/v1"
WORKLOAD_USER = "<Your-CDP-Workload-User>"
WORKLOAD_PASSWORD = "<Your-CDP-Workload-Password>"

myCdeConnection = cdeconnection.CdeConnection(JOBS_API_URL, WORKLOAD_USER, WORKLOAD_PASSWORD)

myCdeConnection.setToken()

Create CDE Files Resource Definition

CDE_RESOURCE_NAME = "myFilesCdeResource"
myCdeFilesResource = cderesource.CdeFilesResource(CDE_RESOURCE_NAME)
myCdeFilesResourceDefinition = myCdeFilesResource.createResourceDefinition()

Create a CDE Spark Job Definition

CDE_JOB_NAME = "myCdeSparkJob"
APPLICATION_FILE_NAME = "pysparksql.py"

myCdeSparkJob = cdejob.CdeSparkJob(myCdeConnection)
myCdeSparkJobDefinition = myCdeSparkJob.createJobDefinition(CDE_JOB_NAME, CDE_RESOURCE_NAME, APPLICATION_FILE_NAME)

Create Resource and Job in CDE Cluster

LOCAL_FILE_PATH = "examples"
LOCAL_FILE_NAME = "pysparksql.py"

myCdeClusterManager = cdemanager.CdeClusterManager(myCdeConnection)


myCdeClusterManager.createResource(myCdeFilesResourceDefinition)
myCdeClusterManager.uploadFile(CDE_RESOURCE_NAME, LOCAL_FILE_PATH, LOCAL_FILE_NAME)

myCdeClusterManager.createJob(myCdeSparkJobDefinition)

Run and Validate CDE Job

myCdeClusterManager.runJob(CDE_JOB_NAME)
myCdeClusterManager.listJobRuns()

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cdepy-0.1.5.tar.gz (6.6 kB view details)

Uploaded Source

Built Distribution

cdepy-0.1.5-py3-none-any.whl (8.0 kB view details)

Uploaded Python 3

File details

Details for the file cdepy-0.1.5.tar.gz.

File metadata

  • Download URL: cdepy-0.1.5.tar.gz
  • Upload date:
  • Size: 6.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for cdepy-0.1.5.tar.gz
Algorithm Hash digest
SHA256 943a1d695fa28418115353a1eae2200b58c136eb8f3ba28c093fca8114a83632
MD5 d6877b0ad552030f0ac1451fde92949a
BLAKE2b-256 c135328312e377c0195ed625672a707e8d694d521fb3568c5c6668223fe88c4f

See more details on using hashes here.

File details

Details for the file cdepy-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: cdepy-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 8.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for cdepy-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 51e41ef0ec0092b6126e6c5203fa7425b58985cf36d51024840180a9b8e51ec4
MD5 4b944ffd0a2f518fd5c756c2c517c5d7
BLAKE2b-256 fbafa56823e4f982c2ce461f15cd1fc80a107d8e83ab333ff7820e162abd0f16

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page