A Python Package for interacting with Cloudera Data Engineering Clusters

cdepy Package

cdepy is a package for interacting with Cloudera Data Engineering Virtual Clusters.

You can find out more about Cloudera Data Engineering in the Cloudera Documentation.

Usage

You can install this package with pip:

pip install cdepy
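
To confirm from Python that the install succeeded, a check along these lines may help. It is a minimal sketch that uses only the standard library's importlib.metadata; the helper name installed_version is hypothetical and not part of cdepy.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the cdepy version if it is installed in this environment, else None.
    print(installed_version("cdepy"))
```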

Features

  • CDE Resources: create resources of type Files and Python-Environment
  • CDE Jobs: create jobs of type Airflow and Spark
  • Job Observability: monitor job status

Examples

from cdepy import cdeconnection
from cdepy import cdejob
from cdepy import cdemanager
from cdepy import cderesource

Establish Connection to CDE Virtual Cluster

JOBS_API_URL = "https://<YOUR-CLUSTER>.cloudera.site/dex/api/v1"
WORKLOAD_USER = "<Your-CDP-Workload-User>"
WORKLOAD_PASSWORD = "<Your-CDP-Workload-Password>"

myCdeConnection = cdeconnection.CdeConnection(JOBS_API_URL, WORKLOAD_USER, WORKLOAD_PASSWORD)

myCdeConnection.setToken()
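
The literals above are placeholders. One way to avoid hard-coding a workload password in a script is to read the three values from environment variables. The sketch below assumes hypothetical variable names (CDE_JOBS_API_URL, CDE_WORKLOAD_USER, CDE_WORKLOAD_PASSWORD) and a hypothetical helper load_cde_settings; neither is defined by cdepy. The https check is also an assumption based on the example URL above.

```python
import os
from urllib.parse import urlparse

# Hypothetical variable names chosen for this sketch; cdepy does not define them.
REQUIRED_VARS = ("CDE_JOBS_API_URL", "CDE_WORKLOAD_USER", "CDE_WORKLOAD_PASSWORD")


def load_cde_settings(env=None):
    """Return (jobs_api_url, user, password) read from a mapping of environment variables."""
    env = os.environ if env is None else env

    # Fail early with a single message that names every missing variable.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))

    url = env["CDE_JOBS_API_URL"]
    parsed = urlparse(url)
    # Assumption: the Jobs API is served over https, as in the example URL.
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("expected an https URL, got: " + url)

    return url, env["CDE_WORKLOAD_USER"], env["CDE_WORKLOAD_PASSWORD"]
```

The returned tuple can be unpacked into cdeconnection.CdeConnection(...) in place of the three constants.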

Create CDE Files Resource Definition

CDE_RESOURCE_NAME = "myFilesCdeResource"
myCdeFilesResource = cderesource.CdeFilesResource(CDE_RESOURCE_NAME)
myCdeFilesResourceDefinition = myCdeFilesResource.createResourceDefinition()

Create a CDE Spark Job Definition

CDE_JOB_NAME = "myCdeSparkJob"
APPLICATION_FILE_NAME = "pysparksql.py"

myCdeSparkJob = cdejob.CdeSparkJob(myCdeConnection)
myCdeSparkJobDefinition = myCdeSparkJob.createJobDefinition(CDE_JOB_NAME, CDE_RESOURCE_NAME, APPLICATION_FILE_NAME)

Create Resource and Job in CDE Cluster

LOCAL_FILE_PATH = "examples"
LOCAL_FILE_NAME = "pysparksql.py"

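# The cluster manager works through the connection established earlier
myCdeClusterManager = cdemanager.CdeClusterManager(myCdeConnection)

# Create the Files resource, then upload the application file into it
myCdeClusterManager.createResource(myCdeFilesResourceDefinition)
myCdeClusterManager.uploadFile(CDE_RESOURCE_NAME, LOCAL_FILE_PATH, LOCAL_FILE_NAME)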

myCdeClusterManager.createJob(myCdeSparkJobDefinition)
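
Before calling uploadFile, it can be useful to confirm that the local file actually exists, so a wrong path fails locally rather than partway through the workflow. The sketch below assumes that LOCAL_FILE_PATH is a directory and LOCAL_FILE_NAME a file inside it, as the example values suggest; check_local_file is a hypothetical helper, not part of cdepy.

```python
from pathlib import Path


def check_local_file(local_file_path, local_file_name):
    """Return the joined path if it points to an existing file, otherwise raise."""
    candidate = Path(local_file_path) / local_file_name
    if not candidate.is_file():
        raise FileNotFoundError(f"no such file: {candidate}")
    return candidate
```

For instance, check_local_file("examples", "pysparksql.py") could run before the upload step.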

Run and Validate CDE Job

myCdeClusterManager.runJob(CDE_JOB_NAME)
myCdeClusterManager.listJobRuns()
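
A job run takes time to finish, so a single call to listJobRuns() may show it still in progress. The sketch below is a generic polling loop, not a cdepy feature. It assumes a hypothetical callable fetch_status that you write yourself to extract the latest run's status from the listJobRuns() output (whose format is not described here), and it assumes terminal state names such as "succeeded", "failed", and "killed"; adjust these to what your cluster reports.

```python
import time


def wait_for_status(fetch_status,
                    terminal_states=("succeeded", "failed", "killed"),
                    interval_seconds=10.0,
                    timeout_seconds=600.0):
    """Poll fetch_status() until it returns a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in terminal_states:
            return status
        # Give up once the deadline has passed and the run is still not finished.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still '{status}' after {timeout_seconds} seconds")
        time.sleep(interval_seconds)
```

Passing the status lookup as a callable keeps the loop independent of how the status is obtained, so it can be exercised with a stub before pointing it at a real cluster.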
