Skip to main content

A Python Package for interacting with Cloudera Data Engineering Clusters

Project description

cdepy

cdepy is a package for interacting with Cludera Data Engineering Virtual Clusters.

You can find out more about Cloudera Data Engineering in the Cloudera Documentation.

Installation

You can install this package using

pip install cdepy

Features

  • CDE Resources: create resources of type Files and Python-Environment
  • CDE Jobs: create jobs of type Airflow and Spark
  • Job Observability: monitor job status

Usage

from cdepy import cdeconnection
from cdepy import cdejob
from cdepy import cdemanager
from cdepy import cderesource

Establish Connection to CDE Virtual Cluster

JOBS_API_URL = "https://<YOUR-CLUSTER>.cloudera.site/dex/api/v1"
WORKLOAD_USER = "<Your-CDP-Workload-User>"
WORKLOAD_PASSWORD = "<Your-CDP-Workload-Password>"

myCdeConnection = cdeconnection.CdeConnection(JOBS_API_URL, WORKLOAD_USER, WORKLOAD_PASSWORD)

myCdeConnection.setToken()

Create CDE Files Resource Definition

CDE_RESOURCE_NAME = "myFilesCdeResource"
myCdeFilesResource = cderesource.CdeFilesResource(CDE_RESOURCE_NAME)
myCdeFilesResourceDefinition = myCdeFilesResource.createResourceDefinition()

Create a CDE Spark Job Definition

CDE_JOB_NAME = "myCdeSparkJob"
APPLICATION_FILE_NAME = "pysparksql.py"

myCdeSparkJob = cdejob.CdeSparkJob(myCdeConnection)
myCdeSparkJobDefinition = myCdeSparkJob.createJobDefinition(CDE_JOB_NAME, CDE_RESOURCE_NAME, APPLICATION_FILE_NAME)

Create Resource and Job in CDE Cluster

LOCAL_FILE_PATH = "examples"
LOCAL_FILE_NAME = "pysparksql.py"

myCdeClusterManager = cdemanager.CdeClusterManager(myCdeConnection)


myCdeClusterManager.createResource(myCdeFilesResourceDefinition)
myCdeClusterManager.uploadFile(CDE_RESOURCE_NAME, LOCAL_FILE_PATH, LOCAL_FILE_NAME)

myCdeClusterManager.createJob(myCdeSparkJobDefinition)

Run and Validate CDE Job

myCdeClusterManager.runJob(CDE_JOB_NAME)
myCdeClusterManager.listJobRuns()

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cdepy-0.1.2.tar.gz (5.7 kB view details)

Uploaded Source

Built Distribution

cdepy-0.1.2-py3-none-any.whl (7.2 kB view details)

Uploaded Python 3

File details

Details for the file cdepy-0.1.2.tar.gz.

File metadata

  • Download URL: cdepy-0.1.2.tar.gz
  • Upload date:
  • Size: 5.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for cdepy-0.1.2.tar.gz
Algorithm Hash digest
SHA256 11297631056d1b54d77b6176ae24b3d6f947e6ab040e9dd3ef2edd7bd91e45f6
MD5 fde2307cc7d06a0494fdf4472dab4581
BLAKE2b-256 a0e43deaa4ced0c4c70bf65c8ff46d37855523adaa1558fe154e3ac3bc05e8b1

See more details on using hashes here.

File details

Details for the file cdepy-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: cdepy-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 7.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for cdepy-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 3358844fa60bce3c0a1db9c228308eb250d5295cba849f2368bbf1088c4292f6
MD5 ad1910bc4723fc67a9614c3b9fe4d015
BLAKE2b-256 8034a09537f80d68f1e0b0bfa20e0d60b21c9eb16b57df8c737a035a6aa06c44

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page