Data Cloud Custom Code SDK


This package provides a development kit for creating custom data transformations in Data Cloud. It allows you to write your own data processing logic in Python while leveraging Data Cloud's infrastructure for data access and transform execution, mapping your code onto Data Cloud data structures such as Data Model Objects (DMOs) and Data Lake Objects (DLOs).

More specifically, this codebase gives you the ability to test code locally before pushing it to Data Cloud's remote execution engine, greatly reducing development time.

Use of this project with Salesforce is subject to the TERMS OF USE

Prerequisites

See the Prerequisite details section below for creating a connected app, which you will need before running against Data Cloud.

Installation

The SDK can be downloaded directly from PyPI with pip:

pip install salesforce-data-customcode

You can verify it was installed properly via the CLI:

datacustomcode version

Quick start

Ensure you have all the prerequisites prepared on your machine.

To get started, create a directory and initialize a new project with the CLI:

mkdir datacloud && cd datacloud
python3.11 -m venv .venv
source .venv/bin/activate
pip install salesforce-data-customcode
datacustomcode init my_package

This will yield all necessary files to get started:

.
├── Dockerfile
├── README.md
├── requirements.txt
├── requirements-dev.txt
├── payload
│   ├── config.json
│   └── entrypoint.py
└── jupyterlab.sh
  • Dockerfile (Do not update) – Development container emulating the remote execution environment.
  • requirements-dev.txt (Do not update) – Dependencies for the development environment.
  • jupyterlab.sh (Do not update) – Helper script for setting up Jupyter.
  • requirements.txt – Define the requirements your script needs here.
  • payload – This folder is compressed and deployed to the remote execution environment.
    • config.json – Defines permissions on the backend; it can be generated programmatically with the scan CLI command.
    • entrypoint.py – The script that defines the data transformation logic.

A functional entrypoint.py is provided so you can run it once you've configured your connected app:

cd my_package
datacustomcode configure
datacustomcode run ./payload/entrypoint.py

[!IMPORTANT] The example entrypoint.py requires an Account_Home__dll DLO to be present. In order to deploy the script (next step), the output DLO (Account_Home_copy__dll in the example entrypoint.py) must also exist and be in the same dataspace as Account_Home__dll.

After modifying entrypoint.py as needed, using any dependencies you add to the .venv virtual environment, you can run the script in Data Cloud:

datacustomcode scan ./payload/entrypoint.py
datacustomcode deploy --path ./payload --name my_custom_script

[!TIP] The deploy process can take several minutes. If you'd like more feedback on the underlying process, you can add --debug to the command like datacustomcode --debug deploy --path ./payload --name my_custom_script

You can now use the Salesforce Data Cloud UI to find the created Data Transform and use the Run Now button to run it. Once the Data Transform run is successful, check the DLO your script is writing to and verify the correct records were added.

API

Your entrypoint script will define logic using the Client object, which wraps data access layers.

You should only need the following methods:

  • read_dlo(name) – Read from a Data Lake Object by name
  • read_dmo(name) – Read from a Data Model Object by name
  • write_to_dlo(name, spark_dataframe, write_mode) – Write a Spark dataframe to a Data Lake Object by name
  • write_to_dmo(name, spark_dataframe, write_mode) – Write a Spark dataframe to a Data Model Object by name

For example:

from datacustomcode import Client

client = Client()

sdf = client.read_dlo('my_DLO')
# some transformations
# ...
client.write_to_dlo('output_DLO', sdf, 'append')  # pass the dataframe and a write mode

[!WARNING] Currently we only support reading from DMOs and writing to DMOs, or reading from DLOs and writing to DLOs; the two object types cannot be mixed.
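Since mixing the two object types is rejected, it can help to fail fast in your own code before any data is written. The helper below is purely illustrative, not part of the SDK; the "dlo"/"dmo" labels are this sketch's own convention:

```python
def check_no_mixing(read_kind: str, write_kind: str) -> None:
    """Raise if a transform reads one object type and writes the other.

    Illustrative only: Data Cloud enforces this constraint itself; this
    guard just surfaces the mistake earlier during local development.
    """
    if read_kind != write_kind:
        raise ValueError(
            f"Cannot read from a {read_kind.upper()} and write to a "
            f"{write_kind.upper()}: reads and writes must both target "
            "DLOs or both target DMOs."
        )

# DLO -> DLO is fine; DLO -> DMO would raise ValueError
check_no_mixing("dlo", "dlo")
```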

CLI

The Data Cloud Custom Code SDK provides a command-line interface (CLI) with the following commands:

Global Options

  • --debug: Enable debug-level logging

Commands

datacustomcode version

Display the current version of the package.

datacustomcode configure

Configure credentials for connecting to Data Cloud.

Options:

  • --profile TEXT: Credential profile name (default: "default")
  • --username TEXT: Salesforce username
  • --password TEXT: Salesforce password
  • --client-id TEXT: Connected App Client ID
  • --client-secret TEXT: Connected App Client Secret
  • --login-url TEXT: Salesforce login URL
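Conceptually, a named profile just groups one set of these fields under a label so you can switch between orgs. The sketch below illustrates that idea with an INI file; the SDK's actual on-disk credential format and location are not documented here, and all values shown are hypothetical:

```python
import configparser
from pathlib import Path

def save_profile(path: Path, profile: str, creds: dict) -> None:
    """Store one named set of credentials, preserving other profiles."""
    config = configparser.ConfigParser()
    if path.exists():
        config.read(path)
    config[profile] = creds
    with path.open("w") as f:
        config.write(f)

# Hypothetical values -- use the output of "Manage Consumer Details"
save_profile(Path("credentials.ini"), "default", {
    "username": "me@example.com",
    "client_id": "EXAMPLE_CLIENT_ID",
    "login_url": "https://login.salesforce.com",
})
```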

datacustomcode deploy

Deploy a transformation job to Data Cloud.

Options:

  • --profile TEXT: Credential profile name (default: "default")
  • --path TEXT: Path to the code directory (default: ".")
  • --name TEXT: Name of the transformation job [required]
  • --version TEXT: Version of the transformation job (default: "0.0.1")
  • --description TEXT: Description of the transformation job (default: "")

datacustomcode init

Initialize a new development environment with a template.

Argument:

  • DIRECTORY: Directory to create project in (default: ".")
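For reference, the layout init produces is the file tree shown in the Quick start. The sketch below only mirrors that tree with empty files and is not the SDK's actual template logic, which generates real file contents:

```python
from pathlib import Path

def scaffold(directory: str) -> None:
    """Recreate the project skeleton from the Quick start, files empty."""
    root = Path(directory)
    (root / "payload").mkdir(parents=True, exist_ok=True)
    for name in ("Dockerfile", "README.md", "requirements.txt",
                 "requirements-dev.txt", "jupyterlab.sh"):
        (root / name).touch()
    for name in ("config.json", "entrypoint.py"):
        (root / "payload" / name).touch()

scaffold("my_package")
```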

datacustomcode scan

Scan a Python file to generate a Data Cloud configuration.

Argument:

  • FILENAME: Python file to scan

Options:

  • --config TEXT: Path to save the configuration file (default: same directory as FILENAME)
  • --dry-run: Preview the configuration without saving to a file
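To give a feel for what scan does, the sketch below statically collects the object names passed to the Client's read/write methods using Python's ast module. It illustrates the idea only; the SDK's actual scanner and the config.json schema it emits may differ:

```python
import ast

READ_METHODS = {"read_dlo", "read_dmo"}
WRITE_METHODS = {"write_to_dlo", "write_to_dmo"}

def collect_access(source: str) -> dict:
    """Collect literal object names from read/write calls in source code."""
    access = {"read": set(), "write": set()}
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.args
                and isinstance(node.args[0], ast.Constant)):
            if node.func.attr in READ_METHODS:
                access["read"].add(node.args[0].value)
            elif node.func.attr in WRITE_METHODS:
                access["write"].add(node.args[0].value)
    return access

entrypoint = """
sdf = client.read_dlo('Account_Home__dll')
client.write_to_dlo('Account_Home_copy__dll', sdf, 'append')
"""
print(collect_access(entrypoint))
```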

datacustomcode run

Run an entrypoint file locally for testing.

Argument:

  • ENTRYPOINT: Path to entrypoint Python file

Options:

  • --config-file TEXT: Path to configuration file
  • --dependencies TEXT: Additional dependencies (can be specified multiple times)

datacustomcode zip

Zip a transformation job in preparation to upload to Data Cloud.

Options:

  • --path TEXT: Path to the code directory (default: ".")
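Conceptually, this step amounts to compressing the payload folder into a single archive with relative paths preserved. The sketch below uses the standard library to show that idea; it is not the SDK's implementation, and the archive name is this sketch's own choice:

```python
import tempfile
import zipfile
from pathlib import Path

def zip_payload(payload_dir: str, archive: str) -> None:
    """Compress every file under payload_dir, keeping relative paths."""
    root = Path(payload_dir)
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(root))

# Demo against a throwaway payload directory
demo = Path(tempfile.mkdtemp()) / "payload"
demo.mkdir()
(demo / "entrypoint.py").write_text("# transform logic\n")
(demo / "config.json").write_text("{}\n")
zip_payload(str(demo), str(demo.parent / "payload.zip"))
print(zipfile.ZipFile(demo.parent / "payload.zip").namelist())
```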

Prerequisite details

Creating a connected app

  1. Log in to Salesforce as an admin. In the top right corner, click the gear icon and go to Setup.
  2. In the left-hand sidebar, search for "App Manager" and select App Manager under Apps.
  3. Click New Connected App in the upper right.
  4. Fill in the required fields in the Basic Information section.
  5. Under the API (Enable OAuth Settings) section:
    1. Check Enable OAuth Settings.
    2. Provide a callback URL such as http://localhost:55555/callback.
    3. In Selected OAuth Scopes, make sure refresh_token, api, cdp_query_api, and cdp_profile_api are selected.
    4. Click Save to save the connected app.
  6. From the detail page that opens, click the "Manage Consumer Details" button to find your client ID and client secret.
  7. Go back to Setup, then OAuth and OpenID Connect Settings, and enable the "Allow OAuth Username-Password Flows" option.

You now have all fields necessary for the datacustomcode configure command.
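Under the hood, the username-password flow enabled above exchanges these fields for an access token at Salesforce's standard OAuth 2.0 token endpoint. The sketch below only builds the request (it does not send it) and assumes the standard Salesforce username-password flow rather than any SDK internals; all credential values are placeholders:

```python
from urllib.parse import urlencode

def build_token_request(login_url, client_id, client_secret,
                        username, password):
    """Build the endpoint and form body for the username-password flow."""
    endpoint = f"{login_url}/services/oauth2/token"
    body = urlencode({
        "grant_type": "password",
        "client_id": client_id,
        "client_secret": client_secret,
        "username": username,
        "password": password,
    })
    return endpoint, body

endpoint, body = build_token_request(
    "https://login.salesforce.com",
    "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET",  # from Manage Consumer Details
    "admin@example.com", "example-password",
)
print(endpoint)
```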

