

Accelerated Discovery Reusable Components

The central implementation of Accelerated Discovery Reusable Components. It serves as a wrapper around the client libraries we use locally, such as Dapr and MLflow.

1. Installation

All components are available after running

pip install dapr ad-components

pip config

Since the package is hosted in a private registry, you need to tell pip to also look for packages outside pypi.org.

mkdir -p $HOME/.pip

cat << EOF > $HOME/.pip/pip.conf
[global]
extra-index-url = https://$ARTIFACTORY_USERNAME:$ARTIFACTORY_TOKEN@na.artifactory.swg-devops.com/artifactory/api/pypi/res-discovery-platform-team-pypi-local/simple
EOF

# If you decide to choose a different path for the pip conf, make sure to tell pip about it
# echo 'export PIP_CONFIG_FILE="$HOME/.pip/pip.conf"' >> ~/.zprofile
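
The same file content can also be produced from Python, which may be convenient in provisioning scripts. The sketch below is only an illustration and is not part of ad-components: the helper name `render_pip_conf` is hypothetical, and the host and path are copied from the configuration above. It percent-encodes the credentials, on the assumption that a username containing `@` (such as an e-mail address) needs encoding before it is embedded in a URL.

```python
from urllib.parse import quote

# Host and repository path copied from the pip.conf shown above.
INDEX_HOST = "na.artifactory.swg-devops.com"
INDEX_PATH = "/artifactory/api/pypi/res-discovery-platform-team-pypi-local/simple"


def render_pip_conf(username: str, token: str) -> str:
    """Return pip.conf text with the credentials percent-encoded for a URL."""
    user = quote(username, safe="")    # e.g. '@' becomes '%40'
    secret = quote(token, safe="")
    url = f"https://{user}:{secret}@{INDEX_HOST}{INDEX_PATH}"
    return f"[global]\nextra-index-url = {url}\n"


# Placeholder credentials, for illustration only.
print(render_pip_conf("someone@ibm.com", "example-token"))
```

The returned text can then be written to `$HOME/.pip/pip.conf` or to whichever path `PIP_CONFIG_FILE` points at.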

CLI

Here's an example usage of the CLI

usage: adc [-h] [--verbose] [--version] {<component>} ...

Accelerated Discovery reusable components.

positional arguments:
  <component>    the component that you want to trigger.

optional arguments:
  -h, --help     show this help message and exit.
  --version      show program's version number and exit.

2. Usage

2.0. In your pipeline

To use a component in your pipeline, you need to run it in a Step context

from ad.step import DaprStep
from ad.storage import download, upload

with DaprStep():
    resp = download(download_src, download_dest, binding_name=binding)
    print(f"download resp: {resp}")

    resp = upload(upload_src, upload_dest, binding_name=binding)
    print(f"upload resp: {resp}")

Running the components inside a step will make sure the client dependencies are handled correctly.
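
To illustrate why a step context helps, here is a minimal sketch of the context-manager pattern, assuming that a step acquires a client on entry and releases it on exit. `FakeClient` and `Step` are hypothetical stand-ins written for this example; they are not the `DaprStep` implementation, whose internals are not described in this document.

```python
class FakeClient:
    """Hypothetical stand-in for a sidecar client (not the Dapr SDK)."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class Step:
    """Hypothetical step context: opens a client on entry, closes it on exit."""

    def __init__(self, client_factory=FakeClient):
        self._client_factory = client_factory
        self.client = None

    def __enter__(self):
        self.client = self._client_factory()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs even when the body raises, so the client is always released.
        self.client.close()
        return False  # do not swallow exceptions raised in the body


with Step() as step:
    client = step.client  # usable while the block is active

print(client.closed)  # the client was released when the block ended
```

The point of the pattern is that cleanup lives in one place, so component calls inside the block do not need to manage the client themselves.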

2.1. Storage

2.1.2. Python module

You can invoke the manager using native Python. Please note that the package must be present in your Python environment.

from ad.storage import download, upload

download_resp = download(
    src, dest,
    # binding_name="s3-state",  # Or any other binding
)

upload_resp = upload(
    src, dest,
    # binding_name="s3-state",  # Or any other binding
)

2.1.3. CLI

usage: adc storage [-h] --src PATH --dest PATH [--binding NAME] [--timeout SEC] {download,upload}

positional arguments:
  {download,upload}     action to be performed on data.

optional arguments:
  -h, --help            show this help message and exit

action arguments:
  --src PATH, -r PATH   path of file to perform action on.
  --dest PATH, -d PATH  object's desired full path in the destination.
  --binding NAME, -b NAME
                        the name of the binding as defined in the components.

dapr arguments:
  --timeout SEC, -t SEC
                        value in seconds we should wait for sidecar to come up.

Note: You can replace adc with python ad/main.py ... if you don't have the package installed in your python environment.

Examples
  1. To download an object from S3 run
adc storage download \
    --src test.txt \
    --dest tmp/downloaded.txt
  2. To upload an object to S3 run
adc storage upload \
    --src tmp/downloaded.txt \
    --dest local/uploaded.txt
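
For readers who want to see how the documented flags fit together, the usage text above can be mirrored with `argparse`. This is a sketch reconstructed from the help output only, not the actual `adc` source: the function name `build_storage_parser`, the `None` defaults, and the integer type for `--timeout` are assumptions.

```python
import argparse


def build_storage_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented `adc storage` usage."""
    parser = argparse.ArgumentParser(prog="adc storage")
    parser.add_argument("action", choices=["download", "upload"],
                        help="action to be performed on data.")

    action = parser.add_argument_group("action arguments")
    action.add_argument("--src", "-r", metavar="PATH", required=True,
                        help="path of file to perform action on.")
    action.add_argument("--dest", "-d", metavar="PATH", required=True,
                        help="object's desired full path in the destination.")
    # Default of None is an assumption; the real default is not documented here.
    action.add_argument("--binding", "-b", metavar="NAME", default=None,
                        help="the name of the binding as defined in the components.")

    dapr = parser.add_argument_group("dapr arguments")
    # type=int and default=None are assumptions as well.
    dapr.add_argument("--timeout", "-t", metavar="SEC", type=int, default=None,
                      help="value in seconds we should wait for sidecar to come up.")
    return parser


args = build_storage_parser().parse_args(
    ["download", "--src", "test.txt", "--dest", "tmp/downloaded.txt"]
)
print(args.action, args.src, args.dest)  # prints: download test.txt tmp/downloaded.txt
```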

3. Supported components

3.1. Storage

3.1.1. Supported operations

Below is a list of the operations you might intend to perform in your component.

Upload

Uploads data from a file to an object in a bucket.

Arguments
  • src: Name of file to upload.
  • dest: Object name in the bucket.
  • binding: The name of the binding to perform the operation.

Download

Downloads the data of an object to a file.

Arguments
  • src: Object name in the bucket.
  • dest: Name of the file to download to.
  • binding: The name of the binding to perform the operation.

Dapr configurations
  • address: Dapr Runtime gRPC endpoint address.
  • timeout: Value in seconds we should wait for sidecar to come up.
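
The roles of src and dest swap between the two operations, which is easy to get wrong. The sketch below makes the direction explicit using a local directory as a stand-in for the bucket. `fake_upload` and `fake_download` are hypothetical helpers written for this example; they do not call Dapr or the `ad.storage` functions.

```python
import shutil
import tempfile
from pathlib import Path


def fake_upload(src: str, dest: str, bucket: Path) -> Path:
    """Copy local file `src` to object name `dest` inside the stand-in bucket."""
    target = bucket / dest
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, target)
    return target


def fake_download(src: str, dest: str, bucket: Path) -> Path:
    """Copy object `src` from the stand-in bucket to local file `dest`."""
    target = Path(dest)
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(bucket / src, target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    bucket = root / "bucket"
    bucket.mkdir()
    local = root / "test.txt"
    local.write_text("hello")

    # Upload: src is a local file, dest is an object name.
    fake_upload(str(local), "local/uploaded.txt", bucket)
    # Download: src is an object name, dest is a local file.
    copy = fake_download("local/uploaded.txt", str(root / "tmp" / "downloaded.txt"), bucket)
    print(copy.read_text())  # prints: hello
```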

4. Publishing

Every change to the Python script requires:

  • A new version to be pushed to the PyPI registry.
  • A Docker image to be pushed to the cluster. (This is temporary until kfp 2.0 is released and kfp-tekton starts supporting it.)

If you have write permissions, a correctly configured $HOME/.pypirc file, and a configured Docker registry, run the following commands to publish the package

# pypi registry credentials
export ARTIFACTORY_USERNAME=<username>@ibm.com
export ARTIFACTORY_TOKEN=<token>

# To push to an OpenShift cluster
export PROJECT=kubeflow
export REPO_ROUTE="$(oc get route/default-route -n openshift-image-registry --template='{{ .spec.host }}'):443"
export REPO=${REPO_ROUTE}/${PROJECT}

make

4.1. Increment the version

To increment the version, go to ad/version.py and increment the version there. Both setup.py and the CLI will read the new version from that file.
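
Keeping the version in a single file works because other tools can read it without importing the package. The following is a minimal sketch of that idea, assuming the version file contains a line of the form `__version__ = "..."`; the variable name and the helpers `read_version` and `bump_patch` are assumptions and may not match the actual project.

```python
import re
import tempfile
from pathlib import Path

# Assumes the version file holds a line like: __version__ = "0.1.26"
VERSION_RE = re.compile(r"""^__version__\s*=\s*["']([^"']+)["']""", re.MULTILINE)


def read_version(path: Path) -> str:
    """Extract the version string without importing the package."""
    match = VERSION_RE.search(path.read_text())
    if match is None:
        raise ValueError(f"no __version__ assignment found in {path}")
    return match.group(1)


def bump_patch(version: str) -> str:
    """Increment the last numeric component of a MAJOR.MINOR.PATCH version."""
    major, minor, patch = version.split(".")
    return f"{major}.{minor}.{int(patch) + 1}"


with tempfile.TemporaryDirectory() as tmp:
    version_file = Path(tmp) / "version.py"
    version_file.write_text('__version__ = "0.1.26"\n')  # illustrative value
    current = read_version(version_file)
    print(current, "->", bump_patch(current))  # prints: 0.1.26 -> 0.1.27
```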

4.2. Configure PyPI registry

To push the package to our private registry, you need to register it in your $HOME/.pypirc file. The following command takes care of it for you

cat << EOF > $HOME/.pypirc
[distutils]
index-servers =
    ad-discovery
    pypi

[pypi]
repository: https://pypi.org/pypi

[ad-discovery]
repository: https://na.artifactory.swg-devops.com/artifactory/api/pypi/res-discovery-platform-team-pypi-local
username: $ARTIFACTORY_USERNAME
password: $ARTIFACTORY_TOKEN
EOF
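
Because the heredoc expands `$ARTIFACTORY_USERNAME` and `$ARTIFACTORY_TOKEN` at write time, an unset variable silently produces an empty field. A quick sanity check with the standard `configparser` module can catch that before publishing. The helper name `missing_fields` and the sample credentials are hypothetical; this is a sketch, not a tool shipped with the package.

```python
import configparser

# Sample text in the same shape as the .pypirc above, with placeholder credentials.
SAMPLE = """\
[distutils]
index-servers =
    ad-discovery
    pypi

[pypi]
repository: https://pypi.org/pypi

[ad-discovery]
repository: https://na.artifactory.swg-devops.com/artifactory/api/pypi/res-discovery-platform-team-pypi-local
username: someone@ibm.com
password: example-token
"""


def missing_fields(text: str, server: str = "ad-discovery") -> list:
    """Return the problems found for `server` in .pypirc text (empty if none)."""
    # interpolation=None keeps any '%' in a token from being treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    problems = []
    servers = parser.get("distutils", "index-servers", fallback="").split()
    if server not in servers:
        problems.append("not listed in index-servers")
    for key in ("repository", "username", "password"):
        if not parser.get(server, key, fallback="").strip():
            problems.append(f"missing {key}")
    return problems


print(missing_fields(SAMPLE))  # prints: []
```

To check a real file, pass the contents of `$HOME/.pypirc` instead of `SAMPLE`.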

4.3. Configure Docker registry

Follow these steps to log into your OpenShift cluster registry

# Replace the following command with the command given from your cluster.
# Ignore this command if you're already logged in.
oc login --token=<oc token> --server=https://api.adp-rosa-2.5wcf.p1.openshiftapps.com:6443


# If this is the first push to your cluster, you need to create an image stream
oc create is ad

# Change the server value if you're in a different cluster
SERVER="$(oc get route/default-route -n openshift-image-registry --template='{{ .spec.host }}'):443"
docker login -u `oc whoami` -p `oc whoami --show-token` $SERVER

Note: Both the pip package and the Docker image fetch the version from the ad/version.py file, so make sure to increment it before pushing or you'll overwrite the previous version.


