Accelerated Discovery Reusable Components.
Storage Access Reusable Component
This is the implementation of the Storage Access Reusable Component. It serves as a wrapper around Dapr and is intended to replace all other components' I/O operations.
1. Supported operations
Below is a list of the operations you can perform with this component.
1.1. Upload
Uploads data from a file to an object in a bucket.
Arguments
src
: Name of the file to upload.

dest
: Object name in the bucket.

binding
: The name of the binding to perform the operation.
1.2. Download
Downloads data of an object to file.
Arguments
src
: Object name in the bucket.

dest
: Name of the file to download.

binding
: The name of the binding to perform the operation.
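For illustration, here is a minimal sketch of the two operations called through the Python module described in section 4.2; the object names and local paths are placeholders, and the binding and Dapr settings are left at their defaults:

from adstorage import download, upload

# Download: the first argument is the object name in the bucket,
# the second is the local file to write.
download("test.txt", "/tmp/downloaded.txt")

# Upload: the first argument is the local file to read,
# the second is the object name in the bucket.
upload("/tmp/downloaded.txt", "uploads/downloaded.txt")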
2. Dapr configurations
address
: Dapr Runtime gRPC endpoint address.

timeout
: Value in seconds we should wait for the sidecar to come up.
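As a sketch of how these values are typically supplied from Python (see section 4.2.2 for the full signatures): when the component runs in a pod with an injected Dapr sidecar, the gRPC port is usually exposed through the DAPR_GRPC_PORT environment variable, so the address can be derived from it. The deployment details below are assumptions and may differ in your environment:

import os

from adstorage import download

# The Dapr sidecar usually exposes its gRPC port via DAPR_GRPC_PORT;
# fall back to Dapr's default gRPC port if the variable is not set.
grpc_port = os.getenv("DAPR_GRPC_PORT", "50001")

download(
    "test.txt",                         # object name in the bucket
    "/tmp/downloaded.txt",              # local file to write
    address=f"localhost:{grpc_port}",   # Dapr Runtime gRPC endpoint address
    timeout=60,                         # seconds to wait for the sidecar to come up
)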
3. Verbose mode
If you want to run the script in verbose mode, you can append --verbose or -v to the command.
4. Usage
4.1 Pipeline native
Follow the step-by-step instructions below to add this component to your pipeline, or refer to the full example in workflow/components/storage/dummy_pipeline.py.
- Load the component.yaml file using load_component_from_file:
io_op = kfp.components.load_component_from_file("path/to/component.yaml")
Alternatively, you can load it from GitHub:
file_url = "https://raw.github.ibm.com/Accelerated-Discovery/Discovery-Platform/main/workflow/components/storage/component.yaml"
io_op = kfp.components.load_component_from_url(file_url)
- In your pipeline, call the component with the parameters that fit your needs:
dummy_task_1 = io_op(
action="download",
src="test.txt",
dest="/mnt/downloaded.txt",
)
- Optional: Use volumes to keep files consistent between pods
vop = kfp.dsl.VolumeOp(
name="volume_creation",
resource_name="mypvc",
size="1Mi",
modes=kfp.dsl.VOLUME_MODE_RWO,
)
dummy_task_1 = io_op(
action="download",
src="test.txt",
dest="/mnt/downloaded.txt",
).add_pvolumes({"/mnt": vop.volume})
dummy_task_2 = io_op(
action="upload",
src="/data/downloaded.txt",
dest="{{workflow.namespace}}/{{workflow.name}}/{{workflow.uid}}/downloaded.txt",
).add_pvolumes({"/data": dummy_task_1.pvolume})
- Compile your pipeline as usual, for example:
dsl-compile-tekton \
--py <your pipeline file>.py \
--output <your output name>.yaml
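For reference, here is a minimal end-to-end sketch of how the pieces above might fit together in a single pipeline function; the pipeline name, volume size, and paths are placeholders, and the authoritative example remains workflow/components/storage/dummy_pipeline.py:

import kfp
import kfp.dsl as dsl

# Load the reusable component definition.
io_op = kfp.components.load_component_from_file("path/to/component.yaml")

@dsl.pipeline(name="storage-dummy-pipeline")  # placeholder name
def storage_pipeline():
    # Volume shared between the two tasks so the downloaded file persists.
    vop = dsl.VolumeOp(
        name="volume_creation",
        resource_name="mypvc",
        size="1Mi",
        modes=dsl.VOLUME_MODE_RWO,
    )

    # Download an object from the bucket into the shared volume.
    download_task = io_op(
        action="download",
        src="test.txt",
        dest="/mnt/downloaded.txt",
    ).add_pvolumes({"/mnt": vop.volume})

    # Upload the downloaded file back to the bucket under a workflow-scoped key.
    io_op(
        action="upload",
        src="/mnt/downloaded.txt",
        dest="{{workflow.namespace}}/{{workflow.name}}/{{workflow.uid}}/downloaded.txt",
    ).add_pvolumes({"/mnt": download_task.pvolume})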
4.2 Python module
You can also invoke the manager using native Python, which doesn't require a Docker image to run. However, the package must be present in your Python environment.
4.2.1 Setup
pip install ad-storage-component
4.2.2 Usage
from adstorage import download, upload
download_resp = download(
src, dest,
# binding_name="s3-state", # Or any other binding
# address=None, # endpoint:port
# timeout=300, # in seconds
)
upload_resp = upload(
src, dest,
# binding_name="s3-state", # Or any other binding
# address=None, # endpoint:port
# timeout=300, # in seconds
)
4.3 CLI
$ adsc -h
usage: adsc [-h] --src PATH --dest PATH [--binding NAME] [--address URL] [--timeout SEC] [--verbose] [--version] {download,upload}
Storage Access reusable component.
positional arguments:
{download,upload} action to be performed on data.
optional arguments:
-h, --help show this help message and exit
--verbose, -v run the script in debug mode.
--version show program's version number and exit
action arguments:
--src, -r PATH path of file to perform action on.
--dest, -d PATH object's desired full path in the destination.
--binding, -b NAME the name of the binding as defined in the components.
dapr arguments:
--address, -a URL Dapr Runtime gRPC endpoint address.
--timeout, -t SEC value in seconds we should wait for sidecar to come up.
Note: You can replace adsc with python adstorage/main.py ... if you don't have the package installed in your Python environment.
Examples
- To download an object from S3 run
adsc download \
--src test.txt \
--dest tmp/downloaded.txt \
--verbose
- To upload an object to S3 run
adsc upload \
--src tmp/downloaded.txt \
--dest local/uploaded.txt \
--verbose
5. Publishing
Every change to the Python script requires a new Docker image to be published or a PyPI package to be pushed.
5.1 Publish on all ends
To publish a Docker image and a PyPI package, run the following command:
Note: Please make sure to check each one's documentation below first.
make
5.2 Docker
5.2.1 Local registry
With kind, I'm using a local registry accessible on port 5001; running the following command will build and push the image to my local registry:
make docker-publish
5.2.2 Remote registry
To publish a new image to a remote registry, you need to set the registry path variable:
REPO="registry-1.docker.io/distribution" make docker-publish
5.3. PyPI registry
If you have the right (write) permissions and a correctly configured $HOME/.pypirc file, run the following command to publish the package:
make pypi-publish
Increment the version
To increment the version, go to adstorage/version.py and bump the version there. Both setup.py and the CLI will read the new version correctly.
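As an illustrative sketch (the exact contents and mechanism may differ in this repository), adstorage/version.py typically just defines the version string, and setup.py imports it so both stay in sync:

# adstorage/version.py (illustrative; bump this value for a new release)
__version__ = "0.1.4"


# setup.py (illustrative excerpt; one common way setup.py reads the version)
from setuptools import setup

from adstorage.version import __version__

setup(
    name="ad-storage-component",  # package name from section 4.2.1
    version=__version__,
    # ... remaining metadata unchanged ...
)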
Note: We will run the pypi-install target to confirm the package is installable before publishing it to our PyPI registry.