Python client for interacting with IBM Cloud Data Engine service
Project description
ibmcloudsql
Allows you to run SQL statements in the IBM Cloud on data stored on object storage.
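Install the released package from PyPI, for example with pip:
pip install ibmcloudsql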
Building and testing the library locally
Set up Python environment
Run `source ./setup_env.sh`, which creates and activates a clean virtual Python environment. It uses Python 2.7 by default; adapt line 2 inside the script if you want a different version.
Install the local code in your Python environment
Run `./_install.sh`.
Test the library locally
- Create a file `ibmcloudsql/test_credentials.py` with the following three lines, filled in with your corresponding values:
apikey='<your IBM Cloud API key>'
instance_crn='<your Data Engine instance CRN>'
result_location='<COS URI of default result location for your SQL result sets>'
...
See details in the template file.
- Run `python ibmcloudsql/test.py`.
Packaging and publishing a distribution
- Make sure to increase `version=...` in `setup.py` before creating a new package.
- Run `package.sh`. It will prompt for a user and password that must be authorized for the package `ibmcloudsql` on pypi.org.
Example usage
import ibmcloudsql
my_ibmcloud_apikey = '<your api key here>'
my_instance_crn = '<your ibm cloud Data Engine instance CRN here>'
my_target_cos_url = '<Cloud Object Storage URL for the SQL result target. Format: cos://<endpoint>/<bucket>/[<prefix>]>'
sqlClient = ibmcloudsql.SQLQuery(my_ibmcloud_apikey, my_instance_crn)
sqlClient.logon()
sqlClient.run_sql('SELECT * FROM cos://us-geo/sql/orders.parquet STORED AS PARQUET LIMIT 5 INTO {} STORED AS CSV'.format(my_target_cos_url)).head()
Demo notebook
You can use IBM Watson Studio with a demo notebook that shows elaborate examples of using various aspects of ibmcloudsql.
SQLQuery method list
- `SQLQuery(api_key, instance_crn, target_cos_url=None, token=None, client_info='')`: Constructor. See "Constructor options" below.
- `logon(force=False, token=None)`: Needs to be called before any other method below. It exchanges the `api_key` set at initialization for a temporary oauth token. The invocation is a no-op if the previous logon is less than 5 minutes ago. You can force a logon anyway with the optional parameter `force=True`. When you have initialized the client without an `api_key` but with a custom `token` instead, you can pass a fresh `token` to the logon method to update the client with it.
- `submit_sql(sql_text, pagesize=None)`: Returns `jobId` as a string. The optional `pagesize` parameter (in rows) produces paginated result objects. See the usage sketch after this list.
- `wait_for_job(jobId)`: Waits for the job to end and returns the job completion state (either `completed` or `failed`).
- `get_result(jobId, pagenumber=None)`: Returns an SQL result data frame for the entire result or for the specified page of results.
- `list_results(jobId)`: Returns a data frame with the list of result objects written.
- `delete_result(jobId)`: Deletes all result set objects in Cloud Object Storage for the given `jobId`.
- `rename_exact_result(jobId)`: Renames a single partitioned query result to an exact single object name without a folder hierarchy.
- `get_job(jobId)`: Returns details for the given SQL job as a JSON object.
- `get_jobs()`: Returns the list of the 30 most recently submitted SQL jobs with all details as a data frame.
- `run_sql(sql_text)`: Compound method that calls `submit_sql`, `wait_for_job` and `get_result` in sequence.
- `sql_ui_link()`: Returns a browser link to the Data Engine web console for the currently configured instance.
- `get_cos_summary(cos_url)`: Returns a summary with the number of stored objects and their volume for the given COS URL as JSON.
- `list_cos_objects(cos_url)`: Returns a data frame with the list of objects found at the given COS URL.
- `export_job_history(cos_url)`: Exports new jobs as a parquet file to the given `cos_url`.
- `export_tags_for_cos_objects(cos_url, export_target_cos_file)`: Exports, as a parquet file to `export_target_cos_file`, all objects under the given `cos_url` that have tags configured, along with the value of each tag.
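As an illustration of the asynchronous flow (submit, wait, fetch) described above, here is a minimal sketch; the API key, instance CRN and result location are placeholders you need to adapt, and the sample table is the same public example used above:
import ibmcloudsql
# Placeholders -- replace with your own credentials and COS result location.
my_api_key = '<your IBM Cloud API key>'
my_instance_crn = '<your Data Engine instance CRN>'
my_target_cos_url = 'cos://<endpoint>/<bucket>/<prefix>/'
sqlClient = ibmcloudsql.SQLQuery(my_api_key, my_instance_crn)
sqlClient.logon()
# Submit the statement asynchronously, then poll until it finishes.
jobId = sqlClient.submit_sql(
    'SELECT * FROM cos://us-geo/sql/orders.parquet STORED AS PARQUET '
    'LIMIT 10 INTO {} STORED AS CSV'.format(my_target_cos_url))
status = sqlClient.wait_for_job(jobId)  # 'completed' or 'failed'
if status == 'completed':
    df = sqlClient.get_result(jobId)          # result set as a data frame
    print(sqlClient.list_results(jobId))      # result objects written to COS
else:
    print(sqlClient.get_job(jobId))           # job details, including error information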
Exceptions
- `RateLimitedException(message)`: raised when jobs can't be submitted because the response status is 429, i.e. the plan limit for concurrent queries has been reached. See the sketch below for one way to handle it.
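The following is only a sketch of handling this exception around a single submission, written under the assumption that `RateLimitedException` can be imported from the `ibmcloudsql` package; adjust the import to wherever the exception is defined in your installed version:
import time
import ibmcloudsql
from ibmcloudsql import RateLimitedException  # assumption: exception is exported at package level
sqlClient = ibmcloudsql.SQLQuery('<your IBM Cloud API key>', '<your Data Engine instance CRN>')
sqlClient.logon()
sql = ('SELECT * FROM cos://us-geo/sql/orders.parquet STORED AS PARQUET '
       'LIMIT 5 INTO cos://<endpoint>/<bucket>/ STORED AS CSV')
try:
    jobId = sqlClient.submit_sql(sql)
except RateLimitedException:
    # Concurrent query limit of the plan reached; back off briefly and retry once.
    time.sleep(30)
    jobId = sqlClient.submit_sql(sql)
Alternatively, the `max_tries` constructor option described below enables built-in exponential backoff instead of raising immediately.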
Constructor options
- `api_key`: IAM API key. When this parameter is set to `None`, you must specify your own valid IAM OAuth token in the parameter `token`.
- `instance_crn`: Data Engine instance CRN identifier.
- `target_cos_url`: Optional default target URL. Don't use it when you want to provide the target URL in the SQL statement text.
- `token`: Optional custom IAM OAuth token. When you specify this, you must set the `api_key` parameter to `None`.
- `client_info`: Optional string to identify your client application in IBM Cloud for problem determination (PD) reasons.
- `max_tries`: Optional integer to specify the maximum number of attempts when dealing with request rate limiting. The default value is `1`, which means the exception `RateLimitedException` is thrown when the response status code is `429`. Any value greater than `1` enables exponential backoff. For instance, given `max_tries=5`, if the response status code is `429` four times before the fifth attempt returns status code `201`, the wait times between the attempts will be 2s, 4s, 8s and 16s. See the sketch after this list for an example.
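To illustrate the `token` and `max_tries` options, here is a sketch that constructs the client with a pre-obtained IAM OAuth token instead of an API key and enables exponential backoff; the token and CRN values are placeholders:
import ibmcloudsql
my_token = '<a valid IAM OAuth token obtained elsewhere>'
my_instance_crn = '<your Data Engine instance CRN>'
# api_key must be None when a custom token is supplied.
sqlClient = ibmcloudsql.SQLQuery(
    None,
    my_instance_crn,
    token=my_token,
    client_info='my sample application',
    max_tries=5)  # retry with exponential backoff on HTTP 429
sqlClient.logon(token=my_token)
# When the token expires, pass a fresh one to logon() to update the client.
sqlClient.logon(token='<a fresh IAM OAuth token>')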
Limitations
The Data Engine Python SDK does not support PyInstaller.
Project details
Download files
Source Distribution
ibmcloudsql-0.5.14.tar.gz
(61.3 kB)
File details
Details for the file ibmcloudsql-0.5.14.tar.gz.
File metadata
- Download URL: ibmcloudsql-0.5.14.tar.gz
- Upload date:
- Size: 61.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 376aa49f17aa72181e124e2452eb4a429e2318adb889713879c16b2fd010193a
MD5 | da80b2db56c893772ce046d528d1442b
BLAKE2b-256 | 697edebf025cf32a9004c77beb028a9b0b507efe86f8af035b4f3dc0ab519546