Python SDK for coding to the Qubole Data Service API

[Please visit the project page at https://github.com/qubole/qds-sdk-py]

Qubole Data Service Python SDK

A Python module that provides the tools you need to authenticate with, and use, the Qubole Data Service API.

Installation

Run the following command (you may need to run it as root):

$ python setup.py install

This should place a command-line utility ‘qds.py’ somewhere in your path:

$ which qds.py
/usr/bin/qds.py

Alternate Virtualenv Installation

Alternatively, if you use virtualenv and virtualenvwrapper, you can do this:

$ mkvirtualenv qubole
$ <path-to-virtualenv>/bin/python setup.py install

This installs qds.py within your virtualenv.

CLI

qds.py allows running Hive, Hadoop, Pig and Shell commands against QDS. Users can run commands synchronously, or submit a command and check its status.

$ qds.py -h # will print detailed usage

Examples:

  1. run a hive query and print the results

    $ qds.py --token 'xxyyzz' hivecmd run --query "show tables"
    $ qds.py --token 'xxyyzz' hivecmd run --script_location /tmp/myquery
    $ qds.py --token 'xxyyzz' hivecmd run --script_location s3://my-qubole-location/myquery

  2. pass in api token from bash environment variable

    $ export QDS_API_TOKEN=xxyyzz

  3. run the example hadoop command

    $ qds.py hadoopcmd run streaming -files 's3n://paid-qubole/HadoopAPIExamples/WordCountPython/mapper.py,s3n://paid-qubole/HadoopAPIExamples/WordCountPython/reducer.py' -mapper mapper.py -reducer reducer.py -numReduceTasks 1 -input 's3n://paid-qubole/default-datasets/gutenberg' -output 's3n://example.bucket.com/wcout'

  4. check the status of command # 12345678

    $ qds.py hivecmd check 12345678
    {"status": "done", … }
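The CLI examples above can also be driven from a Python script. The following is a minimal sketch, not part of the SDK: the helper names (hivecmd_run_args, run_hive_query, status_from_check_output) are hypothetical, the flags are assumed to be spelled --token, --query and --script_location, and the output of the check subcommand is assumed to be a single JSON object as in example 4. The sketch targets Python 3.

```python
import json
import subprocess


def hivecmd_run_args(query=None, script_location=None, token=None):
    """Build the argument list for `qds.py hivecmd run`.

    Exactly one of query or script_location must be given. When token is
    None, no --token flag is added (for example when the token is supplied
    through the QDS_API_TOKEN environment variable instead).
    """
    if (query is None) == (script_location is None):
        raise ValueError("pass exactly one of query or script_location")
    args = ["qds.py"]
    if token is not None:
        args += ["--token", token]
    args += ["hivecmd", "run"]
    if query is not None:
        args += ["--query", query]
    else:
        args += ["--script_location", script_location]
    return args


def run_hive_query(query, token=None):
    """Run a query synchronously and return whatever qds.py prints to stdout."""
    # Passing a list (not a shell string) avoids quoting problems in the query.
    return subprocess.check_output(hivecmd_run_args(query=query, token=token))


def status_from_check_output(output):
    """Return the "status" field from the output of `qds.py hivecmd check`.

    Assumes the output is a single JSON object, as the example above suggests.
    """
    if isinstance(output, bytes):
        output = output.decode("utf-8")
    return json.loads(output)["status"]
```

With qds.py on the path and QDS_API_TOKEN exported, calling run_hive_query("show tables") would correspond to the first command in example 1 without the --token flag.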

SDK API

An example Python application needs to do the following:

  1. Set the api_token:

    from qds_sdk.qubole import Qubole

    Qubole.configure(api_token='ksbdvcwdkjn123423')

  2. Use the Command classes defined in commands.py to execute commands. To run a Hive command:

    from qds_sdk.commands import *

    hc = HiveCommand.create(query='show tables')
    print("Id: %s, Status: %s" % (str(hc.id), hc.status))

example/mr_1.py contains a Hadoop Streaming example.
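An application that submits a command and checks its status later typically needs two small pieces around the SDK calls: a way to obtain the API token without hard-coding it, and a polling loop. The sketch below illustrates both under stated assumptions and is not part of qds_sdk. The function names are hypothetical, the set of terminal statuses is a guess (only "done" appears in this README), and fetch_status stands in for whatever call re-reads the command's status. The sketch targets Python 3.

```python
import os
import time

# Assumption: only "done" appears in this README; "error" and "cancelled"
# are guesses at the other terminal states and may need adjusting.
TERMINAL_STATUSES = ("done", "error", "cancelled")


def resolve_api_token(explicit=None, environ=None):
    """Return the explicit token if given, else QDS_API_TOKEN from the environment."""
    if explicit:
        return explicit
    environ = os.environ if environ is None else environ
    token = environ.get("QDS_API_TOKEN")
    if not token:
        raise RuntimeError("no API token: pass one or set QDS_API_TOKEN")
    return token


def wait_for_status(fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("command still %r after %s seconds" % (status, timeout))
        sleep(interval)
```

The resolved token would be passed to Qubole.configure(api_token=...) as in step 1. Keeping fetch_status as a plain callable means the loop can be exercised with a stub before it is wired to a real command.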
