Python SDK for coding to the Qubole Data Service API

A Python module that provides the tools you need to authenticate with and use the Qubole Data Service API.

Installation

From PyPI

The SDK is available on PyPI.

$ pip install qds-sdk

From source

  • Get the source code:

  • Run the following command (you may need to run it as root):

    $ python setup.py install
  • Alternatively, if you use virtualenv, you can do this:

    $ cd qds-sdk-py
    $ virtualenv venv
    $ source venv/bin/activate
    $ python setup.py install

This should place a command-line utility, qds.py, somewhere in your path:

$ which qds.py
/usr/bin/qds.py
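
The same check can be done from Python, where the standard library's shutil.which performs the PATH lookup that `which` does in the shell. This is a minimal sketch; the helper name find_qds_cli is hypothetical and not part of the SDK.

```python
import shutil


def find_qds_cli(name="qds.py"):
    """Return the full path of the CLI if it is on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_qds_cli()
    print(path if path else "qds.py not found on PATH; check the install step")
```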

CLI

qds.py allows running Hive, Hadoop, Pig, Presto and Shell commands against QDS. Users can run commands synchronously, or submit a command and check its status later.

$ qds.py -h  # will print detailed usage

Examples:

  1. Run a Hive query and print the results:

    $ qds.py --token 'xxyyzz' hivecmd run --query "show tables"
    $ qds.py --token 'xxyyzz' hivecmd run --script_location /tmp/myquery
    $ qds.py --token 'xxyyzz' hivecmd run --script_location s3://my-qubole-location/myquery
  2. Pass in the API token from a bash environment variable:

    $ export QDS_API_TOKEN=xxyyzz
  3. Run the example Hadoop command:

    $ qds.py hadoopcmd run streaming -files 's3n://paid-qubole/HadoopAPIExamples/WordCountPython/mapper.py,s3n://paid-qubole/HadoopAPIExamples/WordCountPython/reducer.py' -mapper mapper.py -reducer reducer.py -numReduceTasks 1 -input 's3n://paid-qubole/default-datasets/gutenberg' -output 's3n://example.bucket.com/wcout'
  4. Check the status of command #12345678:

    $ qds.py hivecmd check 12345678
    {"status": "done", ... }
  5. If your API URL is something other than api.qubole.com, pass it on the command line with --url or set it as an environment variable:

    $ qds.py --token 'xxyyzz' --url https://<env>.qubole.com/api hivecmd ...
    
    or
    
    $ export QDS_API_URL=https://<env>.qubole.com/api
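
The token and URL lookup described in items 2 and 5 can be sketched in Python as follows. The helper resolve_qds_settings is hypothetical, and two things here are assumptions rather than documented behaviour: that an explicitly passed value takes precedence over the environment variable, and that the default URL is https://api.qubole.com/api (the text above only names the host).

```python
import os

# Assumed default; the text above only names the host api.qubole.com.
DEFAULT_API_URL = "https://api.qubole.com/api"


def resolve_qds_settings(token=None, url=None, env=None):
    """Pick the API token and URL from explicit values or the environment.

    Explicit arguments win over QDS_API_TOKEN / QDS_API_URL, which is an
    assumption about precedence rather than documented behaviour.
    """
    env = os.environ if env is None else env
    token = token or env.get("QDS_API_TOKEN")
    url = url or env.get("QDS_API_URL") or DEFAULT_API_URL
    if not token:
        raise ValueError("no API token: pass one or set QDS_API_TOKEN")
    return {"api_token": token, "api_url": url}
```

The returned dictionary uses the keyword names api_token and api_url, so it could be handed to the SDK with Qubole.configure(**resolve_qds_settings()).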

SDK API

An example Python application needs to do the following:

  1. Set the api_token, and the api_url if it is something other than api.qubole.com:

    from qds_sdk.qubole import Qubole
    
    Qubole.configure(api_token='ksbdvcwdkjn123423')
    
    # or
    
    Qubole.configure(api_token='ksbdvcwdkjn123423', api_url='https://<env>.qubole.com/api')
  2. Use the Command classes defined in commands.py to execute commands. To run a Hive command:

    from qds_sdk.commands import HiveCommand
    
    # Submit the query, then print the id and status of the resulting command
    hc = HiveCommand.create(query='show tables')
    print("Id: %s, Status: %s" % (hc.id, hc.status))
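
A submitted command usually needs to be checked again until it finishes. The following is a minimal polling sketch that does not depend on the SDK: it takes any callable that returns the current status string. The helper name wait_for_command is hypothetical, and the set of terminal statuses is an assumption (the CLI example above only shows "done").

```python
import time

# Terminal states are an assumption; the CLI example above only shows "done".
TERMINAL_STATUSES = ("done", "error", "cancelled")


def wait_for_command(fetch_status, poll_interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status or time runs out."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout:
            raise TimeoutError("command still %r after %.0f seconds" % (status, waited))
        sleep(poll_interval)
        waited += poll_interval
```

With the SDK, fetch_status could be something like lambda: HiveCommand.find(hc.id).status, assuming find re-fetches a command by its id. The sleep parameter is injectable so the loop can be exercised without real delays.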

example/mr_1.py contains a Hadoop Streaming example.

Reporting Bugs and Contributing Code

  • Want to report a bug or request a feature? Please open an issue.

  • Want to contribute? Fork the project and create a pull request with your changes against the unreleased branch.

Where are the maintainers?

Qubole was acquired. All the maintainers of this repo have moved on. Some of the employees founded ClearFeed. Others work on big data teams at Microsoft, Amazon, and elsewhere.
