Python wrapper for Apache Drill's REST API

Project description

drillpy is a Python wrapper for Apache Drill’s REST API. It lets you query a running Drill cluster or drillbit instance and load the results directly into Python. It is built on top of requests, numpy and pandas.

Installation

pip install drillpy

Usage

drillpy follows the Python Database API Specification v2.0, so its usage is similar to that of other DB-API modules, such as the built-in sqlite3 module from CPython’s Standard Library.

As with sqlite3, you should start by creating a Connection object, using drillpy.connect():

from drillpy import connect

con = connect(host="some_drillbit_host", db="some_database_managed_by_drill", port=8047)

Once the connection is created, create a Cursor from it:

cur = con.cursor()

Now you can use this cursor to write SQL queries against your Drill cluster. Parameter substitution is handled by question marks ? (as in sqlite3):

query = cur.execute("SELECT * FROM mytable WHERE somecolumn > ? AND someothercolumn < ? LIMIT 10", (value, other_value))
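
Since the text compares the question-mark style to sqlite3, the following minimal sketch shows the same placeholder mechanics with an in-memory sqlite3 database standing in for Drill. The table contents and the values bound to value and other_value are made up for illustration, and sqlite3 returns plain tuples rather than pandas objects.

```python
import sqlite3

# sqlite3 stands in for a Drill cluster here; the rows below are made-up sample data.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE mytable (somecolumn INTEGER, someothercolumn INTEGER)")
cur.executemany("INSERT INTO mytable VALUES (?, ?)", [(1, 5), (3, 2), (7, 1), (9, 8)])

# Each ? is bound, in order, to one item of the parameter tuple.
value, other_value = 2, 5
query = cur.execute(
    "SELECT * FROM mytable "
    "WHERE somecolumn > ? AND someothercolumn < ? "
    "ORDER BY somecolumn LIMIT 10",
    (value, other_value),
)
rows = query.fetchall()
con.close()
```

Passing the values as a separate tuple, instead of formatting them into the SQL string, keeps quoting and escaping out of your own code.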

Results are returned as a pandas DataFrame, with NaN for missing values. Column types are inferred automatically. You can retrieve results with fetchone(), fetchmany(size) and fetchall(). fetchone() returns a pandas Series rather than a DataFrame:

returned_df = query.fetchall()
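
In the DB-API 2.0 specification, the three fetch methods consume the result set in order: each call picks up where the previous one stopped. The sketch below illustrates that with sqlite3 as a stand-in, on the assumption that drillpy follows the same convention; the table t is hypothetical, and sqlite3 yields tuples and lists where drillpy is described as returning a Series or DataFrame.

```python
import sqlite3

# In-memory stand-in database with five made-up rows: 0, 1, 2, 3, 4.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE t (n INTEGER)")
cur.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

query = cur.execute("SELECT n FROM t ORDER BY n")
first = query.fetchone()    # the next single row
batch = query.fetchmany(2)  # the next two rows
rest = query.fetchall()     # everything that is left
done = query.fetchone()     # result set exhausted: sqlite3 returns None
con.close()
```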

Keep in mind that drillpy cannot insert new data into your tables or databases: Drill itself is a query engine, meant for exploratory data analysis and for BI/visualization tools.

Once finished, you should call Connection.close():

con.close()
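
To make sure the connection is closed even when a query raises, one option is contextlib.closing from the standard library, which calls close() on exit. The helper name fetch_all below is hypothetical, and the sketch is exercised with sqlite3 because it needs no server; it only assumes the connection offers cursor() and close(), and that execute() returns an object with fetchall(), as in the usage shown above.

```python
import sqlite3
from contextlib import closing


def fetch_all(con, sql, params=()):
    """Run one query on a DB-API style connection, then close the connection."""
    # closing() calls con.close() on exit, whether or not the query raised.
    with closing(con):
        cur = con.cursor()
        return cur.execute(sql, params).fetchall()


# sqlite3 stands in for a drillpy connection in this sketch.
rows = fetch_all(sqlite3.connect(":memory:"), "SELECT ? + ?", (1, 2))
```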

Download files

Files for drillpy, version 0.2.0

Filename: drillpy-0.2.0-py2.py3-none-any.whl (5.9 kB)
File type: Wheel
Python version: py2.py3
