Python Driver for Apache Drill.
Project description
pydrill
Apache Drill is a schema-free SQL query engine for Hadoop, NoSQL and cloud storage.
Free software: MIT license
Documentation: https://pydrill.readthedocs.org.
Features
Python 2/3 compatibility
Mapping of query results to native Python types
Compatibility with pandas DataFrames
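To make the result-mapping feature concrete, here is a minimal sketch of the idea. It is not pydrill's actual implementation. It assumes that result cells arrive as strings and converts them to Python types where they parse cleanly. The helper names `coerce_value` and `coerce_rows` are hypothetical.

```python
def coerce_value(value):
    """Convert a string cell to bool, int, or float when it parses cleanly."""
    # Hypothetical helper for illustration; not part of pydrill.
    if not isinstance(value, str):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    # Try the narrower numeric type first, then fall back to float.
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            continue
    # Anything else (dates, free text) is left as a string.
    return value


def coerce_rows(rows):
    """Apply coerce_value to every cell of a list of row dicts."""
    return [{key: coerce_value(val) for key, val in row.items()} for row in rows]


rows = coerce_rows([{"stars": "4", "score": "3.5", "date": "2012-01-01"}])
```

A list of dicts in this shape can also be passed directly to `pandas.DataFrame(rows)` if pandas is installed.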
Installation
pip install git+https://github.com/PythonicNinja/pydrill.git
Sample usage
from pydrill.client import PyDrill
from pydrill.exceptions import ImproperlyConfigured

drill = PyDrill(host='localhost', port=8047)

# Fail fast if the Drill server is not reachable.
if not drill.is_active():
    raise ImproperlyConfigured('Please run Drill first')

yelp_reviews = drill.query('''
  SELECT * FROM
  `dfs.root`.`./Users/macbookair/Downloads/yelp_dataset_challenge_academic_dataset/yelp_academic_dataset_review.json`
  LIMIT 5
''')

# Each result row behaves like a dict keyed by column name.
for result in yelp_reviews:
    print("%s: %s" % (result['type'], result['date']))

# pandas dataframe
df = yelp_reviews.to_dataframe()
print(df[df['stars'] > 3])
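Port 8047 in the sample is Drill's web/REST port. For readers curious what a query looks like at that level, the following is a rough sketch using only the standard library. It assumes Drill's REST endpoint `/query.json`, which accepts a JSON body with `queryType` and `query` fields and returns a JSON object containing a `rows` list. The function names `build_query_request` and `rows_from_response` are hypothetical and are not part of pydrill.

```python
import json
from urllib.request import Request


def build_query_request(sql, host="localhost", port=8047):
    """Build (but do not send) a POST request for Drill's REST query endpoint."""
    # Assumption: Drill's web server listens on host:port and accepts
    # JSON query submissions at /query.json.
    url = "http://%s:%d/query.json" % (host, port)
    payload = {"queryType": "SQL", "query": sql}
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rows_from_response(body):
    """Extract the list of row dicts from a JSON response body (a string)."""
    return json.loads(body).get("rows", [])


request = build_query_request("SELECT 1")
```

Sending the request with `urllib.request.urlopen(request)` requires a running Drill instance. In normal use the `PyDrill` client shown above is the more convenient route.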
History
0.0.2 (2016-04-24)
First release on PyPI.
Implementation of metrics/storage/options/stats.
Builds are tested against a Docker container running Apache Drill.
Support for pandas via ResultQuery.to_dataframe.
0.0.1 (2015-12-28)
Project start
Download files
Source Distribution
pydrill-0.0.2.tar.gz (20.9 kB)
Built Distribution
Hashes for pydrill-0.0.2-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 69615bddd6fe8c9720b04136c24c4692f36c713667f002bc4517865964ae55dd
MD5 | deec580fc2cb4464cafceb5e1a6092b0
BLAKE2b-256 | 3d932aa3ce2353b73ec181584396e8741b9615f992890e0a0e5edeca199ac4a3