# HiveQL Kernel

### Requirements

If you plan to connect using Kerberos, install these system packages first:

` sudo apt-get install python3-dev libsasl2-dev libsasl2-2 libsasl2-modules-gssapi-mit `

### Installation

To install the kernel:

` pip install --upgrade hiveqlKernel `

` jupyter hiveql install --user `

### Connection configuration

Two methods are available to connect to a Hive server:

  • Directly inside the notebook

  • Using a configuration file

If a configuration file is present, every new HiveQL kernel uses it; otherwise, you must configure the connection inside the notebook. If both are present, the configuration in the notebook overrides the one in the configuration file.
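The precedence rule above can be pictured as a dictionary merge. The following is a minimal Python sketch, not the kernel's actual code: the helper `effective_config` and the sample values are hypothetical, and it assumes the override applies key by key.

```python
def effective_config(file_conf, notebook_conf):
    """Return the settings in effect: notebook values override file values."""
    merged = dict(file_conf)       # start from the configuration file (may be empty)
    merged.update(notebook_conf)   # settings given in the notebook take precedence
    return merged


# Hypothetical values, for illustration only.
file_conf = {"url": "hive://alice@hive-host:10000/default", "default_limit": 20}
notebook_conf = {"default_limit": 50}

print(effective_config(file_conf, notebook_conf))
```

Here the URL still comes from the file, while `default_limit` takes the value set in the notebook.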

#### Configure directly in the notebook cells

In a notebook cell, paste the following, adjust the values to match your setup, and run it.

` $$ url=hive://<kerberos-username>@<hive-host>:<hive-port>/<db-name> `
` $$ connect_args={"auth": "KERBEROS", "kerberos_service_name": "hive", "configuration": {"tez.queue.name": "myqueue"}} `
` $$ pool_size=5 `
` $$ max_overflow=10 `

These arguments are passed to SQLAlchemy, for which PyHive registers the `hive` SQL back-end. See [github.com/dropbox/PyHive](https://github.com/dropbox/PyHive/#sqlalchemy).
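To make the mapping to SQLAlchemy more concrete, here is a hedged Python sketch of how `$$ key=value` lines could be turned into arguments for `sqlalchemy.create_engine`. The helpers `parse_magic_lines` and `split_engine_args`, the parsing rules, and the host and user names are all hypothetical; the kernel's real parser may behave differently.

```python
import json


def parse_magic_lines(cell_text):
    """Parse `$$ key=value` lines into a dict (hypothetical helper)."""
    settings = {}
    for line in cell_text.splitlines():
        line = line.strip()
        if not line.startswith("$$"):
            continue  # ignore anything that is not a configuration line
        key, _, raw = line[2:].strip().partition("=")
        raw = raw.strip()
        try:
            value = json.loads(raw)   # numbers and JSON objects
        except json.JSONDecodeError:
            value = raw               # plain strings such as the URL
        settings[key.strip()] = value
    return settings


def split_engine_args(settings):
    """Separate the URL from the keyword arguments meant for create_engine."""
    names = ("connect_args", "pool_size", "max_overflow")
    kwargs = {name: settings[name] for name in names if name in settings}
    return settings["url"], kwargs


# Placeholder connection values, for illustration only.
cell = """
$$ url=hive://alice@hive-host:10000/default
$$ connect_args={"auth": "KERBEROS", "kerberos_service_name": "hive"}
$$ pool_size=5
$$ max_overflow=10
"""
url, kwargs = split_engine_args(parse_magic_lines(cell))
print(url)
print(kwargs)

# With SQLAlchemy and PyHive installed, an engine could then be created with:
#   from sqlalchemy import create_engine
#   engine = create_engine(url, **kwargs)
```

The sketch stops short of opening a connection so that it runs without a Hive server; `connect_args`, `pool_size`, and `max_overflow` are standard `create_engine` keyword arguments.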

#### Configure using a configuration file

By default, the HiveQL kernel looks for the configuration file at `~/.hiveql_kernel.conf`. You can specify another path using `HIVE_KERNEL_CONF_FILE`.

The file must contain JSON like this:

` { "url": "hive://<kerberos-username>@<hive-host>:<hive-port>/<db-name>", "connect_args" : { "auth": "KERBEROS", "kerberos_service_name":"hive", "configuration": {"tez.queue.name": "myqueue"}}, "pool_size": 5, "max_overflow": 10, "default_limit": 20, "display_mode": "be" } `
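If you prefer to generate the file rather than write it by hand, a small Python script can do it. This is a sketch under stated assumptions: it treats `HIVE_KERNEL_CONF_FILE` as an environment variable, the helpers `conf_path`, `write_conf`, and `load_conf` are hypothetical, and the connection values are placeholders. The demonstration writes to a temporary directory so it does not touch your home directory.

```python
import json
import os
import tempfile


def conf_path():
    """Resolve the config path (assumes HIVE_KERNEL_CONF_FILE is an environment variable)."""
    default = os.path.expanduser("~/.hiveql_kernel.conf")
    return os.environ.get("HIVE_KERNEL_CONF_FILE", default)


def write_conf(path, conf):
    """Write the configuration as JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(conf, fh, indent=2)


def load_conf(path):
    """Return the parsed configuration, or an empty dict if the file is missing."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Placeholder values, for illustration only.
conf = {
    "url": "hive://alice@hive-host:10000/default",
    "connect_args": {"auth": "KERBEROS", "kerberos_service_name": "hive"},
    "pool_size": 5,
    "max_overflow": 10,
    "default_limit": 20,
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hiveql_kernel.conf")
    write_conf(path, conf)
    roundtrip = load_conf(path)

print(roundtrip == conf)
```

To write the real file, call `write_conf(conf_path(), conf)` with your own values.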

### Usage

Inside a HiveQL kernel, you can type HiveQL directly in the cells; the results are displayed as an HTML table.

You can also change other options, such as the default display limit (20 by default):

` $$ default_limit=50 `

Some Hive statements are extended to accept a pattern for filtering their results:

` SHOW TABLES <pattern> `

` SHOW DATABASES <pattern> `

### Run tests

` python -m pytest `

Have fun!


