Client-less data retrieval from Hive.

Project description

hivehoney

Extract data from a remote Hive to a local Windows machine (no Hadoop client required).

The most difficult part was figuring out expect+pbrun.

Because there are two interactive prompts, I had to pause after sending the password.

More expect+pbrun details are here: https://github.com/hive-scripts/hivehoney/blob/master/expect_pbrun_howto.md
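The "wait for a prompt, answer, pause" pattern can be sketched in plain Python. This is a minimal illustration, not the code hivehoney ships: the function names `expect` and `answer_prompts`, the prompt patterns, and the pause length are all hypothetical. The channel can be any object with `recv_ready()`, `recv(n)` and `send(data)` methods.

```python
import re
import time


def expect(channel, pattern, timeout=30.0, poll=0.1):
    """Read from the channel until the regex `pattern` appears in the output."""
    deadline = time.monotonic() + timeout
    buffer = ""
    while time.monotonic() < deadline:
        if channel.recv_ready():
            buffer += channel.recv(4096).decode("utf-8", errors="replace")
            if re.search(pattern, buffer):
                return buffer
        else:
            time.sleep(poll)
    raise TimeoutError("prompt %r not seen, got %r" % (pattern, buffer))


def answer_prompts(channel, steps, timeout=30.0):
    """Answer prompts in order.

    `steps` is a list of (prompt_regex, reply, pause_seconds) tuples.
    The pause gives the remote side time to print its next question
    before we start looking for it.
    """
    for pattern, reply, pause in steps:
        expect(channel, pattern, timeout=timeout)
        channel.send(reply + "\n")
        if pause:
            time.sleep(pause)
```

A call might look like `answer_prompts(channel, [(r"[Pp]assword", pwd, 2.0), (r"second prompt", answer, 0.0)])`, where both patterns and the 2-second pause are placeholders to be replaced with whatever your site's pbrun actually prints.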

Data access path.

Windows desktop->
               SSH->
                  Linux login->
                       pbrun service login->
                                           kinit
                                           beeline->
                                                   SQL->
                                                       save echoed output on Windows
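For the beeline step of this path, the command line could be assembled roughly as follows. This is only a sketch: `beeline_command` is a hypothetical helper, the JDBC URL is a placeholder, and the flags hivehoney actually passes may differ. The options used (`-u`, `-f`, `--silent`, `--showHeader`, `--outputformat=csv2`) are standard beeline options.

```python
import shlex


def beeline_command(jdbc_url, query_file):
    """Build a shell-safe beeline command that prints query results as CSV."""
    args = [
        "beeline",
        "-u", jdbc_url,            # HiveServer2 JDBC URL (placeholder value)
        "--silent=true",           # suppress log chatter so stdout is data only
        "--showHeader=true",       # keep the column names as the first row
        "--outputformat=csv2",     # comma-separated output
        "-f", query_file,          # SQL file to run
    ]
    # Quote each argument so characters such as ';' in the URL survive the shell.
    return " ".join(shlex.quote(a) for a in args)
```

Because the output is written to stdout, whatever reads the SSH session on the Windows side can save it directly to a local file.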
                                

Run it like this:

set PROXY_HOST=your_bastion_host

set SERVICE_USER=your_func_user

set LINUX_USER=your_SOID

set LINUX_PWD=your_pwd

python hh.py --query_file=query.sql

query.sql

select * from gfocnnsg_work.pytest LIMIT 1000000;
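The four environment variables and the `--query_file` option shown above could be collected like this. It is a hedged sketch of the idea, not the contents of `hh.py`; `load_settings` and `REQUIRED_VARS` are hypothetical names.

```python
import argparse
import os

# Variable names taken from the "Run it like this" section.
REQUIRED_VARS = ("PROXY_HOST", "SERVICE_USER", "LINUX_USER", "LINUX_PWD")


def load_settings(argv, environ=os.environ):
    """Return a dict with the connection settings and the query file path."""
    parser = argparse.ArgumentParser(prog="hh.py")
    parser.add_argument("--query_file", required=True,
                        help="path to a file containing the SQL to run")
    args = parser.parse_args(argv)

    # Fail early with a clear message if any variable is unset or empty.
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))

    settings = {name: environ[name] for name in REQUIRED_VARS}
    settings["query_file"] = args.query_file
    return settings
```

Checking the variables before opening any connection avoids a failed login halfway down the SSH and pbrun chain.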

Result:

  TOTAL BYTES:    60000127

  Elaplsed: 79.637 s

  exit status:  0

  0

  []

  TOTAL Elaplsed: 99.060 s

data_dump.csv

  c:\tmp>dir data_dump.csv



  Directory of c:\tmp

  09/04/2018  12:53 PM        60,000,075 data_dump.csv

                 1 File(s)     60,000,075 bytes

                 0 Dir(s)     321,822,720 bytes free
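Saving the echoed rows locally while tracking a byte total and elapsed time, as in the report above, can be sketched as a simple chunked copy. `save_stream` is a hypothetical helper and is not taken from hivehoney; `source` is assumed to be any binary file-like object with a `read(n)` method.

```python
import time


def save_stream(source, path, chunk_size=64 * 1024):
    """Copy `source` to `path` in chunks; return (total_bytes, elapsed_seconds)."""
    total = 0
    started = time.perf_counter()
    with open(path, "wb") as out:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:          # empty read means the stream is finished
                break
            out.write(chunk)
            total += len(chunk)
    return total, time.perf_counter() - started
```

Writing chunk by chunk keeps memory use flat regardless of how many rows the query returns.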

Download files

Files for hivehoney, version 1.0.4
  Filename                               Size     File type   Python version
  hivehoney-1.0.4-py2.py3-none-any.whl   9.3 kB   Wheel       py2.py3
  hivehoney-1.0.4.tar.gz                 7.0 kB   Source      None
