
Python interface to iomete (Hive)

Project description

py-hive-iomete is a collection of Python DB-API and SQLAlchemy interfaces for iomete hive.

Usage

DB-API

from pyhive import hive

connection = hive.connect(
    host="<data_plane_host>",
    port=<data_plane_port>,
    scheme="http", # or "https"
    lakehouse="<lakehouse_cluster_name>",
    data_plane=None, # or data_plane (namespace)
    database="default",
    username="<username>",
    password="<password>"
)

cursor = connection.cursor()
cursor.execute("SELECT * FROM my_awesome_data LIMIT 10")

print(cursor.fetchone())
print(cursor.fetchall())
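
The rows returned by fetchone() and fetchall() are plain tuples. As a minimal sketch, assuming the cursor follows the standard DB-API (PEP 249) contract of exposing close() and a description attribute, rows can be paired with their column names as shown below. The helper name fetch_dicts is hypothetical and not part of py-hive-iomete.

```python
from contextlib import closing


def fetch_dicts(connection, sql):
    """Run `sql` and return every row as a {column_name: value} dict.

    Hypothetical helper: works with any PEP 249 connection, which is
    assumed (not verified here) to include the one from hive.connect().
    """
    # closing() calls cursor.close() even if execute() or fetchall() raises.
    with closing(connection.cursor()) as cursor:
        cursor.execute(sql)
        # PEP 249: description holds one sequence per column, name first.
        columns = [column[0] for column in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

With the connection from the example above, this could be called as fetch_dicts(connection, "SELECT * FROM my_awesome_data LIMIT 10").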

DB-API (asynchronous)

from pyhive import hive
from TCLIService.ttypes import TOperationState

connection = hive.connect(
    host="<data_plane_host>",
    port=<data_plane_port>,
    scheme="http", # or "https"
    lakehouse="<lakehouse_cluster_name>",
    data_plane=None, # or data_plane (namespace)
    database="default",
    username="<username>",
    password="<password>"
)

cursor = connection.cursor()

cursor.execute("SELECT * FROM my_awesome_data LIMIT 10", async_=True)

status = cursor.poll().operationState

while status in (TOperationState.INITIALIZED_STATE, TOperationState.RUNNING_STATE):
    logs = cursor.fetch_logs()
    for message in logs:
        print(message)

    # If needed, an asynchronous query can be cancelled at any time with:
    # cursor.cancel()

    status = cursor.poll().operationState

print(cursor.fetchall())
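
The polling loop above calls poll() back to back with no pause and no upper bound on how long it waits. The following is one possible sketch of the same loop with a sleep between polls and an optional timeout. It assumes Python 3 and relies only on the cursor methods already used above (poll, fetch_logs, cancel). The function name wait_for_query and its parameters are hypothetical, and the set of "still running" states is passed in by the caller rather than hard-coded.

```python
import time


def wait_for_query(cursor, running_states, poll_interval=1.0, timeout=None):
    """Poll `cursor` until its state leaves `running_states`.

    Returns the final operation state. Hypothetical helper, not part of
    py-hive-iomete; `running_states` would typically be
    (TOperationState.INITIALIZED_STATE, TOperationState.RUNNING_STATE).
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    status = cursor.poll().operationState
    while status in running_states:
        # Print any log lines produced since the previous poll.
        for message in cursor.fetch_logs():
            print(message)
        if deadline is not None and time.monotonic() >= deadline:
            # Give up: cancel the query before reporting the timeout.
            cursor.cancel()
            raise TimeoutError("query still running after %s seconds" % timeout)
        time.sleep(poll_interval)
        status = cursor.poll().operationState
    return status
```

The returned state can then be compared against the states defined in TOperationState before calling cursor.fetchall().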

SQLAlchemy

First install this package to register it with SQLAlchemy (see setup.py).

from sqlalchemy.engine import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.schema import MetaData, Table

# Possible dialects (hive and iomete operate identically):
# hive+http
# hive+https
# iomete+http
# iomete+https
engine = create_engine(
    'iomete+https://<username>:<password>@<data_plane_host>:<data_plane_port>/<database>?lakehouse=<lakehouse_cluster_name>')

# or with data_plane specified
# engine = create_engine(
#    'iomete+https://<username>:<password>@<data_plane_host>:<data_plane_port>/<database>?lakehouse=<lakehouse_cluster_name>&data_plane=<data_plane>')

# Alternatively, the "hive" dialect can be used as well
# engine = create_engine(
#    'hive+https://<username>:<password>@<data_plane_host>:<data_plane_port>/<database>?lakehouse=<lakehouse_cluster_name>')

session = sessionmaker(bind=engine)()
records = session.query(Table('my_awesome_data', MetaData(bind=engine), autoload=True)) \
    .limit(10) \
    .all()
print(records)

Note: query generation functionality is not exhaustive or fully tested, but there should be no problem with raw SQL.
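
The connection URLs in the SQLAlchemy example are written out by hand. Usernames or passwords that contain characters such as "@" or "/" need to be percent-encoded before they are placed in a URL, otherwise the URL is parsed incorrectly. The sketch below assembles a URL in the format shown above from its parts. The function name build_iomete_url and its signature are hypothetical; only the URL layout and the lakehouse and data_plane query parameters come from the example above.

```python
from urllib.parse import quote, urlencode


def build_iomete_url(username, password, host, port, database, lakehouse,
                     data_plane=None, dialect="iomete+https"):
    """Assemble a SQLAlchemy URL in the format shown in the example above.

    Hypothetical helper, not part of py-hive-iomete.
    """
    query = {"lakehouse": lakehouse}
    if data_plane is not None:
        query["data_plane"] = data_plane
    return "{dialect}://{user}:{password}@{host}:{port}/{database}?{query}".format(
        dialect=dialect,
        # safe="" also encodes "/" so credentials cannot break the URL.
        user=quote(username, safe=""),
        password=quote(password, safe=""),
        host=host,
        port=port,
        database=database,
        query=urlencode(query),
    )
```

The result can be passed directly to create_engine().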

Requirements

Install using

  • pip install 'py-hive-iomete' for the DB-API interface

  • pip install 'py-hive-iomete[sqlalchemy]' for the SQLAlchemy interface

py-hive-iomete works with

  • Python 2.7 / Python 3

Changelog

See https://github.com/iomete/py-hive-iomete/releases.

Contributing

  • Changes must come with tests, with the exception of trivial things like fixing comments. See .travis.yml for the test environment setup.

  • Notes on project scope:

    • This project is intended to be a minimal iomete (hive) client that does that one thing and nothing else. Features that can be implemented on top of py-hive-iomete, such as integration with your favorite data analysis library, are likely out of scope.

    • We prefer having a small number of generic features over a large number of specialized, inflexible features.

Updating TCLIService

The TCLIService module is autogenerated from a TCLIService.thrift file. To update it, run generate.py: python generate.py <TCLIServiceURL>. If the URL is omitted, the version for Hive 2.3 is downloaded.
