Package to pipe data from HBase to Spark (2+)

Project description

HBSpark

This package is meant to be an interface between HBase and Spark that moves data directly from the HBase Thrift API into Spark RDDs.
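To make the idea concrete, here is a minimal sketch of moving rows from an HBase Thrift scan into an RDD. It does not use or describe the hbspark API; it relies only on happybase's `Table.scan()` and pyspark's `SparkContext.parallelize()`. The helper names `decode_row` and `scan_to_rdd`, the UTF-8 decoding, and the host and table names in the usage comment are assumptions made for illustration.

```python
def decode_row(row_key, columns, encoding="utf-8"):
    """Turn one scan result (bytes key, dict of bytes) into a dict of str.

    Assumes keys and values are text in the given encoding; binary
    cell values would need their own handling.
    """
    record = {"row_key": row_key.decode(encoding)}
    for qualifier, value in columns.items():
        record[qualifier.decode(encoding)] = value.decode(encoding)
    return record


def scan_to_rdd(spark_context, table, **scan_kwargs):
    """Scan an HBase table over Thrift and hand the rows to Spark.

    `table` is expected to behave like a happybase Table: scan() yields
    (row_key, {b"family:qualifier": value}) tuples. All rows are
    collected on the driver before being parallelized, so this sketch
    only suits tables that fit in driver memory.
    """
    rows = [decode_row(key, columns) for key, columns in table.scan(**scan_kwargs)]
    return spark_context.parallelize(rows)


# Usage (needs a reachable HBase Thrift server and a Spark installation):
#   import happybase
#   from pyspark import SparkContext
#   connection = happybase.Connection("hbase-thrift-host")  # hypothetical host
#   rdd = scan_to_rdd(SparkContext.getOrCreate(), connection.table("my_table"))
```

Passing the Spark context and table in as arguments keeps the conversion logic independent of any particular cluster, which also makes it easy to exercise with stand-in objects.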

NOTE: THIS PACKAGE IS UNDER HEAVY DEVELOPMENT AND IS NOT MATURE BY ANY MEANS. BUGS AND CHANGES TO THE API SHOULD BE EXPECTED.

Development environment:

The current development environment is as follows:

  • python 3.6.9
  • happybase 1.2.0
  • pyspark 3.2.0

The target development environment is as follows:

  • python 2.7.5
  • happybase 1.2.0
  • (spark) 2.2.0.cloudera1

Currently, dependency requirements throughout the package may be inconsistent. If issues persist, please replicate the development environment described above.
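One way to check whether a local setup matches these versions is sketched below. It is not part of hbspark; the names `DEVELOPMENT_PINS`, `installed_version`, and `environment_mismatches` are hypothetical, and the pins are copied from the current development environment listed above (swap in the target versions if that is the environment being reproduced).

```python
# Versions copied from the "current development environment" list.
DEVELOPMENT_PINS = {"happybase": "1.2.0", "pyspark": "3.2.0"}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        # Older interpreters: fall back to pkg_resources (ships with setuptools).
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def environment_mismatches(pins=DEVELOPMENT_PINS, lookup=installed_version):
    """Map each package whose version differs to (expected, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in sorted(environment_mismatches().items()):
        print("{}: expected {}, found {}".format(name, wanted, found))
```

An empty result means the installed happybase and pyspark versions match the pins; a `found` value of `None` means the package is not installed at all.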

Packaging:

This package has been created following the https://packaging.python.org/tutorials/packaging-projects/ tutorial.

  • pyproject.toml:
    • Determines dependencies for pip
  • setup.cfg:
    • Static configuration for setuptools (package management)

Installation:

The package can be installed with pip:

pip install hbspark

Documentation

For usage documentation, please refer to the Read the Docs page, which includes an in-depth API reference.

Download files

Source Distribution

hbspark-0.0.4.tar.gz (10.5 kB)

Uploaded Source

Built Distribution

hbspark-0.0.4-py3-none-any.whl (8.3 kB)

Uploaded Python 3