Read and write pandas DataFrames to HBase.

======================
Pandas HBase IO Helper
======================

Persist pandas DataFrame objects to HBase and read them back later.

Known Issues:

- Works only with DataFrames that have integer indices.
- DataFrames to be persisted should not have ':' in column names.
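
Both known issues can be checked before a DataFrame is written. The following is a minimal sketch and not part of pdhbase: ``check_persistable`` is a hypothetical helper name, and the comment linking ':' to HBase's ``family:qualifier`` column addressing is an assumption about why the restriction exists.

```python
import pandas as pd


def check_persistable(df):
    """Return a list of problems matching the known issues (hypothetical helper)."""
    problems = []
    # Known issue 1: only integer indices are supported.
    # dtype.kind is 'i' for signed and 'u' for unsigned integers.
    if df.index.dtype.kind not in "iu":
        problems.append("index is not integer-typed: %s" % df.index.dtype)
    # Known issue 2: no ':' in column names. Assumption: this is because HBase
    # addresses columns as 'family:qualifier'.
    bad_columns = [c for c in df.columns if ":" in str(c)]
    if bad_columns:
        problems.append("column names contain ':': %r" % bad_columns)
    return problems


ok = pd.DataFrame({"a": [1.0, 2.0], "b": ["x", "y"]})
bad = pd.DataFrame({"cf:a": [1.0, 2.0]}, index=["r1", "r2"])
print(check_persistable(ok))   # default integer index, plain column names
print(check_persistable(bad))  # string index and a ':' in a column name
```

If the returned list is non-empty, one option is to call ``df.reset_index()`` and rename the offending columns before persisting.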

Writing DataFrame to HBase
--------------------------


Establish an HBase connection using happybase and write the DataFrame.

.. code-block:: python

    import happybase
    import numpy as np
    import pandas as pd
    import pdhbase as pdh

    connection = None
    try:
        # Connect to the HBase Thrift server.
        connection = happybase.Connection('127.0.0.1')
        connection.open()
        # Build a sample DataFrame: five float columns plus one string column.
        df = pd.DataFrame(np.random.randn(10, 5), columns=['a', 'b', 'c', 'd', 'e'])
        df['f'] = 'hello world'
        # Persist the DataFrame to 'sample_table' under the key 'df_key'.
        pdh.to_hbase(df, connection, 'sample_table', 'df_key', cf='cf')
    finally:
        if connection:
            connection.close()


Reading DataFrame from HBase
----------------------------


Establish an HBase connection using happybase and read the DataFrame.

.. code-block:: python

    import happybase
    import pdhbase as pdh

    connection = None
    try:
        # Connect to the HBase Thrift server.
        connection = happybase.Connection('127.0.0.1')
        connection.open()
        # Read back the DataFrame stored under 'df_key' in 'sample_table'.
        df = pdh.read_hbase(connection, 'sample_table', 'df_key', cf='cf')
        print(df)
    finally:
        if connection:
            connection.close()
