Data mining Group hbase utils

Project description

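HBase connection

A simple HBase utility library built on happybase. It supports connecting, creating tables, querying, and inserting data.

Project structure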

  • hbase
    • LICENSE.md
    • README.md
    • setup.py
    • src
      • __init__.py
      • conf_reader.py
      • hbase_client.py
      • reconection.py

Usage

Install / upgrade

# Install
pip install --index-url http://192.168.38.1:31410/bmai/pypi --trusted-host 192.168.38.1 bmai-dm-hbase

# Upgrade
pip install --index-url http://192.168.38.1:31410/bmai/pypi --trusted-host 192.168.38.1 bmai-dm-hbase --upgrade

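Configuration file template

HBase node information

The configuration file is optional. If you do not provide one, use the class in hbase_client.py directly and pass the connection details yourself.

pro: # production environment (ec), Thrift
  host: 192.168.xx.xx # no longer valid
  port: 9090

dev: # development environment, Thrift
  host: 192.168.xx.xx
  host_name: xx-xx-xx
  port: 9090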
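
The sketch below shows one way node information like this could be combined with explicitly passed arguments. It assumes the YAML has already been loaded into a plain dict (for example with PyYAML's `yaml.safe_load`). `resolve_node` and `DEFAULT_PORT` are hypothetical names used for illustration only; they are not taken from conf_reader.py, whose actual behaviour is not documented here.

```python
from typing import Optional, Tuple

DEFAULT_PORT = 9090  # the changelog states that port defaults to 9090


def resolve_node(config: Optional[dict], section: str,
                 host: Optional[str] = None,
                 port: Optional[int] = None) -> Tuple[str, int]:
    """Pick host and port, preferring explicit arguments over the config.

    Hypothetical helper: `config` is the parsed configuration file (or None
    when no file is used) and `section` is a top-level key such as 'dev'.
    """
    node = (config or {}).get(section) or {}
    resolved_host = host or node.get("host")
    if not resolved_host:
        raise ValueError(
            f"no host given and none configured for section {section!r}"
        )
    resolved_port = port or node.get("port") or DEFAULT_PORT
    return resolved_host, int(resolved_port)


# Dict mirroring the template above (hosts are placeholders).
example_config = {
    "pro": {"host": "192.168.xx.xx", "port": 9090},
    "dev": {"host": "192.168.xx.xx", "host_name": "xx-xx-xx", "port": 9090},
}

# With a config file: values come from the 'dev' section.
print(resolve_node(example_config, "dev"))
# Without a config file: the caller passes the host (placeholder address here).
print(resolve_node(None, "dev", host="10.0.0.1"))
```

Letting explicit arguments win over the file keeps the configuration optional, which matches the note above that the class can be used without one.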

Example

import pandas as pd
from dm_hbase.hbase_client import HBaseClient

connection = HBaseClient(
    host='xxx.xxx.xx.xx',
    port=9090,
    env='test'    # added 2022-11-15; env accepts 'test' or 'prd' and defaults to 'test'
)
connection.build_pool()
# List all tables in the current database
connection.show_tables()
# Scan a table
connection.scan_tables(
    table_name='xxx:xxxx',
    limit=10
)
# List column families
connection.get_families('xxx:xxxx')
# List regions
connection.get_regions('xxx:xxx')
# Insert data (row_key and value are placeholders for your own data)
connection.insert(
    table_name='xxx:xxx',
    datas={row_key: {'column_family:feature': value}},
    batch_size=1000
)
# Insert data from a DataFrame
df = pd.read_csv('xxx.csv')
connection.insert_df(
    table_name='xxx:xxx',
    df=df,
    rowkeys_col='xxx',
    batch_size=1000
)
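
The `batch_size` argument, together with the bytes conversion mentioned in the changelog, suggests that rows are encoded and written in chunks. The following is a minimal sketch of that idea using only the standard library. `to_bytes`, `rows_to_datas`, and `iter_batches` are hypothetical helpers, and the `family` parameter is an assumption; none of this is the package's actual implementation. The `records` list stands in for what pandas' `df.to_dict(orient="records")` would return.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def to_bytes(value: Any) -> bytes:
    """Return bytes unchanged; encode anything else via its string form."""
    if isinstance(value, bytes):
        return value
    return str(value).encode("utf-8")


def rows_to_datas(records: Iterable[Dict[str, Any]], rowkeys_col: str,
                  family: str) -> Dict[Any, Dict[str, Any]]:
    """Turn row records into {row_key: {'family:column': value}}.

    `family` is an assumption: the column family every column is written to.
    """
    datas = {}
    for record in records:
        datas[record[rowkeys_col]] = {
            f"{family}:{col}": val
            for col, val in record.items()
            if col != rowkeys_col
        }
    return datas


def iter_batches(
    datas: Dict[Any, Dict[str, Any]], batch_size: int
) -> Iterator[List[Tuple[bytes, Dict[bytes, bytes]]]]:
    """Yield lists of at most `batch_size` (row_key, columns) pairs as bytes."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    items = iter(datas.items())
    while True:
        chunk = list(islice(items, batch_size))
        if not chunk:
            return
        yield [
            (to_bytes(key), {to_bytes(c): to_bytes(v) for c, v in cols.items()})
            for key, cols in chunk
        ]


# Stand-in for df.to_dict(orient="records"); values are made up for the demo.
records = [
    {"id": "r1", "age": 30},
    {"id": "r2", "age": 41},
    {"id": "r3", "age": 25},
]
datas = rows_to_datas(records, rowkeys_col="id", family="cf")
batches = list(iter_batches(datas, batch_size=2))
print([len(batch) for batch in batches])  # [2, 1]
```

With happybase, each yielded batch could then be written inside a `table.batch()` context by calling `put(row_key, columns)` for every pair, so that a large insert is sent to the Thrift server in several smaller requests.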

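Changelog

2022-4-21

  1. Packaged and published to the private PyPI index

2022-4-22

  1. Fixed a bug that made the package unusable after installation

2022-5-5

  1. Added a configuration file path parameter to the constructor and updated the related code
  2. Improved the constructor logic
  3. Added simple test cases
  4. Cleaned up code formatting

2022-5-13

  1. Improved the constructor: port now defaults to 9090, and a configuration file warning was added

2022-9-8

  1. Adjusted the default connection pool parameters for compatibility with HBase 2.0 connections

2022-10-10

  1. Improved the insert function: added a bytes type check and conversion
  2. Improved the insert_df function

2022-11-15

  1. Improved __init__ and added an environment check

2022-11-17

  1. Adjusted the thrift and thrift-sasl version dependencies

2023-02-11

  1. Adjusted the prd parameters

2023-02-13

  1. Removed the thriftpy dependency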

Download files

Source Distribution

bmai-dm-hbase-0.2.2.linux-x86_64.tar.gz (10.2 kB view details)

Uploaded Source

Built Distribution

bmai_dm_hbase-0.2.2-py3-none-any.whl (8.4 kB view details)

Uploaded Python 3

File details

Details for the file bmai-dm-hbase-0.2.2.linux-x86_64.tar.gz.

File hashes

Hashes for bmai-dm-hbase-0.2.2.linux-x86_64.tar.gz
Algorithm Hash digest
SHA256 5dedda2efbbd2eee38e84ba8822f6a1d659f6d9a564cd9bb8e8938790f6ee36d
MD5 f942f43dd295564a9b86c8f3c6d8442f
BLAKE2b-256 3a718ca8d87ab9e17a3d7527a80b99a2d38c17895e294e2ea5afb5584c10f3ab

File details

Details for the file bmai_dm_hbase-0.2.2-py3-none-any.whl.

File metadata

  • Download URL: bmai_dm_hbase-0.2.2-py3-none-any.whl
  • Upload date:
  • Size: 8.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.7.12

File hashes

Hashes for bmai_dm_hbase-0.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 398be9bc680a597f8a074adc7e550bcafd5bc7aeb89527e8feec3c891efed0ea
MD5 edba000e0a5b290b10c26a09ee34f620
BLAKE2b-256 d62de329c21472de8abd4fbe969dbe3a9782a83671529082f4e46c5aed7a2ae7
