Python driver with native interface for ByteHouse
ByteHouse Python Driver
Introduction
ByteHouse provides a Python driver that implements the Python Database API Specification v2.0. It can be used with most client tools, applications, and BI tools that accept a Python driver following DB API 2.0. The driver connects to ByteHouse over the native TCP protocol.
Requirements
Python v3.6 or higher
Installation from PyPI
The latest release can be installed from PyPI:
pip install bytehouse-driver
Installation from GitHub
The current development version can be installed from GitHub:
pip install git+https://github.com/bytehouse-cloud/driver-py@master#egg=bytehouse-driver
Creating ByteHouse Account
You need a ByteHouse account to use the Python driver. You can create a free account by following the process described in our official documentation: https://docs.bytehouse.cloud/en/docs/quick-start
You can also create a ByteHouse account through Volcano Engine by ByteDance: https://www.volcengine.com/product/bytehouse-cloud
ByteHouse Regions
Currently, the driver supports the following region names across different cloud providers. Alternatively, if you know the host address of the ByteHouse server, you can use the host address directly and omit the region name.
Region Name | Target Server |
---|---|
AP-SOUTHEAST-1 | gateway.aws-ap-southeast-1.bytehouse.cloud:19000 |
VOLCANO-CN-NORTH-1 | bytehouse-cn-beijing.volces.com:19000 |
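The region table above can be mirrored in a small lookup, which may be handy when a tool needs an explicit host and port instead of a region name. This is only a sketch: `REGION_TARGETS` and `resolve_region` are hypothetical names, not part of the driver, the mapping simply copies the two rows listed above, and the case-insensitive match is an illustrative choice.

```python
# Hypothetical helper; the mapping copies the region table above.
REGION_TARGETS = {
    "AP-SOUTHEAST-1": ("gateway.aws-ap-southeast-1.bytehouse.cloud", 19000),
    "VOLCANO-CN-NORTH-1": ("bytehouse-cn-beijing.volces.com", 19000),
}


def resolve_region(region):
    """Return (host, port) for a region name, ignoring letter case."""
    try:
        return REGION_TARGETS[region.upper()]
    except KeyError:
        raise ValueError("unsupported region: {!r}".format(region)) from None


host, port = resolve_region("ap-southeast-1")
print(host, port)
```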
URI format for Connection & Authentication
Region & Password Format
Required parameters: region, account, user, password
'bytehouse:///?region={}&account={}&user={}&password={}'.format(REGION, ACCOUNT, USER, PASSWORD)
Host Address & Password Format
Required parameters: host, port, account, user, password
'bytehouse://{}:{}/?account={}&user={}&password={}'.format(HOST, PORT, ACCOUNT, USER, PASSWORD)
For API Key authentication, user is always 'bytehouse'
Region & API Key Format
Required parameters: region, password
'bytehouse:///?region={}&user=bytehouse&password={}'.format(REGION, API_KEY)
Host Address & API Key Format
Required parameters: host, port, password
'bytehouse://{}:{}/?user=bytehouse&password={}'.format(HOST, PORT, API_KEY)
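Passwords and API keys often contain characters such as & or = that would break a query string assembled with plain str.format. The sketch below builds the same four URI shapes with urllib.parse.urlencode so that values are percent-encoded. `build_uri` is a hypothetical helper, not a driver function; the default port 19000 is taken from the region table, and the sketch assumes the driver decodes percent-encoded query values, which is worth verifying against your driver version.

```python
from urllib.parse import urlencode


def build_uri(user="bytehouse", password="", account=None,
              region=None, host=None, port=19000):
    """Build a bytehouse:// URI; hypothetical helper, not part of the driver."""
    if (region is None) == (host is None):
        raise ValueError("pass exactly one of region or host")
    params = {}
    if region is not None:
        params["region"] = region
    if account is not None:
        params["account"] = account
    params["user"] = user
    params["password"] = password
    # Region-based URIs have an empty network location (bytehouse:///?...).
    netloc = "" if host is None else "{}:{}".format(host, port)
    return "bytehouse://{}/?{}".format(netloc, urlencode(params))


# Region & password format, with a password that needs escaping.
print(build_uri(region="AP-SOUTHEAST-1", account="acme",
                user="alice", password="p&ss=word"))
# Host address & API key format (user defaults to 'bytehouse').
print(build_uri(host="example.com", port=19000, password="KEY"))
```

The account, user, and host values here are placeholders for illustration only.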
Virtual warehouse & Role Management
Connection initialization with ByteHouse always assumes a default virtual warehouse and an active role, so these values cannot be empty. Before using the driver, set or verify them at https://console.bytehouse.cloud/account/details
Constructing Client Object
Passing parameters
from bytehouse_driver import Client
client = Client(
    region=REGION,
    account=ACCOUNT,
    user=USER,
    password=PASSWORD
)
From URI
from bytehouse_driver import Client
client = Client.from_url(
    'bytehouse:///?region={}&account={}&user={}&password={}'.format(
        REGION, ACCOUNT, USER, PASSWORD
    )
)
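A malformed query string (for example, a parameter missing its = sign) may only surface later as an authentication error. As a rough pre-flight check, the URI can be parsed with the standard library before it is handed to Client.from_url. `missing_params` is a hypothetical helper; the default required names follow the region and password format listed earlier, and for API key URIs the caller would pass `required=("user", "password")`.

```python
from urllib.parse import parse_qs, urlsplit


def missing_params(uri, required=("account", "user", "password")):
    """Return the required query parameters that are absent from the URI."""
    parts = urlsplit(uri)
    query = parse_qs(parts.query)
    needed = list(required)
    if not parts.hostname:
        # Without an explicit host, the region name must be supplied.
        needed.insert(0, "region")
    return [name for name in needed if name not in query]


good = "bytehouse:///?region=R&account=A&user=U&password=P"
bad = "bytehouse:///?region=R&account=A&user=U&passwordP"
print(missing_params(good))
print(missing_params(bad))
```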
Performing SQL queries
from bytehouse_driver import Client
client = Client(
    region=REGION,
    account=ACCOUNT,
    user=USER,
    password=PASSWORD
)
# DDL Query
client.execute("CREATE DATABASE demo_db")
client.execute("CREATE TABLE demo_db.demo_tb (id INT) ENGINE=CnchMergeTree() ORDER BY tuple()")
# DML Query
client.execute("INSERT INTO demo_db.demo_tb VALUES", [[1], [2], [3]])
# DQL Query
result_set = client.execute("SELECT * FROM demo_db.demo_tb")
for result in result_set:
    print(result)
client.execute("DROP DATABASE demo_db")
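For larger loads, it can help to send rows in fixed-size batches rather than in one very large list. The generator below is a plain-Python sketch: `chunked` and the batch size are illustrative choices, not driver features, and each batch would be passed to client.execute in the same way as the INSERT call above.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


rows = ([i] for i in range(7))
for batch in chunked(rows, 3):
    # With a live client:
    # client.execute("INSERT INTO demo_db.demo_tb VALUES", batch)
    print(batch)
```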
Supported Datatypes
ByteHouse type | Python type for INSERT | Python type for SELECT |
---|---|---|
Integer family (UInt8/UInt16/UInt32/UInt64, Int8/Int16/Int32/Int64) | int, long | int |
Float family (Float32/Float64) | float, int, long | float |
String | str, bytes | str, bytes |
FixedString | str, bytes | str, bytes |
Nullable | None, T | None, T |
Date | date, datetime | date |
DateTime | datetime, int, long | datetime |
Array | list, tuple | list |
Enum family | Enum, int, long, str | str |
Decimal | Decimal, float, int, long | Decimal |
IP family | IPv4Address, IPv6Address, int, long, str | IPv4Address, IPv6Address |
Map | dict | dict |
LowCardinality | T | T |
UUID | UUID, str | UUID |
Setting types_check=True
The default value of types_check is False, for performance. When it is set to True, the driver explicitly checks and transforms values before sending the data to the server. Setting it to True is recommended for float, decimal, or any other type where raw data needs to be transformed into the appropriate type.
Integer family
Int8
Int16
Int32
Int64
UInt8
UInt16
UInt32
UInt64
client.execute("CREATE TABLE demo_db.demo_tb (a Int8, b Int16, c Int32, d Int64, e UInt8, f UInt16, g UInt32, h UInt64) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [
    (-10, -300, -123581321, -123581321345589144, 10, 300, 123581321, 123581321345589144)
]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data)
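Each integer type has a fixed range, and a value outside it cannot be stored faithfully. The sketch below derives the bounds from the bit width and checks a row before it is sent. `INT_BITS`, `int_bounds`, and `check_row` are hypothetical names for illustration; how the driver itself reacts to out-of-range values is not covered here.

```python
INT_BITS = {"Int8": 8, "Int16": 16, "Int32": 32, "Int64": 64,
            "UInt8": 8, "UInt16": 16, "UInt32": 32, "UInt64": 64}


def int_bounds(type_name):
    """Return the inclusive (low, high) range of an integer type."""
    bits = INT_BITS[type_name]
    if type_name.startswith("U"):
        return 0, 2 ** bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def check_row(row, column_types):
    """Raise ValueError if any value does not fit its column type."""
    for value, type_name in zip(row, column_types):
        low, high = int_bounds(type_name)
        if not low <= value <= high:
            raise ValueError("{} out of range for {}".format(value, type_name))


columns = ["Int8", "Int16", "Int32", "Int64",
           "UInt8", "UInt16", "UInt32", "UInt64"]
# The row from the example above passes the check.
check_row((-10, -300, -123581321, -123581321345589144,
           10, 300, 123581321, 123581321345589144), columns)
print(int_bounds("Int8"), int_bounds("UInt16"))
```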
Float family
Float32
Float64
client.execute("CREATE TABLE demo_db.demo_tb (a Float32, b Float64) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [
    (3.4028235e38, 3.4028235e38),
    (3.4028235e39, 3.4028235e39),
    (-3.4028235e39, 3.4028235e39),
    (1, 2)
]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data, types_check=True)
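The second and third rows above use 3.4028235e39, which is larger than the biggest finite Float32 value (about 3.4028235e38), so column a cannot hold them as finite numbers. A quick way to see which values fit is to try packing them as 4-byte floats with the standard struct module. `fits_float32` is a hypothetical helper, and what the driver actually stores for such values should be confirmed by selecting the rows back.

```python
import struct


def fits_float32(value):
    """Return True if value packs into a 4-byte IEEE 754 float without overflow."""
    try:
        struct.pack("<f", value)
    except OverflowError:
        return False
    return True


for value in (3.4028235e38, 3.4028235e39, -3.4028235e39, 1):
    print(value, fits_float32(value))
```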
String
client.execute("CREATE TABLE demo_db.demo_tb (a String) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [('axdfgrt', )]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data)
FixedString
client.execute("CREATE TABLE demo_db.demo_tb (a FixedString(4)) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [('a', ), ('bb', ), ('ccc', ), ('dddd', ), ('я', )]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data)
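As in ClickHouse, FixedString(N) is sized in bytes rather than characters, so a non-ASCII string such as 'я' takes more than one byte once encoded. The sketch below checks the encoded length against N. `fits_fixed_string` is a hypothetical helper, and UTF-8 is assumed as the encoding.

```python
def fits_fixed_string(value, size, encoding="utf-8"):
    """Return True if the encoded value needs at most `size` bytes."""
    raw = value if isinstance(value, bytes) else value.encode(encoding)
    return len(raw) <= size


for text in ("a", "dddd", "я", "яяя"):
    print(repr(text), len(text.encode("utf-8")), fits_fixed_string(text, 4))
```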
Nullable
client.execute("CREATE TABLE demo_db.demo_tb (a Nullable(Int32)) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [(3, ), (None, ), (2, )]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data)
Date
from datetime import date, datetime
client.execute("CREATE TABLE demo_db.demo_tb (a Date) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [(date(1970, 1, 1), ), (datetime(2015, 6, 6, 12, 30, 54), )]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data)
DateTime
from datetime import datetime
client.execute("CREATE TABLE demo_db.demo_tb (a DateTime) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [(datetime(2015, 6, 6, 12, 30, 54), ), (1530211034,)]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data)
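The integer in the second row is assumed here to be a Unix timestamp (seconds since 1970-01-01 UTC), as in ClickHouse. Converting it with the standard library shows which moment it denotes; the time zone the server applies when displaying it is outside the scope of this sketch.

```python
from datetime import datetime, timezone

# Interpret the integer from the example above as seconds since the Unix epoch.
moment = datetime.fromtimestamp(1530211034, tz=timezone.utc)
print(moment.isoformat())  # 2018-06-28T18:37:14+00:00

# And the reverse: a timezone-aware datetime back to an integer timestamp.
print(int(moment.timestamp()))  # 1530211034
```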
Array
client.execute("CREATE TABLE demo_db.demo_tb (a Array(Int32)) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [([], ), ([100, 500], )]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data)
Enum family
Enum8
Enum16
from enum import IntEnum
class A(IntEnum):
    hello = -1
    world = 2
class B(IntEnum):
    foo = -300
    bar = 300
client.execute("CREATE TABLE demo_db.demo_tb (a Enum8('hello' = -1, 'world' = 2), b Enum16('foo' = -300, 'bar' = 300)) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [(A.hello, B.bar), (A.world, B.foo), (-1, 300), (2, -300)]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data)
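According to the type table, Enum columns come back as str on SELECT, so a lookup by name can turn a selected value back into the Python enum. The sketch reuses class A from the example and needs no server connection; the `selected` value is a stand-in for a value read from a result row.

```python
from enum import IntEnum


class A(IntEnum):
    hello = -1
    world = 2


# IntEnum members compare equal to their integer values,
# which is why the example can pass either A.hello or -1.
print(A.hello == -1)

# Stand-in for a value read back from a SELECT: the member name as str.
selected = "world"
member = A[selected]
print(member.name, member.value)
```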
Decimal
from decimal import Decimal
client.execute("CREATE TABLE demo_db.demo_tb (a Decimal(9, 5)) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [(Decimal('300.42'),), (300.42,), (-300,)]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data, types_check=True)
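The example passes a float (300.42) to a Decimal column, which relies on the driver's conversion under types_check=True. Converting values to Decimal yourself before inserting makes the intended value explicit. `to_decimal` is a hypothetical helper; the scale of 5 matches Decimal(9, 5) in the example, and going through str() is one possible conversion choice.

```python
from decimal import Decimal


def to_decimal(value, scale=5):
    """Convert int, float, str or Decimal to a Decimal with `scale` places."""
    if isinstance(value, float):
        # str() yields the shortest round-trip text, avoiding binary noise.
        value = str(value)
    return Decimal(value).quantize(Decimal(1).scaleb(-scale))


print(Decimal(300.42))     # exact binary value of the float, not 300.42
print(to_decimal(300.42))  # 300.42000
print(to_decimal(-300))    # -300.00000
```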
IP family
IPv4
IPv6
from ipaddress import IPv6Address, IPv4Address
client.execute("CREATE TABLE demo_db.demo_tb (a IPv4, b IPv6) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [
    (IPv4Address("10.0.0.1"), IPv6Address('79f4:e698:45de:a59b:2765:28e3:8d3a:35ae'),),
]
client.execute("INSERT INTO demo_db.demo_tb (a, b) VALUES", data)
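The type table also lists int and str as accepted INSERT types for IP columns. The standard ipaddress module shows how these forms relate to each other; this is only an illustration of equivalent values, not of driver internals.

```python
from ipaddress import IPv4Address, IPv6Address, ip_address

# The same IPv4 address expressed as str and as int.
as_str = IPv4Address("10.0.0.1")
as_int = IPv4Address(167772161)  # 10 * 2**24 + 1
print(as_str == as_int)

# ip_address() picks the matching class from the text form.
parsed = ip_address("79f4:e698:45de:a59b:2765:28e3:8d3a:35ae")
print(type(parsed).__name__, int(as_str))
```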
Map
client.execute("CREATE TABLE demo_db.demo_tb (a Map(String, UInt64)) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [
    ({},),
    ({'key1': 1},),
    ({'key1': 2, 'key2': 20},),
    ({'key1': 3, 'key2': 30, 'key3': 50},)
]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data)
LowCardinality
client.execute("CREATE TABLE demo_db.demo_tb (a LowCardinality(UInt8)) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [(x,) for x in range(255)]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data)
UUID
from uuid import UUID
client.execute("CREATE TABLE demo_db.demo_tb (a UUID) ENGINE=CnchMergeTree() ORDER BY tuple()")
data = [
    (UUID('c0fcbba9-0752-44ed-a5d6-4dfb4342b89d'),),
    ('2efcead4-ff55-4db5-bdb4-6b36a308d8e0',)
]
client.execute("INSERT INTO demo_db.demo_tb VALUES", data)
Cursor Support: DB API 2.0
Cursors follow the DB API 2.0 specification. They are created by the connection.cursor() method, stay bound to the connection for their entire lifetime, and execute all commands in the context of the database session wrapped by the connection.
from bytehouse_driver import connect
kwargs = {}
kwargs.setdefault('region', REGION)
kwargs.setdefault('account', ACCOUNT)
kwargs.setdefault('user', USER)
kwargs.setdefault('password', PASSWORD)
connection = connect(**kwargs)
cursor = connection.cursor()
cursor.execute("DROP TABLE IF EXISTS cursor_tb")
cursor.execute("CREATE TABLE cursor_tb (id INT) ENGINE=CnchMergeTree() ORDER BY tuple()")
cursor.executemany("INSERT INTO cursor_tb (id) VALUES", [{'id': 100}])
cursor.execute("SELECT * FROM cursor_tb")
# DB API 2.0 leaves the return value of execute() undefined, so fetch rows from the cursor.
for result in cursor.fetchall():
    print(result)
connection.close()
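The cursor workflow is the standard DB API 2.0 one, so its shape can be tried without a ByteHouse account by using sqlite3 from the Python standard library, which implements the same specification. This is a stand-in, not the ByteHouse driver: the SQL dialect and parameter style differ (sqlite3 uses ? placeholders and needs no ENGINE clause), and `fetch_ids` is a hypothetical helper.

```python
import sqlite3
from contextlib import closing


def fetch_ids(connection):
    """Create a table, insert rows, and read them back through one cursor."""
    with closing(connection.cursor()) as cursor:
        cursor.execute("CREATE TABLE cursor_tb (id INT)")
        cursor.executemany("INSERT INTO cursor_tb (id) VALUES (?)",
                           [(100,), (200,)])
        cursor.execute("SELECT id FROM cursor_tb ORDER BY id")
        # Rows are read from the cursor, not from execute()'s return value.
        return cursor.fetchall()


connection = sqlite3.connect(":memory:")
try:
    rows = fetch_ids(connection)
finally:
    # Always release the connection, even if a statement fails.
    connection.close()
print(rows)
```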
User defined query-id
Users can supply a query-id manually for each query execution and are encouraged to keep query-id strings unique or meaningful. If no query-id is set, the server assigns a randomly generated UUID as the query-id.
from bytehouse_driver import Client
client = Client(
    region=REGION,
    account=ACCOUNT,
    user=USER,
    password=PASSWORD
)
client.execute("SELECT 1", query_id="ba2e2cea-2a11-4926-a0b8-e694ded0cf65")
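One way to keep query-ids both unique and recognisable is to combine a short job label with a random UUID. `make_query_id` and the labels are hypothetical, and the sketch assumes the server accepts an arbitrary string as query-id rather than only a bare UUID.

```python
import uuid


def make_query_id(prefix="etl"):
    """Return a unique, recognisable query-id of the form '<prefix>-<uuid4>'."""
    return "{}-{}".format(prefix, uuid.uuid4())


query_id = make_query_id("nightly-load")
print(query_id)
# With a live client: client.execute("SELECT 1", query_id=query_id)
```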
Local Development
Change the setup.cfg file to include your connection credentials. To run the tests locally, follow these steps:
python testsrequire.py && python setup.py develop
py.test -v
Issue Reporting
If you have found a bug or have a feature request, please report it in the issues section of this repository. Alternatively, you can create an issue directly on our support platform: https://bytehouse.cloud/support
Original Author
ByteHouse wants to thank the original author, @Konstantin Lebedev, and ClickHouse for their original contribution to this driver.
License
This project is distributed under the terms of the MIT license: http://www.opensource.org/licenses/mit-license.php