
Async HTTP ClickHouse client for Python 3.6+


aiochclient

Async HTTP(S) ClickHouse client for Python 3.6+ with type conversion in both directions, streaming support, lazy decoding on SELECT queries, and a fully typed interface.



Install

> pip install aiochclient

During installation, pip tries to build the C extensions, which give a speed boost of about 30%.

Quick start

Connecting to ClickHouse

aiochclient needs an aiohttp.ClientSession to connect:

from aiochclient import ChClient
from aiohttp import ClientSession


async def main():
    async with ClientSession() as s:
        client = ChClient(s)
        assert await client.is_alive()  # returns True if connection is Ok

Making queries

await client.execute(
    "CREATE TABLE t (a UInt8, b Tuple(Date, Nullable(Float32))) ENGINE = Memory"
)

For INSERT queries you can pass values as *args. Each value should be an iterable representing one row:

import datetime as dt  # the examples refer to the datetime module as dt

await client.execute(
    "INSERT INTO t VALUES",
    (1, (dt.date(2018, 9, 7), None)),
    (2, (dt.date(2018, 9, 8), 3.14)),
)

To fetch all rows at once, use the fetch method:

all_rows = await client.fetch("SELECT * FROM t")

To fetch only the first row of the result, use the fetchrow method:

row = await client.fetchrow("SELECT * FROM t WHERE a=1")

assert row[0] == 1
assert row["b"] == (dt.date(2018, 9, 7), None)

You can also use the fetchval method, which returns the first value of the first row of the query result:

val = await client.fetchval("SELECT b FROM t WHERE a=2")

assert val == (dt.date(2018, 9, 8), 3.14)

By iterating asynchronously over the query result stream, you can fetch many rows without loading them all into memory at once:

async for row in client.iterate(
    "SELECT number, number*2 FROM system.numbers LIMIT 10000"
):
    assert row[0] * 2 == row[1]
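Streaming pays off when each row can be processed and then discarded. As a sketch of that pattern, the hypothetical helper sum_column below accepts any async iterable of rows and keeps only a running total. fake_rows is a stand-in for client.iterate(...), so the example runs without a server.

```python
import asyncio
from typing import AsyncIterable, AsyncIterator, Sequence, Tuple


async def sum_column(rows: AsyncIterable[Sequence[int]], index: int) -> int:
    """Add up one column while holding only the current row in memory."""
    total = 0
    async for row in rows:
        total += row[index]
    return total


async def fake_rows(limit: int) -> AsyncIterator[Tuple[int, int]]:
    """Stand-in for client.iterate(...): yields (number, number * 2)."""
    for number in range(limit):
        yield (number, number * 2)
```

With a real client, the first argument would be client.iterate("SELECT number, number*2 FROM system.numbers LIMIT 10000").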

Use fetch, fetchrow, fetchval, or iterate for SELECT queries. Use execute (or any of the methods above) for INSERT and all other queries.

Working with query results

All fetch queries return rows as lightweight, memory-efficient objects with a full mapping interface, so you can access fields by name or by index. This applies since v1.0.0; earlier versions returned plain tuples.

row = await client.fetchrow("SELECT a, b FROM t WHERE a=1")

assert row["a"] == 1
assert row[0] == 1
# The row with a=1 was inserted as (1, (dt.date(2018, 9, 7), None)).
assert row[:] == (1, (dt.date(2018, 9, 7), None))
assert list(row.keys()) == ["a", "b"]
assert list(row.values()) == [1, (dt.date(2018, 9, 7), None)]

Type conversion

aiochclient automatically converts values in both directions: ClickHouse responses are decoded into Python types, and Python values passed to INSERT queries are encoded into ClickHouse types.

ClickHouse type       Python type
UInt8                 int
UInt16                int
UInt32                int
UInt64                int
Int8                  int
Int16                 int
Int32                 int
Int64                 int
Float32               float
Float64               float
String                str
FixedString           str
Enum8                 str
Enum16                str
Date                  datetime.date
DateTime              datetime.datetime
Tuple(T1, T2, ...)    Tuple[T1, T2, ...]
Array(T)              List[T]
UUID                  uuid.UUID
Nullable(T)           None or T
Nothing               None
LowCardinality(T)     T
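Every integer type in the table maps to Python's unbounded int, so a Python value can be too large for its column. Whether aiochclient checks ranges itself is not stated here. The hypothetical helper fits below is a sketch of a caller-side check, based only on the standard ranges of fixed-width signed and unsigned integers.

```python
from typing import Dict, Tuple

# Inclusive (min, max) bounds for the fixed-width integer types in the table.
INT_RANGES: Dict[str, Tuple[int, int]] = {
    f"{prefix}Int{bits}": (
        0 if prefix == "U" else -(2 ** (bits - 1)),
        2 ** bits - 1 if prefix == "U" else 2 ** (bits - 1) - 1,
    )
    for prefix in ("U", "")
    for bits in (8, 16, 32, 64)
}


def fits(ch_type: str, value: int) -> bool:
    """Return True if `value` is representable in the given integer type."""
    low, high = INT_RANGES[ch_type]
    return low <= value <= high
```

For the table t created earlier, fits("UInt8", value) could be checked for column a before inserting.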

Connection pool

To change the connection pool size, pass an aiohttp.TCPConnector with the desired limit to the ClientSession. By default the pool is limited to 100 connections.
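As a sketch of how the pool limit might be set, the example below builds a connector with aiohttp's limit parameter and hands it to the session. The value 20 and the function name open_session_with_pool are illustrative assumptions. The ChClient would be created from the session as in the quick start.

```python
import asyncio

import aiohttp

POOL_LIMIT = 20  # hypothetical value; aiohttp's default is 100


async def open_session_with_pool(limit: int = POOL_LIMIT) -> int:
    """Open a session whose connector caps simultaneous connections."""
    connector = aiohttp.TCPConnector(limit=limit)
    async with aiohttp.ClientSession(connector=connector) as session:
        # A client would be created here, e.g. client = ChClient(session).
        return session.connector.limit
```

The function returns the effective limit only so that the setting can be inspected. In real code the body of the async with block would run queries.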

Speed

Using the uvloop, cChardet, and aiodns libraries is recommended for speed.

With the latest version of aiochclient, a single task (without gather, parallel clients, and so on) reaches about 180k-220k rows/sec on SELECT queries and about 50k-80k rows/sec on INSERT queries, depending on the environment and ClickHouse settings.


Please ⭐️ this repository if this project helped you!

