Select ClickHouse data, convert to pandas dataframes

Project description

clickhouse2pandas

Select ClickHouse data, convert to pandas dataframes and various other formats, by using the ClickHouse HTTP interface.

Features

  • Data is transferred compressed by default, which reduces network traffic and therefore download time.
  • Comes with a dynamic download label that shows how much data has been downloaded.
  • Converts the ClickHouse query result into the proper pandas data types, e.g., ClickHouse DateTime -> pandas datetime64.
  • Minimal dependencies: 5 standard Python libraries (urllib, http, gzip, json, time) and 1 external library (pandas).
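
As a rough illustration of the compressed transfer mentioned above, the following sketch uses only urllib and gzip. It is not the library's actual code: build_request and decode_body are hypothetical helper names, and the sketch assumes the server compresses its reply when the request carries "Accept-Encoding: gzip" and the enable_http_compression setting is on.

```python
import gzip
import urllib.request
from typing import Optional


def build_request(url: str, sql: str) -> urllib.request.Request:
    """Prepare a POST request that asks the server for a gzip-compressed reply."""
    return urllib.request.Request(
        url,
        data=sql.encode("utf-8"),  # the query travels in the request body
        headers={"Accept-Encoding": "gzip"},
    )


def decode_body(body: bytes, content_encoding: Optional[str]) -> str:
    """Decompress the body if the reply was marked as gzip, then decode it as UTF-8."""
    if content_encoding == "gzip":
        body = gzip.decompress(body)
    return body.decode("utf-8")
```

With a real server, the body and the Content-Encoding header would come from the response returned by urllib.request.urlopen; here they are passed in explicitly so the decoding step can be tried without a network connection.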

Installation

pip install clickhouse2pandas

Usage

import clickhouse2pandas as ch2pd

connection_url = 'http://user:password@clickhouse_host:8123'

query = 'select * from system.numbers limit 1000000'

df = ch2pd.select(connection_url, query)
# df is a pandas DataFrame converted from the ClickHouse query result
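
To give an idea of what converting a query result into pandas data types can involve, here is a minimal sketch built on the "meta" section that ClickHouse includes in JSON-format replies. The CH_TO_PANDAS table and the pandas_dtypes function are assumptions made for illustration; the mapping used inside clickhouse2pandas may differ.

```python
import json

# Assumed mapping for illustration only; the library's real table may differ.
CH_TO_PANDAS = {
    "UInt8": "uint8",
    "UInt16": "uint16",
    "UInt32": "uint32",
    "UInt64": "uint64",
    "Int8": "int8",
    "Int16": "int16",
    "Int32": "int32",
    "Int64": "int64",
    "Float32": "float32",
    "Float64": "float64",
    "Date": "datetime64[ns]",
    "DateTime": "datetime64[ns]",
    "String": "object",
}


def pandas_dtypes(json_text):
    """Map each column of a ClickHouse JSON-format reply to a pandas dtype name."""
    meta = json.loads(json_text)["meta"]
    # Column types missing from the table fall back to the generic object dtype.
    return {col["name"]: CH_TO_PANDAS.get(col["type"], "object") for col in meta}


sample = (
    '{"meta": [{"name": "number", "type": "UInt64"},'
    ' {"name": "ts", "type": "DateTime"}], "data": [], "rows": 0}'
)
dtypes = pandas_dtypes(sample)
```

A dict of this shape could then be passed to DataFrame.astype to cast the columns of a frame built from the "data" rows.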

API Reference

clickhouse2pandas.select(connection_url, query = None, convert_to = 'DataFrame', settings = None)

Returns the query result in the format specified by the "convert_to" parameter.

Parameters:

  • connection_url: the connection URL of the ClickHouse HTTP interface, e.g., http://user:password@clickhouse_host:8123
  • query: the SQL query; it should start with 'select'
  • convert_to: the format to convert the query result into, one of: 'DataFrame', 'TabSeparated', 'TabSeparatedRaw', 'TabSeparatedWithNames', 'TabSeparatedWithNamesAndTypes', 'CSV', 'CSVWithNames', 'Values', 'Vertical', 'JSON', 'JSONCompact', 'JSONEachRow', 'TSKV', 'Pretty', 'PrettyCompact', 'PrettyCompactMonoBlock', 'PrettyNoEscapes', 'PrettySpace', 'XML'. Refer to ClickHouse Input and Output Formats.
  • settings: a dict of setting key-value pairs. The default settings are {'enable_http_compression': 1, 'send_progress_in_http_headers': 0, 'log_queries': 1, 'connect_timeout': 10, 'receive_timeout': 300, 'send_timeout': 300, 'output_format_json_quote_64bit_integers': 0, 'wait_end_of_query': 0}. Refer to ClickHouse Settings.
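
The parameters above can be pictured as pieces of a single HTTP request. The sketch below is a guess at how such a request URL might be assembled, not the library's implementation: build_url is a hypothetical helper, and it assumes that user-supplied settings override the defaults, that settings are passed as URL parameters, and that the output format is selected by appending a FORMAT clause to the query.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Default settings as listed in the parameter description above.
DEFAULT_SETTINGS = {
    "enable_http_compression": 1,
    "send_progress_in_http_headers": 0,
    "log_queries": 1,
    "connect_timeout": 10,
    "receive_timeout": 300,
    "send_timeout": 300,
    "output_format_json_quote_64bit_integers": 0,
    "wait_end_of_query": 0,
}


def build_url(connection_url, query, output_format="JSON", settings=None):
    """Combine the connection URL, query, output format and settings into one URL."""
    # Assumption: user-supplied settings take precedence over the defaults.
    merged = {**DEFAULT_SETTINGS, **(settings or {})}
    params = {"query": f"{query} FORMAT {output_format}", **merged}
    parts = urlsplit(connection_url)
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path or "/", urlencode(params), "")
    )


url = build_url(
    "http://user:password@clickhouse_host:8123",
    "select 1",
    output_format="CSV",
    settings={"receive_timeout": 60},
)
```

Merging with a dict unpacking keeps the defaults intact while letting a caller change a single value, such as a shorter receive_timeout for quick queries.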

Download files

Source Distribution

clickhouse2pandas-0.0.3.tar.gz (3.7 kB)

Built Distribution

clickhouse2pandas-0.0.3-py3-none-any.whl (8.4 kB)
