Library for reading and writing the ClickHouse Native format.

NativeLib

Library for working with the ClickHouse Native format

Description of the format on the official website:

The most efficient format. Data is written and read by blocks in binary format.
For each block, the number of rows, number of columns, column names and types,
and parts of columns in this block are recorded one after another. In other words,
this format is “columnar” – it does not convert columns to rows.
This is the format used in the native interface for interaction between servers,
for using the command-line client, and for C++ clients.

You can use this format to quickly generate dumps that can only be read by the ClickHouse DBMS.
It does not make sense to work with this format yourself.
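To make the block layout quoted above concrete, here is a minimal sketch that writes and reads a single block by hand. It does not use nativelib, and all function names are hypothetical. It assumes the header order used by the ClickHouse server (column count first, then row count, both as LEB128 variable-length integers), length-prefixed UTF-8 strings for column names and types, and little-endian values. It handles only UInt32 columns and ignores the extra block metadata sent over the native TCP protocol.

```python
import struct
from io import BytesIO


def write_varuint(value: int, out: BytesIO) -> None:
    """Write an unsigned integer as LEB128: 7 bits per byte, low bits first."""
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.write(bytes([byte | 0x80]))  # high bit set: more bytes follow
        else:
            out.write(bytes([byte]))
            return


def read_varuint(src: BytesIO) -> int:
    """Read an unsigned LEB128 integer."""
    result, shift = 0, 0
    while True:
        byte = src.read(1)[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def write_string(text: str, out: BytesIO) -> None:
    """Write a string as a varuint byte length followed by UTF-8 bytes."""
    raw = text.encode("utf-8")
    write_varuint(len(raw), out)
    out.write(raw)


def read_string(src: BytesIO) -> str:
    """Read a varuint-length-prefixed UTF-8 string."""
    return src.read(read_varuint(src)).decode("utf-8")


def write_block(columns: dict) -> bytes:
    """Encode one block of UInt32 columns given as {name: [values]}."""
    out = BytesIO()
    n_rows = len(next(iter(columns.values()))) if columns else 0
    write_varuint(len(columns), out)  # assumed order: columns, then rows
    write_varuint(n_rows, out)
    for name, values in columns.items():
        write_string(name, out)
        write_string("UInt32", out)
        out.write(struct.pack(f"<{n_rows}I", *values))  # column data, contiguous
    return out.getvalue()


def read_block(data: bytes) -> dict:
    """Decode one block produced by write_block."""
    src = BytesIO(data)
    n_columns = read_varuint(src)
    n_rows = read_varuint(src)
    result = {}
    for _ in range(n_columns):
        name = read_string(src)
        type_name = read_string(src)
        if type_name != "UInt32":
            raise ValueError(f"unsupported type in this sketch: {type_name}")
        result[name] = list(struct.unpack(f"<{n_rows}I", src.read(4 * n_rows)))
    return result


block = write_block({"id": [1, 2, 3], "score": [10, 20, 300]})
decoded = read_block(block)
```

Each column is stored as one contiguous run of values, which is what the quoted description means by "columnar".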

This library allows data exchange between the ClickHouse Native format and Python objects, pandas.DataFrame, and polars.DataFrame.

Unsupported data types (at the moment)

  • Time
  • Time64
  • Tuple # Tuple(T1, T2, ...).
  • Map # Map(K, V).
  • Variant # Variant(T1, T2, ...).
  • AggregateFunction # AggregateFunction(name, types_of_arguments...), a parametric data type.
  • SimpleAggregateFunction # SimpleAggregateFunction(name, types_of_arguments...), stores the current value (intermediate state) of the aggregate function.
  • Point # stored as a Tuple(Float64, Float64).
  • Ring # stored as an array of points: Array(Point).
  • LineString # stored as an array of points: Array(Point).
  • MultiLineString # multiple lines stored as an array of LineString: Array(LineString).
  • Polygon # stored as an array of rings: Array(Ring).
  • MultiPolygon # stored as an array of polygons: Array(Polygon).
  • Expression # used to represent lambdas in higher-order functions.
  • Set # used for the right half of an IN expression.
  • Domains # can be used anywhere the corresponding base type can be used.
  • Nested # Nested(Name1 Type1, Name2 Type2, ...).
  • Dynamic # stores values of any type without knowing all of them in advance.
  • JSON # stores JavaScript Object Notation (JSON) documents in a single column.

Supported data types

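| ClickHouse data type | Read | Write | Python data type (Read/Write) |
|----------------------|------|-------|-------------------------------|
| UInt8 | + | + | int |
| UInt16 | + | + | int |
| UInt32 | + | + | int |
| UInt64 | + | + | int |
| UInt128 | + | + | int |
| UInt256 | + | + | int |
| Int8 | + | + | int |
| Int16 | + | + | int |
| Int32 | + | + | int |
| Int64 | + | + | int |
| Int128 | + | + | int |
| Int256 | + | + | int |
| Float32 | + | + | float |
| Float64 | + | + | float |
| BFloat16 | + | + | float |
| Decimal(P, S) | + | + | decimal.Decimal |
| String | + | + | str |
| FixedString(N) | + | + | str |
| Date | + | + | datetime.date |
| Date32 | + | + | datetime.date |
| DateTime | + | + | datetime.datetime |
| DateTime64 | + | + | datetime.datetime |
| Enum | + | + | str/Union[int, enum.Enum, str] |
| Bool | + | + | bool |
| UUID | + | + | uuid.UUID |
| IPv4 | + | + | ipaddress.IPv4Address |
| IPv6 | + | + | ipaddress.IPv6Address |
| Array(T) | + | + | list[T*] |
| LowCardinality(T) | + | + | Union[str,date,datetime,int,float] |
| Nullable(T) | + | + | Optional[T*] |
| Nothing | + | + | None |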

*T - any simple data type from those listed in the table
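As an illustration of how two rows of the table relate to bytes on the wire, the following sketch decodes a Nullable(Date) column into Optional[datetime.date] values. It is not nativelib's implementation, and the function name is hypothetical. It assumes the layout ClickHouse uses for these types: a null mask of one byte per row (non-zero means NULL), followed by the Date values as little-endian UInt16 day counts since 1970-01-01.

```python
import struct
from datetime import date, timedelta
from typing import List, Optional

EPOCH = date(1970, 1, 1)


def decode_nullable_date(raw: bytes, n_rows: int) -> List[Optional[date]]:
    """Decode n_rows of a Nullable(Date) column (hypothetical helper)."""
    null_mask = raw[:n_rows]  # one byte per row, non-zero means NULL
    days = struct.unpack(f"<{n_rows}H", raw[n_rows:n_rows + 2 * n_rows])
    return [
        None if is_null else EPOCH + timedelta(days=day_count)
        for is_null, day_count in zip(null_mask, days)
    ]


# Three rows: 1970-01-01, NULL, 2024-01-01.
new_year_days = (date(2024, 1, 1) - EPOCH).days
raw_column = bytes([0, 1, 0]) + struct.pack("<3H", 0, 0, new_year_days)
dates = decode_nullable_date(raw_column, 3)
```

A NULL row still occupies a slot in the value array, so the mask and the values always have the same number of entries.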

Installation

From pip

pip install nativelib

From local directory

pip install .

From git

pip install git+https://github.com/0xMihalich/nativelib
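After any of the installation methods above, one way to confirm that the package is visible to the current interpreter is to query its installed version with the standard library. This is a small sketch: the helper name is hypothetical, and it relies only on the distribution name nativelib from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "nativelib") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


print(installed_version() or "nativelib is not installed in this environment")
```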
