Django ClickHouse database backend.
Django ClickHouse Database Backend
Django ClickHouse backend is a Django database backend for the ClickHouse database. It lets you use the Django ORM to interact with ClickHouse.
It uses clickhouse driver as its DBAPI and clickhouse pool for connection pooling; thanks to both projects.
Features:
- Supports the ClickHouse native interface and connection pooling.
- Defines ClickHouse-specific schema features such as Engine and Index in the Django ORM.
- Supports table migrations.
- Supports creating test databases and tables, and works with Django TestCase and pytest-django.
- Supports most query types and data types; full feature coverage is still under development.
- Supports SETTINGS in SELECT queries.
Get started
Installation
pip install django-clickhouse-backend
or
git clone https://github.com/jayvynl/django-clickhouse-backend
cd django-clickhouse-backend
python setup.py install
Configuration
Only ENGINE is required, other options have default values.
- ENGINE: required, set to clickhouse_backend.backend.
- NAME: database name, default default.
- HOST: database host, default localhost.
- PORT: database port, default 9000.
- USER: database user, default default.
- PASSWORD: database password, default empty.
DATABASES = {
    'default': {
        'ENGINE': 'clickhouse_backend.backend',
        'NAME': 'default',
        'HOST': 'localhost',
        'USER': 'DB_USER',
        'PASSWORD': 'DB_PASSWORD',
        'TEST': {
            'fake_transaction': True
        }
    }
}
DEFAULT_AUTO_FIELD = 'django.db.models.BigAutoField'
DEFAULT_AUTO_FIELD = 'django.db.models.BigAutoField' is required for Django migrations to work.
More details are covered in the Primary key section.
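To make the documented defaults concrete, here is a minimal sketch that fills in the default value for any option left out of a database entry. The helper `with_defaults` and the `DOCUMENTED_DEFAULTS` dict are hypothetical names used only for illustration; the backend applies its own defaults internally, and this is not its code.

```python
# Defaults as listed in the Configuration section; ENGINE has no default.
DOCUMENTED_DEFAULTS = {
    'NAME': 'default',
    'HOST': 'localhost',
    'PORT': 9000,
    'USER': 'default',
    'PASSWORD': '',
}


def with_defaults(db_config):
    """Return a copy of db_config with the documented defaults filled in.

    Hypothetical helper for illustration only.
    """
    if 'ENGINE' not in db_config:
        raise ValueError('ENGINE is required')
    # Values given explicitly in db_config win over the defaults.
    return {**DOCUMENTED_DEFAULTS, **db_config}


minimal = with_defaults({'ENGINE': 'clickhouse_backend.backend'})
print(minimal['HOST'], minimal['PORT'])
```

A configuration that sets only ENGINE therefore behaves like one that connects to localhost on port 9000 as user default.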
Model
from django.db import models
from django.utils import timezone

from clickhouse_backend import models as chm
from clickhouse_backend.models import indexes, engines


class Event(chm.ClickhouseModel):
    src_ip = chm.GenericIPAddressField(default='::')
    sport = chm.PositiveSmallIntegerField(default=0)
    dst_ip = chm.GenericIPAddressField(default='::')
    dport = chm.PositiveSmallIntegerField(default=0)
    transport = models.CharField(max_length=3, default='')
    protocol = models.TextField(default='')
    content = models.TextField(default='')
    timestamp = models.DateTimeField(default=timezone.now)
    created_at = models.DateTimeField(auto_now_add=True)
    length = chm.PositiveIntegerField(default=0)
    count = chm.PositiveIntegerField(default=1)

    class Meta:
        verbose_name = 'Network event'
        ordering = ['-id']
        db_table = 'event'
        engine = engines.ReplacingMergeTree(
            order_by=('dst_ip', 'timestamp'),
            partition_by=models.Func('timestamp', function='toYYYYMMDD')
        )
        indexes = [
            indexes.Index(
                fields=('src_ip', 'dst_ip'),
                type=indexes.Set(1000),
                granularity=4
            )
        ]
        constraints = (
            models.CheckConstraint(
                name='sport_range',
                check=models.Q(sport__gte=0, sport__lte=65535),
            ),
        )
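The partition_by expression in the model applies the ClickHouse function toYYYYMMDD to timestamp, so rows are grouped into one partition per calendar day. As a rough illustration of the value that function produces, the sketch below computes the same YYYYMMDD number in plain Python. The helper name `to_yyyymmdd` is made up for this example, and time zone handling is deliberately ignored.

```python
from datetime import datetime


def to_yyyymmdd(ts):
    """Mimic the integer produced by ClickHouse toYYYYMMDD (time zones ignored)."""
    return ts.year * 10000 + ts.month * 100 + ts.day


events = [
    datetime(2023, 1, 5, 8, 30),
    datetime(2023, 1, 5, 23, 59),
    datetime(2023, 1, 6, 0, 0),
]

# Group the timestamps by their daily partition key.
partitions = {}
for ts in events:
    partitions.setdefault(to_yyyymmdd(ts), []).append(ts)

print(sorted(partitions))  # [20230105, 20230106]
```

The first two events share a partition and the third starts a new one, which is the grouping a daily partition key gives.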
Migration
python manage.py makemigrations
Testing
Writing test cases works the same as in a normal Django project. You can use Django TestCase or pytest-django. Notice: ClickHouse uses mutations for deleting or updating. By default, data mutations are processed asynchronously, so you should change this default behavior in tests that delete or update data. There are 2 ways to do that:
- Configure the database as follows; this sets mutations_sync=1 at session scope.
  DATABASES = {
      'default': {
          'ENGINE': 'clickhouse_backend.backend',
          'OPTIONS': {
              'settings': {
                  'mutations_sync': 1,
              }
          }
      }
  }
- Use SETTINGS in SELECT Query.
Event.objects.filter(transport='UDP').settings(mutations_sync=1).delete()
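If only the test run should use synchronous mutations, one option is to derive the test settings from the regular ones. The sketch below is an assumption about how a project might organise this, not something the backend provides: `with_sync_mutations` is a hypothetical helper that copies a DATABASES dict and sets mutations_sync to 1 under OPTIONS['settings'], the same place used in the first approach.

```python
import copy


def with_sync_mutations(databases, alias='default'):
    """Return a deep copy of databases with mutations_sync=1 for one alias.

    Hypothetical helper; the original dict is left untouched.
    """
    result = copy.deepcopy(databases)
    options = result[alias].setdefault('OPTIONS', {})
    options.setdefault('settings', {})['mutations_sync'] = 1
    return result


BASE_DATABASES = {
    'default': {
        'ENGINE': 'clickhouse_backend.backend',
    }
}

# For example, assigned to DATABASES in a separate test settings module.
TEST_DATABASES = with_sync_mutations(BASE_DATABASES)
print(TEST_DATABASES['default']['OPTIONS'])
```

Because the helper works on a copy, the production settings keep the default asynchronous behaviour.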
Sample test case:
from django.test import TestCase

from .models import Event  # assumed location of the Event model; adjust to your app


class TestEvent(TestCase):
    def test_spam(self):
        assert Event.objects.count() == 0
Topics
Primary key
The Django ORM depends heavily on a single-column primary key, which is the unique identifier of an ORM object.
All get, save and delete actions depend on the primary key.
In ClickHouse, however, a primary key means something different from a Django primary key. ClickHouse does not require a unique primary key; you can insert multiple rows with the same primary key.
There is no unique constraint or auto-increment column in ClickHouse.
By default, Django adds a field named id as an auto-increment primary key.
- AutoField
  Mapped to the ClickHouse Int32 data type. You should generate this unique id yourself.
- BigAutoField
  Mapped to the ClickHouse Int64 data type. If the primary key is not specified when inserting data, clickhouse_backend.idworker.id_worker is used to generate this unique key.
  The default id_worker is an instance of clickhouse_backend.idworker.snowflake.SnowflakeIDWorker, which implements the Twitter snowflake ID. If data insertions happen across multiple datacenters, servers, processes or threads, you should ensure that the pair of environment variables (CLICKHOUSE_WORKER_ID, CLICKHOUSE_DATACENTER_ID) is unique. Because worker_id and datacenter_id are 5 bits each, they must be integers between 0 and 31. CLICKHOUSE_WORKER_ID defaults to 0; CLICKHOUSE_DATACENTER_ID is generated randomly if not provided.
  clickhouse_backend.idworker.snowflake.SnowflakeIDWorker is not thread safe. You can inherit from clickhouse_backend.idworker.base.BaseIDWorker, implement your own worker, and set CLICKHOUSE_ID_WORKER to the dotted import path of your IDWorker instance.
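To illustrate the kind of ID described here, the following is a minimal, thread-safe snowflake-style generator. It is a sketch, not the backend's SnowflakeIDWorker: the class name `LockedSnowflake`, the method name `get_id`, the epoch constant, and the 41/5/5/12 bit split (the classic Twitter layout) are assumptions made for this example. Only the 5-bit worker and datacenter IDs come from the text. It does not subclass BaseIDWorker, whose interface is not shown in this document.

```python
import threading
import time


class LockedSnowflake:
    """Snowflake-style 64-bit ID generator guarded by a lock (illustrative)."""

    WORKER_BITS = 5
    DATACENTER_BITS = 5
    SEQUENCE_BITS = 12
    WORKER_SHIFT = SEQUENCE_BITS                      # 12
    DATACENTER_SHIFT = WORKER_SHIFT + WORKER_BITS     # 17
    TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS  # 22
    SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1
    EPOCH_MS = 1288834974657  # assumed custom epoch; any fixed past instant works

    def __init__(self, worker_id=0, datacenter_id=0):
        max_id = (1 << self.WORKER_BITS) - 1  # 31
        if not 0 <= worker_id <= max_id:
            raise ValueError('worker_id must be between 0 and 31')
        if not 0 <= datacenter_id <= max_id:
            raise ValueError('datacenter_id must be between 0 and 31')
        self.worker_id = worker_id
        self.datacenter_id = datacenter_id
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def get_id(self):
        with self._lock:
            now = int(time.time() * 1000)
            if now < self._last_ms:
                # Clock went backwards (or we borrowed ahead): stay monotonic.
                now = self._last_ms
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & self.SEQUENCE_MASK
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: borrow the next one.
                    now += 1
            else:
                self._sequence = 0
            self._last_ms = now
            return (
                ((now - self.EPOCH_MS) << self.TIMESTAMP_SHIFT)
                | (self.datacenter_id << self.DATACENTER_SHIFT)
                | (self.worker_id << self.WORKER_SHIFT)
                | self._sequence
            )


worker = LockedSnowflake(worker_id=3, datacenter_id=1)
ids = []


def produce():
    for _ in range(1000):
        ids.append(worker.get_id())


threads = [threading.Thread(target=produce) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(ids) == len(set(ids)))
```

The lock serialises access to the timestamp and sequence state, which is the part that makes an unguarded generator unsafe when several threads insert at once. Uniqueness across processes still depends on giving each process a distinct worker and datacenter ID pair.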
Django uses a table named django_migrations to track migration files. Its ID field should be a BigAutoField, so that the IDWorker can generate unique ids for you.
Django 3.2 introduced a new setting, DEFAULT_AUTO_FIELD, to control the field type of the default primary key.
So DEFAULT_AUTO_FIELD = 'django.db.models.BigAutoField' is required if you want to use migrations with the Django ClickHouse backend.
Fields
Nullable
null=True produces a Nullable type in the ClickHouse database.
Note: Using Nullable almost always negatively affects performance; keep this in mind when designing your databases.
GenericIPAddressField
The ClickHouse backend has its own implementation in clickhouse_backend.models.fields.GenericIPAddressField.
If protocol='ipv4', an IPv4 column is generated; otherwise an IPv6 column is generated.
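The two rules described so far (the protocol argument selects the IP column type, and null=True wraps a type in Nullable) can be summarised in a small sketch. The function `ip_column_type` is hypothetical and only mirrors the description; it is not the backend's code. The case-insensitive comparison follows Django's documented handling of the protocol argument.

```python
def ip_column_type(protocol='both', null=False):
    """Return the ClickHouse column type implied by the field options.

    Illustrative only: 'ipv4' selects IPv4, anything else selects IPv6.
    """
    base = 'IPv4' if protocol.lower() == 'ipv4' else 'IPv6'
    return f'Nullable({base})' if null else base


for protocol, null in [('ipv4', False), ('both', False), ('ipv6', True)]:
    print(protocol, null, ip_column_type(protocol, null))
```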
PositiveSmallIntegerField
PositiveIntegerField
PositiveBigIntegerField
clickhouse_backend.models.fields.PositiveSmallIntegerField maps to UInt16.
clickhouse_backend.models.fields.PositiveIntegerField maps to UInt32.
clickhouse_backend.models.fields.PositiveBigIntegerField maps to UInt64.
ClickHouse has unsigned integer types, so these fields come with validators for the correct integer ranges.
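For reference, the value ranges such validators need to enforce follow directly from the bit widths of the ClickHouse types. The sketch below computes them; the dictionary and helper names are illustrative and not taken from the backend.

```python
# Field name -> (ClickHouse type, bit width), as listed in this section.
UNSIGNED_TYPES = {
    'PositiveSmallIntegerField': ('UInt16', 16),
    'PositiveIntegerField': ('UInt32', 32),
    'PositiveBigIntegerField': ('UInt64', 64),
}


def unsigned_range(bits):
    """Return the inclusive (min, max) range of an unsigned integer type."""
    return 0, 2 ** bits - 1


for field, (ch_type, bits) in UNSIGNED_TYPES.items():
    low, high = unsigned_range(bits)
    print(f'{field} -> {ch_type}: {low}..{high}')
```

For example, a PositiveSmallIntegerField accepts values from 0 to 65535, which is why it suits port numbers such as sport and dport in the Event model.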
Engines
Located in clickhouse_backend.models.engines.
Indexes
Located in clickhouse_backend.models.indexes.
Test
To run the tests for this project:
git clone https://github.com/jayvynl/django-clickhouse-backend
cd django-clickhouse-backend
# docker and docker-compose are required.
docker-compose up -d
python tests/runtests.py
Note: This project is not fully tested yet and should be used with caution in production.
License
Django clickhouse backend is distributed under the MIT license.