Stream data to Apache Kafka.
django-kafka-streamer is a Django application and library for streaming data to Apache Kafka.
Features:
Sets up signal handlers on ORM models to transparently send create/update/delete events to Kafka
Handles database object relations
Provides a Celery task to stream large amounts of data in the background
Links:
Documentation: http://django-kafka-streamer.readthedocs.io/
Consumer library: https://github.com/lostclus/aiosafeconsumer
Example application: https://github.com/lostclus/WeatherApp
Usage:
yourapp/models.py:
    from django.db import models


    class MyModel(models.Model):
        field1 = models.IntegerField()
        field2 = models.CharField(max_length=10)
yourapp/streamers.py:
    from kafkastreamer import Streamer, register

    from .models import MyModel


    @register(MyModel)
    class MyModelStreamer(Streamer):
        # Kafka topic that MyModel events are sent to
        topic = "model-a"
yourproject/settings.py:
    INSTALLED_APPS = [
        ...
        "kafkastreamer",
    ]

    KAFKA_STREAMER = {
        "BOOTSTRAP_SERVERS": ["localhost:9092"],
    }
Any change to MyModel data is now streamed to Kafka automatically. To force streaming of all data in all registered models, run:
python manage.py kafkastreamer_refresh
The data streamed to the model-a Kafka topic has the following structure:

    {
        "_time": "2023-01-01T00:00:00Z",
        "_type": "create",
        "id": 1,
        "field1": 1,
        "field2": "abc"
    }
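On the consuming side, a message with this structure could be handled roughly as follows. This is a minimal sketch, not part of django-kafka-streamer: it assumes the message value arrives as UTF-8 encoded JSON bytes, that update and delete events use the `_type` values "update" and "delete", and that a delete event carries the object's id. The helper names parse_event and apply_event are hypothetical.

```python
import json
from datetime import datetime


def parse_event(raw: bytes) -> dict:
    """Decode one message value and convert "_time" to an aware datetime."""
    event = json.loads(raw)
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    event["_time"] = datetime.fromisoformat(event["_time"].replace("Z", "+00:00"))
    return event


def apply_event(store: dict, event: dict) -> None:
    """Apply an event to an in-memory copy of the model data, keyed by id."""
    # Keys starting with "_" are treated as metadata; the rest are model fields.
    fields = {k: v for k, v in event.items() if not k.startswith("_")}
    if event["_type"] == "delete":  # assumed value for delete events
        store.pop(fields["id"], None)
    else:  # "create", or the assumed "update"
        store[fields["id"]] = fields


raw = (
    b'{"_time": "2023-01-01T00:00:00Z", "_type": "create", '
    b'"id": 1, "field1": 1, "field2": "abc"}'
)
store = {}
apply_event(store, parse_event(raw))
print(store)  # {1: {'id': 1, 'field1': 1, 'field2': 'abc'}}
```

Keying the store by id lets a create and a later update share the same code path, so replaying the topic from the beginning rebuilds the same state.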