Data orchestration for Django

Django KCK

Django KCK is data orchestration for Django. It can be used for:

  • scheduled data imports from remote sources
  • ensuring each data product is kept fresh, either by updating it at a regular interval or when the source data it depends on changes
  • preparing complex data products in advance of a likely request
  • simplifying and optimizing complex data flows

The development pattern Django KCK encourages for data products emphasizes compartmentalization and simplification over complexity, cached data with configurable refresh routines over real-time computation, and common-sense optimizations over sprawling distributed parallelism.

History

Django KCK is a simplified version of KCK that targets the Django environment exclusively. It also uses PostgreSQL as the cache backend, instead of Cassandra.

Quick Install

Basic Usage

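# myapp/primers.py

from kck import Primer


class TitleListPrimer(Primer):
    key = 'title_list'
    # A single integer parameter named "id" is parsed from the cache key
    parameters = [
        {"name": "id", "from_str": int}
    ]

    def compute(self, key):
        param_dict = self.key_to_param_dict(key)
        # lkp_title_ids() and lkp_title() are application-specific lookup
        # helpers assumed to be defined elsewhere (hypothetical names)
        title_ids = lkp_title_ids(param_dict['id'])
        return [{'title': lkp_title(title_id)} for title_id in title_ids]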
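
# myapp/views.py

from kck import Cache
from django.http import JsonResponse

def first_data_product_view(request, author_id):
    cache = Cache.get_instance()
    # Fetch the cached product; a miss is primed by TitleListPrimer.compute()
    title_list = cache.get(f'title_list/{author_id}')
    # safe=False is required because the payload is a list, not a dict
    return JsonResponse(title_list, safe=False)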

Theory

Essentially, Django KCK is a lazy-loading cache. Instead of warming the cache in advance, Django KCK lets a developer tell the cache how to prime itself in the event of a cache miss.

If we don't warm the cache in advance and we ask the cache for a data product that depends on a hundred other data products in the cache, each of which either gathers or computes data from other sources, then this design will only generate or request the data that is absolutely necessary for the computation. In this way, Django KCK is able to do the least amount of work possible to accomplish the task.
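The lazy-priming idea can be illustrated with a small, framework-free sketch. This is not the Django KCK API: the LazyCache class, its register() and get() methods, and the compute_log attribute are all hypothetical names invented for this illustration. It only shows that a request for one product computes that product and its dependencies, and nothing else.

```python
from typing import Any, Callable, Dict, List


class LazyCache:
    """Toy lazy-loading cache: a value is computed only on a cache miss."""

    def __init__(self) -> None:
        self._primers: Dict[str, Callable[["LazyCache"], Any]] = {}
        self._values: Dict[str, Any] = {}
        self.compute_log: List[str] = []  # keys that were actually computed

    def register(self, key: str, primer: Callable[["LazyCache"], Any]) -> None:
        """Tell the cache how to prime `key` if it is ever missing."""
        self._primers[key] = primer

    def get(self, key: str) -> Any:
        if key not in self._values:
            # Cache miss: run the primer, which may request other keys.
            self.compute_log.append(key)
            self._values[key] = self._primers[key](self)
        return self._values[key]


cache = LazyCache()
cache.register("a", lambda c: 2)
cache.register("b", lambda c: 3)
cache.register("unused", lambda c: 999)
cache.register("product", lambda c: c.get("a") * c.get("b"))

result = cache.get("product")
print(result)             # 6
print(cache.compute_log)  # ['product', 'a', 'b']
```

The "unused" key is registered but never computed, because nothing requested it.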

To further expedite the process of building derivative data products, Django KCK includes mechanisms that allow for periodic or triggered updates of data upon which a data product depends, such that it will be immediately available when a request is made.
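As a rough illustration of the periodic case, the sketch below refreshes any entry whose interval has elapsed. It is a generic example under stated assumptions, not Django KCK's mechanism: the Entry dataclass, the refresh_due() function, and the import_rates() source are hypothetical, and the clock is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Entry:
    compute: Callable[[], Any]           # how to rebuild the value
    interval: float                      # seconds between refreshes
    value: Any = None
    refreshed_at: float = float("-inf")  # never refreshed yet


def refresh_due(entries: Dict[str, Entry], now: float) -> List[str]:
    """Recompute every entry whose refresh interval has elapsed."""
    refreshed = []
    for key, entry in entries.items():
        if now - entry.refreshed_at >= entry.interval:
            entry.value = entry.compute()
            entry.refreshed_at = now
            refreshed.append(key)
    return refreshed


counter = {"n": 0}


def import_rates() -> dict:
    """Stand-in for a remote import; counts how often it runs."""
    counter["n"] += 1
    return {"batch": counter["n"]}


entries = {"rates": Entry(compute=import_rates, interval=60.0)}

first = refresh_due(entries, now=0.0)    # never refreshed, so it is due
second = refresh_due(entries, now=30.0)  # only 30 s old, so it is skipped
third = refresh_due(entries, now=60.0)   # interval elapsed, refreshed again
```

A triggered update would call the same recompute step in response to a change in the source data instead of a clock tick.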

It also makes it possible to "augment" derivative data products with new information so that, for workloads that can take advantage of the optimization, a data product can be updated in place, without regenerating the product in its entirety. Where it works, this approach can turn minutes of computation into milliseconds.
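The difference between regenerating and augmenting can be sketched with a running total. The function names compute_totals() and augment_totals() and the order records are hypothetical, chosen only for this illustration; they are not part of Django KCK. The point is that folding one new record into the cached product gives the same answer as a full rebuild while touching a single entry.

```python
from typing import Dict, List


def compute_totals(orders: List[dict]) -> Dict[str, float]:
    """Full regeneration: scan every order to build per-customer totals."""
    totals: Dict[str, float] = {}
    for order in orders:
        customer = order["customer"]
        totals[customer] = totals.get(customer, 0.0) + order["amount"]
    return totals


def augment_totals(totals: Dict[str, float], new_order: dict) -> None:
    """In-place update: fold one new order into the cached product."""
    customer = new_order["customer"]
    totals[customer] = totals.get(customer, 0.0) + new_order["amount"]


orders = [
    {"customer": "ann", "amount": 10.0},
    {"customer": "bob", "amount": 5.0},
]
cached = compute_totals(orders)  # built once, at full cost

new_order = {"customer": "ann", "amount": 2.5}
orders.append(new_order)
augment_totals(cached, new_order)  # constant-time update of the cached product

# The augmented product matches a full rebuild over all orders.
matches_rebuild = cached == compute_totals(orders)
```

This only works for products that can be updated incrementally, such as sums and counts; a product like a median would still need a rebuild.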
