Skip to main content

Faster db updates for Django using UPDATE FROM VALUES sql variants.

Project description

test Coverage Status

django-fast-update

Faster db updates using UPDATE FROM VALUES sql variants.

Installation & Usage

Run pip install django-fast-update and place fast_update in INSTALLED_APPS.

With attaching FastUpdateManager as a manager to your model, fast_update can be used instead of bulk_update, e.g.:

from django.db import models
from fast_update.query import FastUpdateManager

class MyModel(models.Model):
    objects = FastUpdateManager()
    field_a = ...
    field_b = ...
    field_c = ...


# to update multiple instances at once:
MyModel.objects.fast_update(bunch_of_instances, ['field_a', 'field_b', 'field_c'])

Alternatively fast.fast_update can be used directly with a queryset as first argument (Warning - this skips most sanity checks with up to 30% speed gain, so make sure not to feed something totally off).

Compatibility

fast_update is known to work with these database versions:

  • SQLite 3.15+
  • PostgreSQL
  • MariaDB 10.2+
  • MySQL 5.7+

For unsupported database backends or outdated versions fast_update will fall back to bulk_update. (It is possible to register fast update implementations for other db vendors with register_implementation. See fast_update/fast.py for more details.)

Note that with fast_update f-expressions cannot be used anymore. This is a design decision to not penalize update performance by some swiss-army-knife functionality. If you have f-expressions in your update data, consider re-grouping the update steps and update those fields with update or bulk_update instead.

Other than with bulk_update duplicates in a changeset are not allowed and will raise. This is mainly a safety guard to not let slip through duplicates, where the final update state would be undetermined or directly depend on the database's compatibility.

copy_update

This is a PostgreSQL only update implementation based on COPY FROM. This runs even faster than fast_update for medium to big changesets (but tends to be slower than fast_update for <100 objects).

copy_update follows the same interface idea as bulk_update and fast_update, minus a batch_size argument (data is always transferred in one big batch). It can be used likewise from the FastUpdateManager. copy_update also has no support for f-expressions, also duplicates will raise.

Note copy_update will probably never leave the alpha/PoC-state, as psycopg3 brings great COPY support, which does a more secure value conversion and has a very fast C-version.

Note Django 4.2 brings psycopg3 support, which is currently not yet supported by copy_update. While psycopg2 will keep working as before, psycopg3 will raise on attempts to use copy_update until #16 got resolved.

Status

Currently beta, still some TODOs left.

The whole package is tested with Django 3.2 & 4.2 on Python 3.8 & 3.11.

Performance

There is a management command in the example app testing performance of updates on the FieldUpdate model (./manage.py perf).

fast_update is at least 8 times faster than bulk_update, and keeps making ground for bigger changesets. This indicates different runtime complexity. fast_update grows almost linear for very big numbers of rows (tested during some perf series against copy_update up to 10M), while bulk_update grows much faster (looks quadratic to me, which can be lowered to linear by applying a proper batch_size, but it stays very steep compared to fast_update).

For very big changesets copy_update is the clear winner, and even shows a substantial increase in updated rows/s (within my test range, as upper estimate this of course cannot grow slower than linear, as the data pumping will saturate to linear).

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

django-fast-update-0.2.3.tar.gz (18.9 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page