bulk_update_or_create for Django model managers
django-bulk-update-or-create
Everyone using the Django ORM will eventually find themselves doing batch update_or_create operations: ingesting files from external sources, syncing with external APIs, etc.
If the number of records is large, the slowness of QuerySet.update_or_create will stand out: it is very practical to use, but it always does one SELECT and then either one INSERT (if the SELECT didn't return anything) or one UPDATE/.save (if it did).
Searching online shows that this does indeed happen to quite a few people, though there doesn't seem to be any good solution:
- bulk_create is really fast if you know all records are new (and you're not using multi-table inheritance)
- bulk_update does some nice voodoo to update several records with the same UPDATE statement (using a huge WHERE condition together with CASE), but you need to be sure they all exist
- UPSERTs (INSERT .. ON DUPLICATE KEY UPDATE) look interesting (TODO on a different package), but they will be restricted by the bulk_create limitations ==> cannot be used on models with multi-table inheritance
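To make the bulk_update bullet above more concrete, here is a minimal sketch of the "one UPDATE with CASE plus a big WHERE" idea, using the standard library's sqlite3 instead of Django. The helper name case_update and the random_data table are made up for illustration, and the SQL only mirrors the general shape of such a statement, not the exact query Django generates.

```python
import sqlite3


def case_update(conn, table, match_column, value_column, new_values):
    """Update many rows with a single UPDATE ... CASE ... WHERE ... IN statement.

    new_values maps a match_column value to the new value_column value.
    Table and column names are interpolated into the SQL, so they must be trusted.
    Returns the number of rows that were actually updated.
    """
    keys = list(new_values)
    if not keys:
        return 0
    when_clauses = " ".join("WHEN ? THEN ?" for _ in keys)
    placeholders = ", ".join("?" for _ in keys)
    sql = (
        f"UPDATE {table} SET {value_column} = CASE {match_column} "
        f"{when_clauses} END WHERE {match_column} IN ({placeholders})"
    )
    params = []
    for key in keys:
        params.extend([key, new_values[key]])  # parameters for WHEN ? THEN ?
    params.extend(keys)  # parameters for the IN (...) list
    return conn.execute(sql, params).rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE random_data (uuid INTEGER UNIQUE, data TEXT)")
conn.executemany(
    "INSERT INTO random_data VALUES (?, ?)",
    [(1, "old 1"), (2, "old 2"), (3, "old 3")],
)

# uuid 99 does not exist: the statement silently skips it instead of creating it.
updated = case_update(
    conn, "random_data", "uuid", "data", {1: "new 1", 2: "new 2", 99: "ghost"}
)
rows = conn.execute("SELECT uuid, data FROM random_data ORDER BY uuid").fetchall()
```

The row for uuid 99 is never created, which is why this kind of statement alone is only safe when every record is known to exist.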
This package tries to tackle this by introducing bulk_update_or_create to the model QuerySet/Manager:
- update_or_create: (1 SELECT + 1 INSERT/UPDATE) * N
- bulk_update_or_create: 1 BIG_SELECT + 1 BIG_UPDATE + (lte_N) INSERT
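As a rough worked example of the two formulas above, the following sketch counts statements for a batch. It takes the formulas literally (no batching, one INSERT per new record), and the function names are only illustrative.

```python
def loop_query_count(n_records):
    """(1 SELECT + 1 INSERT/UPDATE) * N for a loop of update_or_create."""
    return 2 * n_records


def bulk_query_count(n_records, n_existing):
    """1 BIG_SELECT + 1 BIG_UPDATE + one INSERT per record not yet stored."""
    if not 0 <= n_existing <= n_records:
        raise ValueError("n_existing must be between 0 and n_records")
    n_new = n_records - n_existing
    return 1 + 1 + n_new


# 1000 records in three scenarios: all updates, half and half, all creates.
all_updates = bulk_query_count(1000, 1000)
half_half = bulk_query_count(1000, 500)
all_creates = bulk_query_count(1000, 0)
loop_total = loop_query_count(1000)
```

Under these assumptions the loop always issues 2000 statements, while the bulk variant issues 2, 502 and 1002 respectively, so the gain is largest when most records already exist.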
For a batch of records:
- SELECT all from the database (based on the match_field parameter)
- Update records in memory
- Use bulk_update for those
- Use INSERT/.create on each of the remaining
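The four steps above can be sketched without Django, using sqlite3 from the standard library. This is not the package's implementation: the function name, the hard-coded random_data table and the use of executemany as a stand-in for bulk_update are all assumptions made to keep the example self-contained.

```python
import sqlite3


def bulk_update_or_create_sketch(conn, items):
    """items is a list of (uuid, data) tuples; uuid plays the role of match_field.

    Returns (number_updated, number_created).
    """
    if not items:
        return 0, 0
    incoming = dict(items)  # duplicate uuids collapse to the last value given

    # Step 1: one big SELECT for every key in the batch.
    placeholders = ", ".join("?" for _ in incoming)
    existing = {
        row[0]
        for row in conn.execute(
            f"SELECT uuid FROM random_data WHERE uuid IN ({placeholders})",
            list(incoming),
        )
    }

    # Step 2: decide in memory which records are updates and which are creates.
    to_update = [(data, uuid) for uuid, data in incoming.items() if uuid in existing]
    to_create = [(uuid, data) for uuid, data in incoming.items() if uuid not in existing]

    # Step 3: batched update (stand-in for bulk_update).
    conn.executemany("UPDATE random_data SET data = ? WHERE uuid = ?", to_update)

    # Step 4: one INSERT for each remaining record.
    for row in to_create:
        conn.execute("INSERT INTO random_data (uuid, data) VALUES (?, ?)", row)

    return len(to_update), len(to_create)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE random_data (uuid INTEGER UNIQUE, data TEXT)")
conn.execute("INSERT INTO random_data VALUES (1, 'stale')")

result = bulk_update_or_create_sketch(conn, [(1, "data for 1"), (2, "data for 2")])
rows = conn.execute("SELECT uuid, data FROM random_data ORDER BY uuid").fetchall()
```

A real implementation would also need to split very large batches, since databases limit the number of parameters in a single statement.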
The (SOFTCORE) performance test looks promising, taking more than 70% less time on average:
$ make testcmd
# default - sqlite
DJANGO_SETTINGS_MODULE=settings tests/manage.py bulk_it
loop of update_or_create - all creates: 3.966486692428589
loop of update_or_create - all updates: 4.020653247833252
loop of update_or_create - half half: 3.9968857765197754
bulk_update_or_create - all creates: 2.949239730834961
bulk_update_or_create - all updates: 0.15633511543273926
bulk_update_or_create - half half: 1.4585723876953125
# mysql
DJANGO_SETTINGS_MODULE=settings_mysql tests/manage.py bulk_it
loop of update_or_create - all creates: 5.511938571929932
loop of update_or_create - all updates: 5.321666955947876
loop of update_or_create - half half: 5.391834735870361
bulk_update_or_create - all creates: 1.5671980381011963
bulk_update_or_create - all updates: 0.14612770080566406
bulk_update_or_create - half half: 0.7262606620788574
# postgres
DJANGO_SETTINGS_MODULE=settings_postgresql tests/manage.py bulk_it
loop of update_or_create - all creates: 4.3584535121917725
loop of update_or_create - all updates: 3.6183276176452637
loop of update_or_create - half half: 4.145816087722778
bulk_update_or_create - all creates: 1.044851541519165
bulk_update_or_create - all updates: 0.14954638481140137
bulk_update_or_create - half half: 0.8407495021820068
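To see where the "more than 70%" figure can come from, the timings above can be turned into per-scenario reductions. The sketch below simply copies the numbers from the output and averages the nine reductions with equal weight; that weighting is an assumption, as the original does not say how the average was computed.

```python
# (loop of update_or_create, bulk_update_or_create) timings copied from the output above
timings = {
    "sqlite": {
        "all creates": (3.966486692428589, 2.949239730834961),
        "all updates": (4.020653247833252, 0.15633511543273926),
        "half half": (3.9968857765197754, 1.4585723876953125),
    },
    "mysql": {
        "all creates": (5.511938571929932, 1.5671980381011963),
        "all updates": (5.321666955947876, 0.14612770080566406),
        "half half": (5.391834735870361, 0.7262606620788574),
    },
    "postgres": {
        "all creates": (4.3584535121917725, 1.044851541519165),
        "all updates": (3.6183276176452637, 0.14954638481140137),
        "half half": (4.145816087722778, 0.8407495021820068),
    },
}


def reduction_percent(loop_time, bulk_time):
    """Percentage of time saved by the bulk variant relative to the loop."""
    return 100.0 * (1.0 - bulk_time / loop_time)


per_case = {
    (backend, scenario): reduction_percent(loop_time, bulk_time)
    for backend, scenarios in timings.items()
    for scenario, (loop_time, bulk_time) in scenarios.items()
}
average = sum(per_case.values()) / len(per_case)

for (backend, scenario), percent in per_case.items():
    print(f"{backend:9} {scenario:12} {percent:5.1f}% less time")
print(f"average: {average:.1f}%")
```

The spread is wide: "all updates" saves well over 90% on every backend, while "all creates" on sqlite saves far less, which is consistent with creates still being issued one INSERT at a time.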
Installation
pip install django-bulk-update-or-create
Add it to your INSTALLED_APPS list in settings.py
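A settings.py fragment might look like the sketch below. The app label bulk_update_or_create is an assumption inferred from the import path used in the Usage section, and the two django.contrib entries are only placeholders for whatever your project already lists.

```python
# settings.py (fragment)
INSTALLED_APPS = [
    "django.contrib.contenttypes",
    "django.contrib.auth",
    # Assumed app label, inferred from `from bulk_update_or_create import ...`
    "bulk_update_or_create",
]
```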
Usage
- use BulkUpdateOrCreateQuerySet as manager of your model(s)

from django.db import models
from bulk_update_or_create import BulkUpdateOrCreateQuerySet

class RandomData(models.Model):
    # Expose bulk_update_or_create on RandomData.objects
    objects = BulkUpdateOrCreateQuerySet.as_manager()

    # Unique field used below as match_field
    uuid = models.IntegerField(unique=True)
    data = models.CharField(max_length=200, null=True, blank=True)

- call bulk_update_or_create

items = [
    RandomData(uuid=1, data='data for 1'),
    RandomData(uuid=2, data='data for 2'),
]
# Match existing rows on uuid; update 'data' on those and create the rest
RandomData.objects.bulk_update_or_create(items, ['data'], match_field='uuid')
Docs
WIP
ToDo
- Docs!
- Add option to use bulk_create for creates: assert model is not multi-table, if enabled
- Fix the collation mess: the keyword arg case_insensitive_match should be dropped and collation detected at runtime
- Add support for multiple match_field - probably will need to use WHERE (K1=X and K2=Y) or (K1=.. and K2=..) instead of IN for those, as that SQL standard doesn't seem widely adopted yet
- Link to UPSERT alternative package once done!
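The multi-field matching item above could be approached roughly as follows. This is only a sketch of building the WHERE (K1=X and K2=Y) or (...) condition with sqlite3; the helper name, the table and its columns are hypothetical, and nothing here is part of the package.

```python
import sqlite3


def composite_match_where(columns, key_tuples):
    """Return ('(c1 = ? AND c2 = ?) OR (...)', flat_params) for the given keys.

    Column names are interpolated into the SQL, so they must be trusted.
    """
    if not columns or not key_tuples:
        raise ValueError("columns and key_tuples must not be empty")
    if any(len(key) != len(columns) for key in key_tuples):
        raise ValueError("every key must have one value per column")
    group = "(" + " AND ".join(f"{column} = ?" for column in columns) + ")"
    where = " OR ".join(group for _ in key_tuples)
    params = [value for key in key_tuples for value in key]
    return where, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (k1 INTEGER, k2 TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?, ?)",
    [(1, "a", "x"), (1, "b", "y"), (2, "a", "z"), (2, "b", "w")],
)

where, params = composite_match_where(["k1", "k2"], [(1, "a"), (2, "b")])
rows = conn.execute(
    f"SELECT k1, k2, data FROM pairs WHERE {where} ORDER BY k1, k2", params
).fetchall()
```

Filtering each column with its own IN list would also return (1, 'b') and (2, 'a') here, which is why the keys have to be matched as pairs.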