Sync database between Django backends
Django Data Sync
Enables you to sync non-sensitive data (including FileField) between environments with any Django backend (as long as the model definitions are the same), directly from the admin interface.
DISCLAIMER
There are no rigorous tests yet, and I haven't had the chance to explore how it behaves with complex relationships. So far it has been used in two production-grade projects whose models are not too complex (ManyToMany is not yet properly tested).
Use this at your own risk of data loss when syncing, or test it rigorously during your development phase.
Features
- sync non-sensitive data between Django environments (as long as the model definitions are the same) directly from the admin interface
- relation fields are supported (ManyToMany still needs to be tested)
- sync synchronously or in the background (only Cloud Tasks is supported for background sync)
TO BE ADDED
- add support for ImageField and FileField (DONE)
- support multiple task queues; the current plan is to support GCP Cloud Tasks (DONE)
- add authorization and authentication to the data export endpoint
- add tests; since it's not possible to test with two Django servers locally (or is it?), I still have to think about how to implement this correctly
MIGHT GET ADDED
- compare data in JSON for audit purposes
- add support for other task queues so that it is cloud-platform agnostic
Installation
pip install django-data-sync
Add data_sync to your INSTALLED_APPS:
    INSTALLED_APPS = [
        ...
        'data_sync',
        ...
    ]
Run migrate
python manage.py migrate data_sync
Add data_sync.urls to your urlpatterns. Take note of the URL prefix, as it will be used later when you set up a Data Source. For example, you will most likely include this in your api app, in which case the prefix is /api.
    path('', include('data_sync.urls')),
Preface
Data Sync works by making use of natural keys, so I strongly recommend reading the Django docs on this topic before going further.
You need to analyze your models and define their natural keys.
You can usually infer natural keys from unique fields and/or unique_together.
Fields that are declared unique or listed in unique_together are referred to only by field name. For example, a Language is related to a Country.
In the Language definition, unique_together is usually the Country plus the Language's ISO 639-1 code.
In code it looks something like this:
    unique_together = (('country', 'code'),)
Notice that country in unique_together is itself abstract.
What defines a country?
In the context of unique_together it is the Country's ID, but an ID is not a natural key.
A Country's natural key should be its ISO 2 code.
So we can infer that the natural key of Language, programmatically, is the Country's ISO 2 code plus the Language's ISO 639-1 code.
It looks like this when implemented in code:
    class Language(models.Model):
        def natural_key(self):
            return (self.country.code, self.code,)
In essence, a natural key is usually a combination of unique fields and/or unique_together, but it needs to be spelled out more explicitly: each related object is replaced by the fields that make up its own natural key.
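To illustrate why a natural key, rather than an ID, identifies the same row across environments, here is a minimal sketch using plain dataclasses in place of Django models. The class shapes and the sample values are assumptions made for illustration only; they are not part of Data Sync.

```python
from dataclasses import dataclass


@dataclass
class Country:
    id: int    # primary key, differs between environments
    code: str  # ISO 2 code, stable across environments

    def natural_key(self):
        return (self.code,)


@dataclass
class Language:
    id: int
    country: Country
    code: str  # ISO 639-1

    def natural_key(self):
        # Expand the related Country into its natural key instead of its ID.
        return self.country.natural_key() + (self.code,)


# The same logical row as it might exist in two environments (made-up IDs).
staging = Language(id=7, country=Country(id=3, code="ID"), code="id")
production = Language(id=42, country=Country(id=11, code="ID"), code="id")

print(staging.id == production.id)                        # False
print(staging.natural_key() == production.natural_key())  # True
```

The IDs disagree, yet both rows resolve to the same natural key, which is what lets a sync match them up.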
Usage
To get Data Sync working, you need to register the models that you want to sync. Only register non-sensitive models, e.g. copy. Never sync sensitive models such as User, as doing so can expose very sensitive data.
To register the models, decorate them and use the custom manager.
    from django.db import models
    import data_sync

    @data_sync.register_model(natural_key=['code'])
    class Country(models.Model):
        objects = data_sync.managers.DataSyncEnhancedManager()
        code = models.CharField(max_length=2)  # ISO 2
        ....

    @data_sync.register_model(natural_key=['country.code', 'code'])
    class Language(models.Model):
        objects = data_sync.managers.DataSyncEnhancedManager()
        code = models.CharField(max_length=2)  # ISO 639-1
        ....

    @data_sync.register_model(
        natural_key=['language.country.code', 'language.code', 'key'],
        fields=('value', 'key', 'language'),
        file_fields=('thumbnail',)
    )
    class Copy(models.Model):
        objects = data_sync.managers.DataSyncEnhancedManager()
        language = models.ForeignKey(Language, on_delete=models.CASCADE)
        value = models.TextField()
        key = models.CharField(max_length=255)
        default = models.TextField()
        thumbnail = models.ImageField()
        ....
@data_sync.register_model
Here you define your natural key (see Preface for more on this topic).
If the natural key includes a value from a related field, use . (dot) notation.
You can also pass the fields parameter to limit which fields are synced.
To include a FileField in Data Sync, add it to the file_fields parameter.
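The dot notation can be read as a chain of attribute lookups. The sketch below shows one way such paths could be resolved into a natural key tuple. It is not Data Sync's actual implementation: resolve_path and build_natural_key are hypothetical helpers, and SimpleNamespace objects stand in for model instances.

```python
from functools import reduce
from types import SimpleNamespace


def resolve_path(obj, dotted_path):
    """Follow a dotted attribute path such as 'language.country.code'."""
    return reduce(getattr, dotted_path.split("."), obj)


def build_natural_key(obj, natural_key_paths):
    """Return the natural key tuple described by a list of dotted paths."""
    return tuple(resolve_path(obj, path) for path in natural_key_paths)


# Stand-ins for Country, Language and Copy instances (illustrative values).
country = SimpleNamespace(code="ID")
language = SimpleNamespace(country=country, code="id")
copy = SimpleNamespace(language=language, key="homepage.title")

key = build_natural_key(copy, ["language.country.code", "language.code", "key"])
print(key)  # ('ID', 'id', 'homepage.title')
```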
DataSyncEnhancedManager
It looks like manager initialization is done at class loading time, so adding a custom manager programmatically might be considered hacky (I would really love input on this).
For now, I'm afraid you must define the custom manager yourself, under the default attribute name, i.e. objects, to use DataSyncEnhancedManager.
DataSyncEnhancedManager just adds a get_by_natural_key method and nothing else.
Worker tasks
When the code is deployed to GAE (and GAE only; flex and Kubernetes are not supported yet), data_sync automatically uses Cloud Tasks with the queue ID data_sync.
Settings and Configuration
Data Sync should work without additional settings when using synchronous mode, which is the default.
If you are deploying to GAE, it automatically uses Cloud Tasks, in which case you should fill in the optional settings below.
Optionals
DATA_SYNC_SERVICE_ACCOUNT_EMAIL
Defaults to '' (empty string). Set this to the email of a GCP service account. You can use the GAE default service account. It is needed for OIDC validation, as recommended by GCP.
DATA_SYNC_FORCE_SYNC
Defaults to False. Set this to True if you want to sync synchronously when deployed to GAE.
DATA_SYNC_CLOUD_TASKS_QUEUE_ID
Defaults to data_sync.
DATA_SYNC_CLOUD_TASKS_LOCATION
Defaults to europe-west1.
DATA_SYNC_GOOGLE_CLOUD_PROJECT
Defaults to the value of the GOOGLE_CLOUD_PROJECT env var.
DATA_SYNC_GAE_VERSION
Defaults to the value of the GAE_VERSION env var, which is already set by GAE.
DATA_SYNC_GAE_SERVICE
Defaults to the value of the GAE_SERVICE env var, which is already set by GAE.
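As a rough sketch, a settings.py fragment that spells out every optional setting with the defaults listed above might look like the following. The service account email is a placeholder that assumes the usual GAE default service account naming; replace it with your own.

```python
# settings.py (fragment)
import os

# Placeholder value: use your own service account email here.
DATA_SYNC_SERVICE_ACCOUNT_EMAIL = "my-project@appspot.gserviceaccount.com"

# Set to True to sync synchronously even when deployed to GAE.
DATA_SYNC_FORCE_SYNC = False

# The remaining values simply restate the documented defaults.
DATA_SYNC_CLOUD_TASKS_QUEUE_ID = "data_sync"
DATA_SYNC_CLOUD_TASKS_LOCATION = "europe-west1"
DATA_SYNC_GOOGLE_CLOUD_PROJECT = os.environ.get("GOOGLE_CLOUD_PROJECT")
DATA_SYNC_GAE_VERSION = os.environ.get("GAE_VERSION")
DATA_SYNC_GAE_SERVICE = os.environ.get("GAE_SERVICE")
```

Since most of these restate the defaults, in practice you would likely only need to set the ones that differ for your project.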
Data Source
Data Source holds information about an environment from which you want your data to be synced.
The URL depends on where and how you included data_sync.urls during installation.
For example, if you included data_sync.urls in your api app's urlpatterns, then the Data Source URL must include your api URL prefix, so it might look like https://example.com/api.
If you included data_sync.urls in your root urls, then the Data Source URL will look like https://example.com.
Do not include a trailing slash.
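If you want to double-check the value before saving it, the rule above can be sketched as a small helper. data_source_url is a hypothetical function written for this illustration; it is not provided by Data Sync.

```python
def data_source_url(base_url, prefix=""):
    """Join a site URL and an optional URL prefix, without a trailing slash."""
    base = base_url.rstrip("/")
    prefix = prefix.strip("/")
    return f"{base}/{prefix}" if prefix else base


# data_sync.urls included under the api app:
print(data_source_url("https://example.com/", "/api/"))  # https://example.com/api
# data_sync.urls included in the root urls:
print(data_source_url("https://example.com/"))           # https://example.com
```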
The Sync
To do a sync, simply create a Data Pull.
Compatibility
Python 3.7, Django 2.2 and up
Testing
No automated tests (yet).
To test locally, you can run two Django servers on different ports with different databases and set the Data Source accordingly.