Index Migration for ElasticSearch
Extension for the official ElasticSearch python client providing an indices_manager to create and manage indices with read and write aliases, and perform no-downtime migrations.
Installation
pip install slingshot
Usage
Instantiation
from weakref import proxy
from elasticsearch.client import Elasticsearch
from slingshot.indices_manager import IndicesManagerClient

es = Elasticsearch()
es.indices_manager = IndicesManagerClient(proxy(es))
Creation of a Managed Index
es.indices_manager.create('slingshot', body={"settings": {"number_of_shards": 1, "number_of_replicas": 1}})
This creates an index with read and write aliases:
Creates the index “slingshot.{creation_timestamp}”
Creates a read alias “slingshot”
Creates a write alias “slingshot.write”
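The naming scheme above can be sketched as a small helper. This is only an illustration: the function name managed_names is hypothetical, and since the exact timestamp format slingshot uses is not stated here, an integer Unix timestamp is assumed.

```python
import time


def managed_names(name, timestamp=None):
    """Return the index and alias names described above (illustrative only)."""
    # Assumption: an integer Unix timestamp; slingshot's real format may differ.
    if timestamp is None:
        timestamp = int(time.time())
    return {
        "index": "{0}.{1}".format(name, timestamp),
        "read_alias": name,
        "write_alias": "{0}.write".format(name),
    }


names = managed_names("slingshot", timestamp=1400000000)
print(names["index"])  # slingshot.1400000000
```

Because clients only ever use the two aliases, the timestamped index behind them can be replaced without changing application code.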
Upgrading an Existing Index
Slingshot manages the read and write aliases for the indices it creates. You can also bring an index that was not created with slingshot under management: the manage method simply adds a write alias to it so that it can be migrated later.
es.indices_manager.manage('existing_index')
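As a rough sketch of what adding a write alias amounts to, the helper below builds a request body in the format of Elasticsearch's update-aliases API. The helper name write_alias_actions is hypothetical, and whether manage issues exactly this request internally is an assumption; the ".write" suffix follows the naming shown for created indices.

```python
def write_alias_actions(index_name):
    """Build an update-aliases body that adds '<index>.write' to an index."""
    return {
        "actions": [
            {"add": {"index": index_name, "alias": index_name + ".write"}}
        ]
    }


body = write_alias_actions("existing_index")
print(body["actions"][0]["add"]["alias"])  # existing_index.write

# With a live cluster, a body like this could be sent with
# es.indices.update_aliases(body=body).
```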
Migration of a Managed Index
es.indices_manager.migrate('slingshot', body={"settings": {"number_of_shards": 5, "number_of_replicas": 1}})
This lets you change the configuration of an index and migrate its documents so that they take advantage of the new mappings:
Creates a new index “slingshot.{modification_timestamp}” with the new configuration (e.g. 5 shards instead of 1)
Swaps the write alias to the new index
Scans and bulk imports all documents (optionally ignoring types or performing transformations)
Swaps the read alias
Deletes the original index (can be skipped)
Note that the index must have been created or upgraded with slingshot (that is, it must have a write alias, created either by hand or with the manage method).
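To make the order of the steps concrete, here is a minimal in-memory sketch that mimics them on plain dictionaries. It does not talk to Elasticsearch and is not slingshot's implementation; the function migrate_in_memory and the index names "slingshot.1" and "slingshot.2" are made up for illustration.

```python
def migrate_in_memory(indices, aliases, name, new_index,
                      transform=None, keep_source=False):
    """Mimic the migration steps on plain dicts.

    indices maps an index name to a list of documents,
    aliases maps an alias name to the index it points at.
    """
    write_alias = name + ".write"
    source = aliases[write_alias]

    # 1. Create the new index.
    indices[new_index] = []
    # 2. Swap the write alias so new writes land in the new index.
    aliases[write_alias] = new_index
    # 3. Copy documents, optionally transforming or dropping them.
    for doc in indices[source]:
        migrated = transform(dict(doc)) if transform else dict(doc)
        if migrated is not None:
            indices[new_index].append(migrated)
    # 4. Swap the read alias, if one exists.
    if name in aliases:
        aliases[name] = new_index
    # 5. Delete the original index unless asked to keep it.
    if not keep_source:
        del indices[source]
    return new_index


indices = {"slingshot.1": [{"price": 10}, {"price": 20}]}
aliases = {"slingshot": "slingshot.1", "slingshot.write": "slingshot.1"}
migrate_in_memory(indices, aliases, "slingshot", "slingshot.2")
print(sorted(indices))  # ['slingshot.2']
```

Swapping the write alias before copying and the read alias after it means readers keep seeing the complete old index until the copy has finished, which is what makes the migration possible without downtime.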
Transforming Documents
When migrating, it can be useful to transform documents to match a new mapping.
from datetime import datetime

def transform_my_docs(doc):
    # Recompute some fields
    doc['_source']['discount'] = doc['_source']['price'] / doc['_source']['value'] * 100.0
    # Drop some fields (the default avoids a KeyError if the field is missing)
    doc['_source'].pop('useless', None)
    # Drop documents based on some business rules
    # (assumes the field has first been cast to a datetime)
    if doc['_source']['expires_at'] < datetime.now():
        return None
    # Don't forget to return the modified document
    return doc

es.indices_manager.migrate('slingshot', body=config_dict_or_string, transform=transform_my_docs)
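The contract of a transform function (return the document to keep it, return None to drop it) can be tried out without a cluster. The sketch below applies a transform to a list of hits; apply_transform and drop_free_items are hypothetical names, and how slingshot applies the function internally is an assumption.

```python
def apply_transform(hits, transform):
    """Yield transformed documents, skipping those the transform drops."""
    for hit in hits:
        result = transform(hit)
        if result is not None:
            yield result


def drop_free_items(doc):
    # Hypothetical business rule: drop documents whose price is zero.
    if doc["_source"]["price"] == 0:
        return None
    doc["_source"]["price_cents"] = doc["_source"]["price"] * 100
    return doc


hits = [
    {"_id": "1", "_source": {"price": 5}},
    {"_id": "2", "_source": {"price": 0}},
]
kept = list(apply_transform(hits, drop_free_items))
print([doc["_id"] for doc in kept])  # ['1']
```

Running a transform over a handful of sample hits like this is a cheap way to catch key errors before starting a full migration.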
Ignoring Document Types
It can also be useful to ignore some document types altogether.
es.indices_manager.migrate('slingshot', body=config_dict_or_string, ignore_types=["my_type_1", "my_type_2"])
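The effect of ignore_types can be pictured as a filter on each hit's _type field. This is a sketch only: without_types is a hypothetical helper, the "product" type is made up, and slingshot may well perform the filtering differently (for example on the server side).

```python
def without_types(hits, ignore_types):
    """Return only the hits whose '_type' is not in ignore_types."""
    ignored = set(ignore_types)
    return [hit for hit in hits if hit.get("_type") not in ignored]


hits = [
    {"_type": "my_type_1", "_id": "1"},
    {"_type": "product", "_id": "2"},
    {"_type": "my_type_2", "_id": "3"},
]
remaining = without_types(hits, ["my_type_1", "my_type_2"])
print([hit["_id"] for hit in remaining])  # ['2']
```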
Keeping the Source Index
If you wish to keep the original index after the migration (e.g. to roll back in case anything goes wrong), pass keep_source=True:
es.indices_manager.migrate('slingshot', body=config_dict_or_string, keep_source=True)
Warning
Slingshot is unable to predict what needs to be done with the settings, mappings, aliases, etc. of the new index.
Therefore, when migrating, body must contain all the relevant configuration to create an index from scratch. This can include settings, mappings, aliases, warmers or anything supported by the elasticsearch index API.
However, slingshot manages the migration of the write alias and the read alias (if it exists).
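As an illustration of a complete body, the dictionary below combines settings and mappings in one place. The type name "product" and its fields are made up, and the typed mapping layout assumes an older Elasticsearch version in which mappings are keyed by document type.

```python
new_index_body = {
    "settings": {"number_of_shards": 5, "number_of_replicas": 1},
    # Assumption: typed mappings, as used by older Elasticsearch versions;
    # the type and field names are invented for this example.
    "mappings": {
        "product": {
            "properties": {
                "price": {"type": "float"},
                "expires_at": {"type": "date"},
            }
        }
    },
}

# With a live cluster:
# es.indices_manager.migrate('slingshot', body=new_index_body)
```

Settings or mappings left out of the body are not carried over from the old index, so it is worth building the body from the same source of truth used to create the index in the first place.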
Running Tests
Get a copy of the repository:
git clone git@github.com:OohlaLabs/slingshot.git .
Install tox:
pip install tox
Run the tests:
tox
Contributions
All contributions and comments are welcome. Simply create a pull request or report a bug.
Changelog
v0.0.5
Reindex percolators after migrating data
v0.0.4
Allow passing create and copy kwargs to migrate
v0.0.3
Fix compatibility issues with latest versions of elasticsearch-py (<2.0.0)
Add support for parallel_bulk when migrating/copying
Reindex percolators when migrating/copying
v0.0.2
Fix six requirement to minimum version instead of exact version
v0.0.1
Initial
File details
Details for the file slingshot-0.0.5.tar.gz.
File metadata
- Download URL: slingshot-0.0.5.tar.gz
- Upload date:
- Size: 6.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 4223f3e2bcd40c1e7c8c183bc5fc13e835ed66be0e8d8737358cda949bccc524
MD5 | 6396ad079a21867aa10752de15b828bb
BLAKE2b-256 | 08d39dd34fefa04ab98a3e46f6bcf5af5baadc1f71c1a2a30b2ebcc2374f93b4
File details
Details for the file slingshot-0.0.5-py2.py3-none-any.whl.
File metadata
- Download URL: slingshot-0.0.5-py2.py3-none-any.whl
- Upload date:
- Size: 7.8 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | b59a7468fe021199fbea12c32790389f74511af009e13fd6bfa6cd5b08043e4d
MD5 | 636f2507874f9f6fcf443ecedcd21709
BLAKE2b-256 | a33880e67b06fc22dfe5c2ba2a12580c9808e2c5d390255062235b34409ee45f