
Index Migration for ElasticSearch

Extension for the official ElasticSearch Python client that provides an indices_manager for creating and managing indices with read and write aliases, and for performing no-downtime migrations.

Installation

pip install slingshot

Usage

Instantiation

from weakref import proxy
from elasticsearch.client import Elasticsearch
from slingshot.indices_manager import IndicesManagerClient

es = Elasticsearch()
es.indices_manager = IndicesManagerClient(proxy(es))

Creation of a Managed Index

es.indices_manager.create('slingshot', body={"settings": {"number_of_shards": 1, "number_of_replicas": 1}})

This creates an index with read and write aliases:

  • Creates the index “slingshot.{creation_timestamp}”

  • Creates a read alias “slingshot”

  • Creates a write alias “slingshot.write”
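
The naming scheme above can be sketched in plain Python. This is only an illustration: managed_names is a hypothetical helper, and rendering the creation timestamp as integer epoch seconds is an assumption, since the exact format slingshot uses is not stated here.

```python
import time


def managed_names(name, timestamp=None):
    """Return the index and alias names described above for a managed index.

    Assumption: the timestamp is rendered as integer epoch seconds;
    slingshot's actual format may differ.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return {
        "index": "{0}.{1}".format(name, timestamp),  # concrete index
        "read_alias": name,                          # used for searches
        "write_alias": "{0}.write".format(name),     # used for indexing
    }


names = managed_names("slingshot", timestamp=1400000000)
print(names)
```

Applications would then search through the read alias and index through the write alias, never addressing the timestamped index directly.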

Upgrading an Existing Index

Slingshot manages the read and write aliases for the indices it creates. However, you can also upgrade an index that was not created with slingshot. Upgrading simply creates the write alias that slingshot needs to handle migrations.

es.indices_manager.manage('existing_index')

Migration of a Managed Index

es.indices_manager.migrate('slingshot', body={"settings": {"number_of_shards": 5, "number_of_replicas": 1}})

This lets you change an index's configuration and migrate its documents to take advantage of new settings or mappings:

  • creates a new index “slingshot.{modification_timestamp}” with a new configuration (e.g. 5 shards instead of 1)

  • swaps write alias to the new index

  • scans and bulk imports all documents (optionally ignoring types or performing transformations)

  • swaps read alias

  • deletes original index (can be skipped)

Note that the index must have been created or upgraded with slingshot (by creating a write alias or using the manage method).
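
The order of the steps above can be pictured with a small in-memory simulation. This is a sketch under stated assumptions, not slingshot's implementation: migrate_in_memory and the dict-based "cluster" are hypothetical and no ElasticSearch calls are made.

```python
def migrate_in_memory(cluster, name, new_index, transform=None, keep_source=False):
    """Mimic the migration steps on a plain dict.

    cluster = {"indices": {index_name: [docs]}, "aliases": {alias: index_name}}
    """
    write_alias = name + ".write"
    source = cluster["aliases"][write_alias]

    # 1. create the new index
    cluster["indices"][new_index] = []
    # 2. swap the write alias so new writes land in the new index
    cluster["aliases"][write_alias] = new_index
    # 3. copy documents, applying the optional transform
    for doc in cluster["indices"][source]:
        if transform is not None:
            doc = transform(dict(doc))
        if doc is not None:  # a transform returning None drops the document
            cluster["indices"][new_index].append(doc)
    # 4. swap the read alias, if there is one
    if name in cluster["aliases"]:
        cluster["aliases"][name] = new_index
    # 5. delete the original index unless asked to keep it
    if not keep_source:
        del cluster["indices"][source]
    return cluster


cluster = {
    "indices": {"slingshot.1": [{"_id": "a"}, {"_id": "b"}]},
    "aliases": {"slingshot": "slingshot.1", "slingshot.write": "slingshot.1"},
}
migrate_in_memory(cluster, "slingshot", "slingshot.2")
print(cluster["aliases"])
```

Swapping the write alias before copying means documents indexed during the migration already go to the new index, while reads keep hitting the old one until the copy is complete.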

Transforming Documents

When migrating, it can be useful to transform documents to match a new mapping.

from datetime import datetime

def transform_my_docs(doc):
    # recompute some fields
    doc['_source']['discount'] = doc['_source']['price'] / doc['_source']['value'] * 100.0

    # drop some fields (the default avoids a KeyError if the field is absent)
    doc['_source'].pop('useless', None)

    # drop documents based on some business rules (assumes the field is first cast to a datetime)
    if doc['_source']['expires_at'] < datetime.now():
        return None

    # Don't forget to return the modified document
    return doc

es.indices_manager.migrate('slingshot', body=config_dict_or_string, transform=transform_my_docs)
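
A transform can be tried on hand-built hits before running a real migration. The sketch below is illustrative only: drop_expired, the sample hits, and the assumption that expires_at is stored as an ISO-8601 string without a timezone are all hypothetical.

```python
from datetime import datetime


def drop_expired(doc, now=None):
    """Return the hit unchanged, or None if its expires_at lies in the past."""
    now = now or datetime.now()
    # Assumption: the field is an ISO-8601 string such as "2014-06-01T00:00:00"
    expires_at = datetime.strptime(doc["_source"]["expires_at"], "%Y-%m-%dT%H:%M:%S")
    if expires_at < now:
        return None
    return doc


# Hand-built hits shaped like scan results (_id, _type, _source)
hits = [
    {"_id": "1", "_type": "my_type_1", "_source": {"expires_at": "2014-06-01T00:00:00"}},
    {"_id": "2", "_type": "my_type_1", "_source": {"expires_at": "2016-06-01T00:00:00"}},
]

fixed_now = datetime(2015, 1, 1)
kept = [d for d in (drop_expired(h, now=fixed_now) for h in hits) if d is not None]
kept_ids = [d["_id"] for d in kept]
print(kept_ids)
```

Passing now explicitly keeps the check deterministic, which makes the transform easy to unit test.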

Ignoring Document Types

It can also be useful to ignore some document types altogether.

es.indices_manager.migrate('slingshot', body=config_dict_or_string, ignore_types=["my_type_1", "my_type_2"])
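
The effect of ignore_types can be pictured as a filter on each hit's _type. This is a sketch of the observable behaviour, not of slingshot's internals; without_types and the sample hits are hypothetical.

```python
def without_types(hits, ignore_types):
    """Yield only the hits whose _type is not in ignore_types."""
    ignored = set(ignore_types)
    for hit in hits:
        if hit.get("_type") not in ignored:
            yield hit


hits = [
    {"_id": "1", "_type": "my_type_1"},
    {"_id": "2", "_type": "my_type_2"},
    {"_id": "3", "_type": "my_type_3"},
]
remaining = [h["_id"] for h in without_types(hits, ["my_type_1", "my_type_2"])]
print(remaining)
```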

Keeping the Source Index

If for any reason you wish to keep the original index after the migration (e.g. to roll back if anything goes wrong):

es.indices_manager.migrate('slingshot', body=config_dict_or_string, keep_source=True)
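
Rolling back is not something slingshot is described as doing for you. One possible approach, sketched here under assumptions, is to point both aliases back at the kept index with a single alias-update request. rollback_actions and the index names are hypothetical; the resulting dict follows the ElasticSearch aliases API format and, depending on your client version, could be passed to es.indices.update_aliases(body=...).

```python
def rollback_actions(name, new_index, kept_index):
    """Build an aliases-API body moving the read and write aliases back."""
    actions = []
    for alias in (name, name + ".write"):
        actions.append({"remove": {"index": new_index, "alias": alias}})
        actions.append({"add": {"index": kept_index, "alias": alias}})
    return {"actions": actions}


# Hypothetical index names; real ones carry slingshot's timestamps
body = rollback_actions("slingshot", "slingshot.2", "slingshot.1")
print(body)
```

Because all actions travel in one request, the aliases are swapped together. Documents written to the new index after the migration would not exist in the kept one.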

Warning

Slingshot is unable to predict what needs to be done with the settings, mappings, aliases, etc. of the new index.

Therefore, when migrating, body must contain all the relevant configuration to create an index from scratch. This can include settings, mappings, aliases, warmers or anything supported by the elasticsearch index API.

However, slingshot manages the migration of the write alias and the read alias (if it exists).
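
As an illustration of a complete body, the dict below carries both settings and mappings. The type name and field names are hypothetical, borrowed from the examples above, and the typed-mapping layout assumes an ElasticSearch version that still supports document types.

```python
# A full index definition: nothing is inherited from the old index
new_index_body = {
    "settings": {
        "number_of_shards": 5,
        "number_of_replicas": 1,
    },
    "mappings": {
        "my_type_1": {  # hypothetical document type
            "properties": {
                "price": {"type": "double"},
                "discount": {"type": "double"},
                "expires_at": {"type": "date"},
            }
        }
    },
}

print(sorted(new_index_body))
```

Such a dict would be passed as body to migrate, just like the smaller settings-only examples above.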

Running Tests

Get a copy of the repository:

git clone git@github.com:OohlaLabs/slingshot.git .

Install tox:

pip install tox

Run the tests:

tox

Contributions

All contributions and comments are welcome. Simply create a pull request or report a bug.

Changelog

v0.0.4

  • Allow passing create and copy kwargs to migrate

v0.0.3

  • Fix compatibility issues with latest versions of elasticsearch-py (<2.0.0)

  • Add support for parallel_bulk when migrating/copying

  • Reindex percolators when migrating/copying

v0.0.2

  • Fix six requirement to minimum version instead of exact version

v0.0.1

  • Initial
