Python package for simple migration of Elasticsearch indexes between servers.

Elasticsearch Reindex

How to use

  1. Install the project as a package:

    $ python setup.py install

  2. Whitelist the source_host Elasticsearch in the dest_host Elasticsearch.

Edit the Elasticsearch YML config on the destination host.

Config path:

/etc/elasticsearch/elasticsearch.yml

Add this setting to the end of the file:

reindex.remote.whitelist: <es-source-host>:<es-source-port>

This is a static setting, so restart the destination Elasticsearch for it to take effect.

  3. Use the CLI to run the data migration:

    $ elasticsearch_reindex --source_host=http://es-source-host:es-source-port --dest_host=http://es-dest-host:es-dest-port --check_interval=5 --concurrent_tasks=3
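Step 2 is needed because Elasticsearch's reindex-from-remote API only pulls data from hosts listed in reindex.remote.whitelist. As a rough illustration, the request body that the destination node's `_reindex` endpoint accepts for one index could be built as below. The helper name `build_reindex_body` is hypothetical and not part of this package, and the sketch does not claim to show how the package builds its requests internally.

```python
from typing import Any, Dict


def build_reindex_body(source_host: str, index: str) -> Dict[str, Any]:
    """Build a body for POST <dest_host>/_reindex (hypothetical helper).

    The destination node pulls `index` from the remote `source_host`,
    which must be listed in reindex.remote.whitelist on the destination.
    """
    return {
        "source": {
            "remote": {"host": source_host},
            "index": index,
        },
        # Keep the same index name on the destination cluster.
        "dest": {"index": index},
    }
```

For example, `build_reindex_body("http://localhost:9201", "es-index-1")` describes copying `es-index-1` from the source node used in the Python examples of this README.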

CLI params description:

Required fields:

  • source_host - Elasticsearch endpoint that data is extracted from.

  • dest_host - Elasticsearch endpoint that data is transferred to.

Optional fields:

  • check_interval - Time period (in seconds) between checks of task success status.

    Default value - 10 (seconds)

  • concurrent_tasks - How many tasks Elasticsearch will process in parallel.

    Default value - 1 (sync mode)

  • indexes - List of Elasticsearch indexes to migrate. If omitted, all source indexes are migrated.
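One way to picture how check_interval and concurrent_tasks interact is a polling loop that keeps a bounded number of tasks in flight. The sketch below is an assumption for illustration, not the package's actual implementation: `run_with_limit`, `submit` and `is_done` are hypothetical names. In a real migration, `submit` could wrap a `POST _reindex?wait_for_completion=false` call and `is_done` a `GET _tasks/<task_id>` call.

```python
import time
from collections import deque
from typing import Callable, Dict, Iterable, List


def run_with_limit(
    indexes: Iterable[str],
    submit: Callable[[str], str],
    is_done: Callable[[str], bool],
    concurrent_tasks: int = 1,
    check_interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run one task per index, with at most `concurrent_tasks` in flight."""
    pending = deque(indexes)
    running: Dict[str, str] = {}  # task id -> index name
    finished: List[str] = []
    while pending or running:
        # Fill free slots up to the concurrency limit.
        while pending and len(running) < concurrent_tasks:
            index = pending.popleft()
            running[submit(index)] = index
        # Wait one interval, then collect the tasks that have completed.
        sleep(check_interval)
        for task_id in list(running):
            if is_done(task_id):
                finished.append(running.pop(task_id))
    return finished
```

With the default concurrent_tasks=1 the loop degenerates to the sync mode described above: each index is submitted only after the previous one has finished.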

Run the library from a Python script:

from elasticsearch_reindex import Manager

INIT_CONFIG = {
    "source_host": "http://localhost:9201",
    "dest_host": "http://localhost:9202",
    "check_interval": 20,
    "concurrent_tasks": 5,
}


def main():
    manager = Manager.from_dict(data=INIT_CONFIG)
    manager.start_reindex()


if __name__ == "__main__":
    main()

With a custom list of indexes:

from elasticsearch_reindex import Manager

INIT_CONFIG = {
    "source_host": "http://localhost:9201",
    "dest_host": "http://localhost:9202",
    "check_interval": 20,
    "concurrent_tasks": 5,
    "indexes": ["es-index-1", "es-index-2", "es-index-n"]
}


def main():
    manager = Manager.from_dict(data=INIT_CONFIG)
    manager.start_reindex()


if __name__ == "__main__":
    main()

Local install

Set up and activate a Python 3 virtualenv via your preferred method and install the production requirements, e.g.:

$ make ve

To remove the virtualenv:

$ make clean

To install the Git hooks:

$ make install_hooks

Create a .env file and fill in the values:

$ cp .env.example .env

Export the env variables:

$ export $(xargs < .env)
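If the same values are needed inside a Python script rather than the shell, a .env file of plain KEY=VALUE lines can be read with a few lines of code. This is a minimal sketch under that assumption (no quoting or variable expansion); `parse_env` is a hypothetical helper, not something this package provides.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values
```

The result can be merged into the process environment with `os.environ.update(parse_env(open(".env").read()))`.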

Env variables description:

Testing mode:

  • ENV - enables testing mode. Set it to test to activate test mode.

Elasticsearch docker settings:

  • ES_SOURCE_PORT - Source Elasticsearch port

  • ES_DEST_PORT - Destination Elasticsearch port

  • ES_VERSION - Elasticsearch version

  • LOCAL_IP - Address of your local host machine on the LAN.

You can find it as follows:

  • Mac OS:

    $ ifconfig | grep "inet " | grep -v 127.0.0.1 | cut -d\ -f2 | head -n 1

  • Linux (look for it in the output):

    $ ip r

Tests

First, start the docker-compose services with two Elasticsearch nodes:

$ docker-compose up -d

Ensure that the Elasticsearch nodes started correctly.

The env variables used below are set from the .env file.

For the source Elasticsearch:

$ curl -X GET $LOCAL_IP:$ES_SOURCE_PORT

For the destination Elasticsearch:

$ curl -X GET $LOCAL_IP:$ES_DEST_PORT
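The same check can be scripted. The sketch below uses only the standard library and assumes the nodes answer plain HTTP without authentication, as the curl commands above do. `node_url`, `node_info` and `main` are hypothetical names introduced for illustration.

```python
import json
import os
import urllib.request


def node_url(host: str, port: str) -> str:
    """Build the base URL of an Elasticsearch node."""
    return f"http://{host}:{port}"


def node_info(url: str, timeout: float = 5.0) -> dict:
    """GET the node's root endpoint and decode its JSON response."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def main() -> None:
    # Reads the same variables that were exported from the .env file.
    host = os.environ["LOCAL_IP"]
    for name in ("ES_SOURCE_PORT", "ES_DEST_PORT"):
        info = node_info(node_url(host, os.environ[name]))
        print(name, info.get("version", {}).get("number"))
```

Calling `main()` prints the reported version of each node, or raises an error if a node is not reachable.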

Add the project root to the PYTHONPATH env variable:

$ export PYTHONPATH="."

To run the tests with pytest:

$ pytest ./tests
