Python package for simple migration of Elasticsearch indexes between different Elasticsearch nodes.

Elasticsearch Reindex

Description

elasticsearch-reindex is a CLI tool for transferring Elasticsearch indexes between different servers.

Installing

Install the package using pip:

pip install elasticsearch-reindex

Usage

Configuration

Ensure the source Elasticsearch host is whitelisted in the destination host. Edit the elasticsearch.yml configuration file on the destination Elasticsearch server.

Path to the config file:

/etc/elasticsearch/elasticsearch.yml

Add the following line to the file:

reindex.remote.whitelist: <es-source-host>:<es-source-port>
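The whitelist entry takes the form host:port, without the URL scheme that the CLI's source_host value carries. A minimal sketch of deriving the entry from a source URL follows; the helper name whitelist_entry and the port 9200 in the example are assumptions for illustration, not part of the package.

```python
from urllib.parse import urlsplit


def whitelist_entry(source_url: str) -> str:
    """Return the host:port part of a source URL (hypothetical helper)."""
    parts = urlsplit(source_url)
    # The whitelist line needs an explicit host and port, with no scheme.
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"expected scheme://host:port, got {source_url!r}")
    return f"{parts.hostname}:{parts.port}"


if __name__ == "__main__":
    # Example URL only; substitute the real source host and port.
    entry = whitelist_entry("https://es-source-host:9200")
    print(f"reindex.remote.whitelist: {entry}")
```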

Running the Tool

Use the CLI to migrate data between Elasticsearch instances:

elasticsearch_reindex \
        --source_host "http(s)://es-source-host:es-source-port" \
        --source_http_auth "username:password" \
        --dest_host "http(s)://es-dest-host:es-dest-port" \
        --dest_http_auth "username:password" \
        --check_interval 5 \
        --concurrent_tasks 3 \
        -i test_index_1 -i test_index_2

The command is also available under the alias elasticsearch-reindex:

elasticsearch-reindex ...
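When the CLI is driven from another program, passing the arguments as a list avoids shell quoting problems with URLs and passwords. The sketch below uses only the flags shown in the command above; build_command and run_reindex are hypothetical helper names, and the hosts are example values.

```python
import shlex
import subprocess
from typing import List, Optional, Sequence


def build_command(
    source_host: str,
    dest_host: str,
    indexes: Sequence[str] = (),
    check_interval: Optional[int] = None,
    concurrent_tasks: Optional[int] = None,
) -> List[str]:
    """Assemble the elasticsearch_reindex argument list (hypothetical helper)."""
    cmd = [
        "elasticsearch_reindex",
        "--source_host", source_host,
        "--dest_host", dest_host,
    ]
    if check_interval is not None:
        cmd += ["--check_interval", str(check_interval)]
    if concurrent_tasks is not None:
        cmd += ["--concurrent_tasks", str(concurrent_tasks)]
    # The -i flag is repeated once per index.
    for index in indexes:
        cmd += ["-i", index]
    return cmd


def run_reindex(cmd: List[str]) -> None:
    """Run the CLI without a shell; raises CalledProcessError on failure."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    command = build_command(
        "http://localhost:9201",
        "http://localhost:9202",
        indexes=["test_index_1", "test_index_2"],
        check_interval=5,
        concurrent_tasks=3,
    )
    # Only print the command here; call run_reindex(command) to execute it.
    print(shlex.join(command))
```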

CLI Parameters

Required fields:

  • source_host - Elasticsearch endpoint from which data will be extracted.

  • dest_host - Elasticsearch endpoint to which data will be transferred.

Optional fields:

  • source_http_auth - HTTP Basic authentication, username and password.

  • dest_http_auth - HTTP Basic authentication, username and password.

  • check_interval - Interval (in seconds) between checks of task completion status.

    Default value - 10 (seconds)

  • concurrent_tasks - Number of tasks Elasticsearch will process in parallel.

    Default value - 1 (sync mode)

  • indexes - List of indexes to migrate. If omitted, all source indexes are migrated.
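The interplay of required fields and defaults listed above can be sketched as a small pre-check on a configuration dictionary. This is an illustration only, not the package's own validation; DEFAULTS, REQUIRED, and with_defaults are hypothetical names, and the default values are the ones stated in the list.

```python
from typing import Any, Dict

# Defaults as documented in the parameter list above.
DEFAULTS: Dict[str, Any] = {"check_interval": 10, "concurrent_tasks": 1}
REQUIRED = ("source_host", "dest_host")


def with_defaults(config: Dict[str, Any]) -> Dict[str, Any]:
    """Check required fields and fill in documented defaults (hypothetical helper)."""
    missing = [key for key in REQUIRED if not config.get(key)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    # Values supplied by the caller override the defaults.
    return {**DEFAULTS, **config}


if __name__ == "__main__":
    print(with_defaults({
        "source_host": "http://localhost:9201",
        "dest_host": "http://localhost:9202",
    }))
```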

Running the library from a Python script:

from elasticsearch_reindex import ReindexManager


def main() -> None:
    """
    Example reindex function.
    """
    dict_config = {
        "source_host": "http://localhost:9201",
        "dest_host": "http://localhost:9202",
        "check_interval": 20,
        "concurrent_tasks": 5,
    }
    reindex_manager = ReindexManager.from_dict(data=dict_config)
    reindex_manager.start_reindex()


if __name__ == "__main__":
    main()

With custom indexes and optional HTTP Basic authentication:

from elasticsearch_reindex import ReindexManager


def main() -> None:
    """
    Example reindex function with HTTP Basic authentication.
    """
    dict_config = {
        "source_host": "http://localhost:9201",
        # If the source host requires authentication
        # "source_http_auth": "tmp-source-user:tmp-source-PASSWD.220718",
        "dest_host": "http://localhost:9202",
        # If the destination host requires authentication
        # "dest_http_auth": "tmp-reindex-user:tmp--PASSWD.220718",
        "check_interval": 20,
        "concurrent_tasks": 5,
        "indexes": ["es-index-1", "es-index-2", "es-index-n"],
    }
    reindex_manager = ReindexManager.from_dict(data=dict_config)
    reindex_manager.start_reindex()


if __name__ == "__main__":
    main()

Local install

Set up and activate a Python 3 virtual environment:

make ve

To install Git hooks:

make install_hooks

Create a .env file and fill in the values:

cp .env.example .env

Export env variables:

export $(xargs < .env)
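The export command affects only the current shell session. If a Python script needs the same values without relying on the shell, a minimal sketch of loading KEY=VALUE lines could look like the following. The helper names are hypothetical, the sketch assumes simple unquoted values, and unlike the shell command it skips blank lines and # comments.

```python
import os
from typing import Dict, Iterable


def parse_env_lines(lines: Iterable[str]) -> Dict[str, str]:
    """Parse KEY=VALUE lines into a dictionary (hypothetical helper)."""
    values: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_env_file(path: str = ".env") -> None:
    """Copy the parsed values into this process's environment."""
    with open(path, encoding="utf-8") as handle:
        os.environ.update(parse_env_lines(handle))


if __name__ == "__main__":
    # Example lines only; real values come from the .env file.
    print(parse_env_lines(["ENV=test\n", "# comment\n", "ES_SOURCE_PORT=9201\n"]))
```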

Key environment variables:

Variable to enable testing:

  • ENV - enables testing mode. Set it to test to activate test mode.

Elasticsearch Docker settings:

  • ES_SOURCE_PORT - Source Elasticsearch port

  • ES_DEST_PORT - Destination Elasticsearch port

  • ES_VERSION - Elasticsearch version

  • LOCAL_IP - Address of your local host machine on the LAN, for example 192.168.4.106.

How to find your local IP:

  • macOS (look for it in the output):
ifconfig
  • Linux (look for it in the output):
ip r
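As a cross-platform alternative to reading the command output by hand, the address can be guessed from Python. This is a best-effort sketch, not part of the package: local_ip is a hypothetical helper, and on machines with several network interfaces the result may not be the address you want for LOCAL_IP.

```python
import socket


def local_ip() -> str:
    """Best-effort guess of this machine's LAN address (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only makes the OS
        # choose the outgoing interface. 192.0.2.1 is a reserved
        # documentation address, used here purely as a routing target.
        sock.connect(("192.0.2.1", 80))
        return sock.getsockname()[0]
    except OSError:
        # No usable route: fall back to the loopback address.
        return "127.0.0.1"
    finally:
        sock.close()


if __name__ == "__main__":
    print(local_ip())
```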

Testing

Start Elasticsearch nodes using Docker Compose:

docker-compose up -d

Verify Elasticsearch nodes are running:

  • Source Elasticsearch:
curl -X GET $LOCAL_IP:$ES_SOURCE_PORT
  • Destination Elasticsearch:
curl -X GET $LOCAL_IP:$ES_DEST_PORT
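The same check can be scripted with the Python standard library. The sketch below assumes plain HTTP without authentication, matching the curl commands above; a node with security enabled would be reported as down. node_url and is_node_up are hypothetical helper names.

```python
import os
import urllib.request


def node_url(ip: str, port: str) -> str:
    """Build the node's base URL, assuming plain HTTP."""
    return f"http://{ip}:{port}"


def is_node_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if the node answers the root endpoint with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses.
        return False


if __name__ == "__main__":
    ip = os.environ.get("LOCAL_IP", "127.0.0.1")
    for name in ("ES_SOURCE_PORT", "ES_DEST_PORT"):
        port = os.environ.get(name)
        if port is None:
            print(f"{name} is not set")
            continue
        url = node_url(ip, port)
        print(url, "up" if is_node_up(url) else "down")
```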

Set the PYTHONPATH environment variable:

export PYTHONPATH="."

To run the tests with pytest:

make test

To run the tests with pytest and a coverage report:

make test-cov
