
A command-line tool written in Python that helps you quickly load JSON data into Elasticsearch.


es-indexer

MIT license Linux macOS windows

es-indexer (Elasticsearch Indexer) is a simple, concurrent command-line tool written in Python that helps you quickly load JSON data into Elasticsearch.

About

Usually you have to use third-party software or a client library to index data into Elasticsearch, and setting that up can be time-consuming and tiresome (cough, Logstash, cough). es-indexer indexes the raw contents of *.json documents quickly with the help of multiprocessing.
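To illustrate the idea of reading *.json documents with multiprocessing, here is a minimal sketch. It is not es-indexer's actual code: the function names `load_document` and `load_folder` and the `workers` parameter are hypothetical, and the sketch only parses the files, it does not send anything to Elasticsearch.

```python
import json
from multiprocessing import Pool
from pathlib import Path


def load_document(path):
    """Read one *.json file and return its parsed contents."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def load_folder(folder, workers=4):
    """Parse every *.json file in `folder`, spreading the work over processes."""
    # Sort the paths so the result order is predictable.
    paths = sorted(str(p) for p in Path(folder).glob("*.json"))
    if workers <= 1 or len(paths) <= 1:
        # Not worth starting worker processes for a trivial job.
        return [load_document(p) for p in paths]
    with Pool(processes=workers) as pool:
        # pool.map returns results in the same order as `paths`.
        return pool.map(load_document, paths)
```

On platforms that start worker processes by spawning a fresh interpreter (Windows, and macOS by default), call `load_folder` from inside an `if __name__ == "__main__":` block so the workers can import the module safely.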

es-indexer does not currently sync data: if the source data changes, you have to reindex it. Each run populates a new index and then creates an alias, so the old data stays available until the new index is fully populated. A future update might include syncing.
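The "new index, then alias" pattern can be sketched as follows. This is an illustration under assumptions, not es-indexer's implementation: the helper `alias_swap_actions` and the versioned index names are hypothetical. The body it builds follows the shape accepted by Elasticsearch's `POST /_aliases` endpoint, which applies all listed actions atomically.

```python
import json


def alias_swap_actions(alias, new_index, old_index=None):
    """Build the body for POST /_aliases that repoints `alias` to `new_index`."""
    actions = []
    if old_index is not None:
        # Detach the alias from the index that served the old data.
        actions.append({"remove": {"index": old_index, "alias": alias}})
    # Attach the alias to the freshly populated index.
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


# Hypothetical index names; only the alias name comes from the config example.
body = alias_swap_actions("twitter-example", "twitter-example-v2", "twitter-example-v1")
print(json.dumps(body, indent=2))
```

Because clients query the alias rather than a concrete index, they keep seeing the old data until the swap happens.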

Since Elasticsearch exposes a REST API on port 9200, there is no need for es-indexer to provide a REST API itself.
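As a rough example of talking to that REST API directly, the sketch below builds a `_search` request using only the standard library. The helper name `build_search_request` is hypothetical; the host and index values are taken from the config example in the Usage section, and the request is only sent if you uncomment the last lines against a running cluster.

```python
import json
import urllib.request


def build_search_request(host, index, query):
    """Prepare a POST {host}/{index}/_search request with a JSON query body."""
    url = f"{host.rstrip('/')}/{index}/_search"
    data = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(
    "http://127.0.0.1:9200", "twitter-example", {"match_all": {}}
)

# To send it to a running Elasticsearch instance:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```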

Installation

Requires Python 3.x and is compatible with Elasticsearch 7.x.x.

  • es-indexer can be installed with the help of pip.
    $ pip install es-indexer
    

(OR)

  • Clone the repository.

    $ git clone https://github.com/itsron717/es-indexer.git
    
  • Move inside the repo.

    $ cd es-indexer
    
  • Install the package locally.

    $ pip install .
    

Usage

Config

You need to create a config.yml before running es-indexer:

host: http://127.0.0.1:9200
index: twitter-example
type: documents
mapping:
    settings:
        number_of_shards: 1
        number_of_replicas: 0

You can provide a custom mapping in the config file; es-indexer converts the YAML mapping 1:1 to JSON.
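To make the 1:1 conversion concrete, here is a small sketch of what the `mapping` block from the config above looks like once serialized to JSON. The dictionary is written by hand to mirror what a YAML parser (for example PyYAML's `yaml.safe_load`) would return; whether es-indexer uses that parser is an assumption, and `mapping_to_json` is a hypothetical helper.

```python
import json

# Hand-written equivalent of the parsed config.yml shown above.
config = {
    "host": "http://127.0.0.1:9200",
    "index": "twitter-example",
    "type": "documents",
    "mapping": {
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0,
        }
    },
}


def mapping_to_json(config):
    """Serialize the `mapping` section unchanged, key for key."""
    return json.dumps(config["mapping"], indent=2)


print(mapping_to_json(config))
```

Nothing is renamed or restructured along the way, so any key you put under `mapping` in the YAML ends up under the same key in the JSON.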

$ es-indexer --config path/to/config/file --source path/to/json/folder

Adding more Data Sources

Data sources other than JSON, such as SQL and the filesystem, are planned for es-indexer so that it can become a one-stop shop for all Elasticsearch indexing needs. Anybody who'd like to contribute by integrating other data sources can raise an issue and we can start working on it!

To-Do

  • Add JSON support.
  • Add SQL data source integration.
  • Add FileSystem data source integration.
  • Increase the speed of indexing.
  • Add tests.

References

es-indexer was inspired by this amazing tool written in Go.

License

The MIT License (MIT)

Copyright (c) 2019 Rounak Vyas
