Export Solr Nodes to Elasticsearch Indexes

Solr to Elasticsearch Migrator

This will migrate a Solr node to an Elasticsearch index.

Requirements

  • Python 3+
    • elasticsearch
    • pysolr

Usage

usage: solr-to-es [-h] [--solr-query SOLR_QUERY] [--solr-fields COMMA_SEP_FIELDS]
                  [--rows-per-page ROWS_PER_PAGE] [--es-timeout ES_TIMEOUT]
                  solr_url elasticsearch_url elasticsearch_index doc_type

The following example pages through all documents in the local Solr collection and submits them to the local Elasticsearch server, in the index <<collection_name>>, with a document type of solr_docs.

solr-to-es http://localhost:8983/solr/<<collection_name>> http://localhost:9200 <<collection_name>> solr_docs
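The paging-and-submit flow described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: `iter_solr_docs`, `to_bulk_actions`, and the `fetch_page` callable are hypothetical names introduced here, and the action layout (`_index`, `_type`, `_source`) is an assumption based on the bulk format accepted by older Elasticsearch versions that still support document types.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Doc = Dict[str, object]


def iter_solr_docs(fetch_page: Callable[[int, int], List[Doc]],
                   rows_per_page: int = 500) -> Iterator[Doc]:
    """Yield every document by requesting successive pages.

    fetch_page(start, rows) is a hypothetical callable returning the
    documents of one page; an empty page signals the end of the results.
    """
    start = 0
    while True:
        page = fetch_page(start, rows_per_page)
        if not page:
            break
        yield from page
        # Advance the offset by the number of documents actually received.
        start += len(page)


def to_bulk_actions(docs: Iterable[Doc], index: str,
                    doc_type: str) -> Iterator[Dict[str, object]]:
    """Wrap each Solr document in a bulk-style action (assumed layout)."""
    for doc in docs:
        yield {"_index": index, "_type": doc_type, "_source": doc}


# Stand-in for a Solr collection so the sketch runs without a server.
fake_collection = [{"id": str(i)} for i in range(7)]


def fake_fetch_page(start: int, rows: int) -> List[Doc]:
    return fake_collection[start:start + rows]


actions = list(
    to_bulk_actions(iter_solr_docs(fake_fetch_page, rows_per_page=3),
                    index="es_index", doc_type="solr_docs")
)
print(len(actions))
```

With the real libraries, `fetch_page` could wrap a call such as `pysolr.Solr(solr_url).search(query, start=start, rows=rows)`, and the resulting actions could be handed to `elasticsearch.helpers.bulk`. Both wirings are assumptions about how one might connect the pieces.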

solr_url is the full URL of your Solr collection, including the collection name as in the example above.

elasticsearch_url is the url of your Elasticsearch server.

elasticsearch_index is the index you will submit the Solr documents to on Elasticsearch.

doc_type is the type of document Elasticsearch should assume you are importing.

--solr-query defaults to *:*

--solr-fields defaults to an empty value (i.e. all fields)

--rows-per-page defaults to 500

--es-timeout defaults to 60

--es-user for authentication in Elasticsearch

--es-password for authentication in Elasticsearch

--es-max-retries maximum number of times a document will be retried when a 429 response is received; set to 0 for no retries on 429

--es-initial-backoff number of seconds to wait before the first retry. Each subsequent retry waits initial_backoff * 2**retry_number seconds, so the delay doubles with every retry.
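To make the retry schedule concrete, here is a small sketch of the delays implied by the formula initial_backoff * 2**retry_number. It assumes retry_number starts at 0 for the first retry, so that the first wait equals the initial backoff as the option description states. The helper name `backoff_delays` is hypothetical, and any upper limit the tool may place on the delay is not modelled.

```python
from typing import List


def backoff_delays(initial_backoff: float, max_retries: int) -> List[float]:
    """Return the wait (in seconds) before each retry.

    Assumption: retry_number is 0 for the first retry, so the first delay
    is initial_backoff and every later delay is twice the previous one.
    """
    return [initial_backoff * 2 ** retry_number
            for retry_number in range(max_retries)]


# Example values chosen for illustration only.
delays = backoff_delays(initial_backoff=2, max_retries=4)
print(delays)  # [2, 4, 8, 16]
```

With max_retries set to 0 the list is empty, which corresponds to no retries on 429.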

Install

Run python setup.py install to install the script.

Demo

Here is an example of fetching the more than 114 thousand journal articles about animals from the PLOS API (api.plos.org).

docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch

solr-to-es --solr-query animal http://api.plos.org/search localhost:9200 es_plos solr_docs

curl "http://localhost:9200/_cat/indices?v"

Note: you will get a 403 Forbidden error from the script because solr.quepid.com doesn't allow deep paging; however, you will still have documents in your ES cluster.

