Export Solr Nodes to Elasticsearch Indexes
Project description
Solr to Elasticsearch Migrator
This will migrate a Solr node to an Elasticsearch index.
Requirements
- Python 3+
- elasticsearch
- pysolr
Usage
usage: solr-to-es [-h] [--solr-query SOLR_QUERY] [--solr-fields COMMA_SEP_FIELDS]
[--rows-per-page ROWS_PER_PAGE] [--es-timeout ES_TIMEOUT]
solr_url elasticsearch_url elasticsearch_index doc_type
The following example will page through all documents on the local Solr, and submit them to the local Elasticsearch server in the index es_index
with a document type of solr_docs
.
solr-to-es http://localhost:8983/solr/<<collection_name>> http://localhost:9200 <<collection_name>> solr_docs
solr_url
is the full url to your Solr,
elasticsearch_url
is the url of your Elasticsearch server.
elasticsearch_index
is the index you will submit the Solr documents to on Elasticsearch.
doc_type
is the type of document Elasticsearch should assume you are importing.
--solr-query
defaults to *:*
--solr-fields
defaults to
(i.e. all fields)
--rows-per-page
defaults to 500
--es-timeout
defaults to 60
--es-user
for authentication in Elasticsearch
--es-password
for authentication in Elasticsearch
--es-max-retries
maximum number of times a document will be retried when 429 is received, set to 0 for no retries on 429
--es-initial-backoff
number of seconds we should wait before the first retry. Any subsequent retries will be powers of initial_backoff * 2**retry_number
Install
Run python setup.py install
to install the script.
Demo
Here is an example of grabbing the over 114 thousand journal articles from Plos.org API about animals.
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch
solr-to-es --solr-query animal http://api.plos.org/search localhost:9200 es_plos solr_docs
curl http://localhost:9200/_cat/indices?v
Note: that you will get an 403 Forbidden error from the script, and that is because the solr.quepid.com doesn't allow deep paging, however you will have documents in your ES cluster.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
File details
Details for the file pds.solr_to_es-0.3.0-py3-none-any.whl
.
File metadata
- Download URL: pds.solr_to_es-0.3.0-py3-none-any.whl
- Upload date:
- Size: 6.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.17
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | cfeea2e901876921ce43f0e0b17449c5095d90c8b1f156196060bc07944c09d4 |
|
MD5 | 1f226abd7e6249bc256c73625aa2a7cc |
|
BLAKE2b-256 | 77d5e63532882cd7a57cd10db16ec3ebe7daf31f8a2d0454d4643799fa8d0fb3 |