DBnomics Solr
Index DBnomics data into Apache Solr for full-text and faceted search.
Requirements:
- a running instance of Apache Solr; at the time of writing, we use version 7.3.
See dbnomics-docker to run a local DBnomics instance with Docker that includes a service for Apache Solr.
Configuration
Environment variables:
- DEBUG_PYSOLR: display pysolr DEBUG logging messages (see https://github.com/django-haystack/pysolr)
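As an illustration of what such a switch typically does, here is a minimal sketch that raises the "pysolr" logger to DEBUG when the variable is set. The function name configure_pysolr_logging is hypothetical, and treating any non-empty value as "enabled" is an assumption; the actual dbnomics-solr code may differ.

```python
import logging
import os


def configure_pysolr_logging(environ=os.environ) -> bool:
    """Enable DEBUG output for the "pysolr" logger when DEBUG_PYSOLR is set.

    Returns True if debug logging was enabled, False otherwise.
    """
    # Assumption: any non-empty value of DEBUG_PYSOLR enables debug logging.
    if not environ.get("DEBUG_PYSOLR"):
        return False
    # Install a default handler on the root logger so messages are displayed.
    logging.basicConfig()
    # pysolr logs through the standard logging module under the name "pysolr".
    logging.getLogger("pysolr").setLevel(logging.DEBUG)
    return True
```

With a setup like this, running the command with DEBUG_PYSOLR=1 in the environment would print the requests that pysolr sends to Solr.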
Index a provider
Replace wto with the actual provider slug in the following command:
dbnomics-solr index-provider /path/to/wto-json-data
Full mode vs incremental mode
When data is stored in a regular directory, the script always indexes all datasets and series of a provider. This is called full mode.
When data is stored in a Git repository, the script runs in incremental mode by default: it indexes only the datasets modified since the last indexing run.
Use the --full option to force full mode.
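The mode selection described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the Git detection heuristic (a .git entry for a working tree, or HEAD plus objects/ for a bare repository) is an assumption rather than the script's actual logic.

```python
from pathlib import Path


def is_git_repository(storage_dir: Path) -> bool:
    """Heuristic check: working tree (.git entry) or bare repo (HEAD + objects/)."""
    if (storage_dir / ".git").exists():
        return True
    return (storage_dir / "HEAD").is_file() and (storage_dir / "objects").is_dir()


def choose_index_mode(storage_dir: Path, force_full: bool = False) -> str:
    """Return "full" or "incremental" following the rules described above."""
    # A regular directory is always indexed in full; --full forces full mode too.
    if force_full or not is_git_repository(storage_dir):
        return "full"
    # A Git repository defaults to incremental mode.
    return "incremental"
```

Incremental mode depends on Git history to tell which datasets changed, which is why a regular directory can only be indexed in full.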
Bare repositories
The --bare-repo-fallback option makes the script try the storage directory name with .git appended when the given directory is not found.
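A minimal sketch of this fallback, assuming it only applies when the requested directory does not exist. The function name resolve_storage_dir and the error raised are hypothetical and not taken from the dbnomics-solr source.

```python
from pathlib import Path


def resolve_storage_dir(storage_dir: Path, bare_repo_fallback: bool = False) -> Path:
    """Return the directory to index, trying "<name>.git" if the fallback is enabled."""
    if storage_dir.is_dir():
        return storage_dir
    if bare_repo_fallback:
        # Bare repositories are conventionally named with a ".git" suffix.
        candidate = storage_dir.with_name(storage_dir.name + ".git")
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError(f"Storage directory not found: {storage_dir}")
```

For example, with the fallback enabled, a request for /path/to/wto-json-data would resolve to /path/to/wto-json-data.git when only the latter exists.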
Remove all data from a provider
To remove all the documents related to a provider (type:provider, type:dataset and type:series):
dbnomics-solr --debug delete-provider <provider_code>
Example:
dbnomics-solr --debug delete-provider WTO
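Conceptually, such a command amounts to a Solr delete-by-query. The sketch below shows how it could look with pysolr, whose Solr.delete method accepts a q argument. The field name provider_code is an assumption about the index schema, and both function names are hypothetical; this is not the command's actual implementation.

```python
def build_delete_query(provider_code: str) -> str:
    """Build a Solr query matching every document of one provider."""
    # Escape backslashes and double quotes so the code is a valid quoted phrase.
    escaped = provider_code.replace("\\", "\\\\").replace('"', '\\"')
    # Assumption: documents store their provider in a "provider_code" field.
    return f'provider_code:"{escaped}" AND type:(provider OR dataset OR series)'


def delete_provider(solr, provider_code: str) -> None:
    """Delete all documents of a provider.

    `solr` is expected to be a pysolr.Solr client, or any object exposing a
    compatible delete(q=...) method.
    """
    solr.delete(q=build_delete_query(provider_code))
```

With a real client, this would be called as delete_provider(pysolr.Solr(url, always_commit=True), "WTO"), where url points to the Solr core in use.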