Skip to main content

udata search service

Project description

udata-search-service

A search service for udata. The idea is to have search service separated from the udata MongoDB. The indexation update is made using real-time HTTP messages.

See the following architecture schema: Udata Search Service architecture schema

Getting started

You can follow this recommended architecture for your code:

$WORKSPACE
├── fs
├── udata
│   ├── ...
│   └── setup.py
│		└── udata.cfg
├── udata-front
│   ├── ...
│   └── setup.py
└── udata-search-service
    ├── ...
    └── pyproject.toml

Clone the repository:

cd $WORKSPACE
git clone git@github.com:opendatateam/udata-search-service.git

Start the different services using docker-compose:

cd udata-search-service
docker-compose up

This will start:

  • an elasticsearch
  • a search app

Initialize the elasticsearch indices on setup.

# Locally
udata-search-service init-es

# In the docker context
docker-compose run --entrypoint /bin/bash web -c 'udata-search-service init-es'

This will create the following indices:

  • {UDATA_INSTANCE_NAME}-dataset-{yyyy}-{mm}-{dd}-{HH}-{MM}
  • {UDATA_INSTANCE_NAME}-reuse-{yyyy}-{mm}-{dd}-{HH}-{MM}
  • {UDATA_INSTANCE_NAME}-organization-{yyyy}-{mm}-{dd}-{HH}-{MM}

Configure your udata to use the search service, by updating the following variables in your udata.cfg. Ex in local:

    SEARCH_SERVICE_API_URL = 'http://127.0.0.1:5000/api/1/'

Using udata, when you modify objects, indexation messages will be sent to the search app and will be consumed by the API.

If you want to reindex your local mongo base in udata, you can run:

cd $WORKSPACE/udata/
source ./venv/bin/activate
udata search index

Make sure to have the corresponding UDATA_INSTANCE_NAME specified in your udata settings.

You can specify the option --reindex to start indexation on new indexes. At the end of this reindexation by udata, /set-index-alias route is called to change the alias accordingly.

You can query the search service with the search service api, ex: http://localhost:5000/api/1/datasets/?q=toilettes%20à%20rennes

Development

You can create a virtualenv, activate it and install the requirements with the following commands.

python3 -m venv venv
source venv/bin/activate
make deps
make install

You can start the web search service with the following command:

udata-search-service run

Deployment

The project depends on ElasticSearch 7.16.

Elasticsearch requires the Analysis ICU plugin for your specific version. On Debian, you can take a look at these instructions for installation.

Troubleshooting

  • If the elasticsearch service exits with an error 137, it is killed due to out of memory error. You should read the following points.
  • If you are short on RAM, you can limit heap memory by setting ES_JAVA_OPTS=-Xms750m -Xmx750m as environment variable when starting the elasticsearch service.
  • If you are on MAC and still encounter RAM memory issues, you should increase Docker limit memory to 4GB instead of default 2GB.
  • If you are on Linux, you may need to double the vm.max_map_count. You can set it with the following command: sysctl -w vm.max_map_count=262144.
  • If you are on Linux, you may encounter permissions issues. You can either create the volume or change the user to the current user using chown.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

udata_search_service-2.3.1.dev407.tar.gz (53.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

udata_search_service-2.3.1.dev407-py2.py3-none-any.whl (26.8 kB view details)

Uploaded Python 2Python 3

File details

Details for the file udata_search_service-2.3.1.dev407.tar.gz.

File metadata

File hashes

Hashes for udata_search_service-2.3.1.dev407.tar.gz
Algorithm Hash digest
SHA256 56f1c89e32b34c6f0a6e1bb1cf1e3863036e8e227118b5587b9d187d5e3183e7
MD5 e58918d7722c11c7355481964929b8e7
BLAKE2b-256 51e587a5568f2e1f038ebf2ac7845558f37d7dead34570fc8d456f58e7c82429

See more details on using hashes here.

File details

Details for the file udata_search_service-2.3.1.dev407-py2.py3-none-any.whl.

File metadata

File hashes

Hashes for udata_search_service-2.3.1.dev407-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 af43dc9e3bab939eef1e8efa4d511ea1950739a7c4a66cf7931fea3b4798f2be
MD5 0d1122059db94a3a481d75ff8619f923
BLAKE2b-256 41feba80adbe810a4e197e94128a6985db1bc8911eee79400c541f7f3290b739

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page