Backend for Karp

Project description

karp-backend-5

This package is the legacy version of Karp; [go here for the current version](https://github.com/spraakbanken/karp-backend).

Karp is the lexical platform of Språkbanken. Now migrated to Python 3.6+.

Karp in Docker

For easy testing, use Docker to run the Karp backend.

  • Follow the steps given here

  • Run docker-compose up -d

  • Test it by running curl localhost:8081/app/test

If you want to use Karp without Docker, keep on reading.

Prerequisites

Installation

Karp uses virtual environments for Python. To get up and running:

  • Run make install
  • or, manually:
    1. Create the virtual environment with python3 -m venv venv.
    2. Activate it with source venv/bin/activate.
    3. Install the dependencies with pip install -r requirements.txt.
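The manual steps above, as one copy-pasteable block (assuming a POSIX shell, run from the root of the clone):

```shell
# Create the virtual environment
python3 -m venv venv
# Activate it (bash/zsh syntax)
. venv/bin/activate
```

With the environment active, pip install -r requirements.txt then installs the dependencies into it rather than system-wide.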

Configuration

Set the environment variables KARP5_INSTANCE_PATH and KARP5_ELASTICSEARCH_URL:

  1. using export VAR=value
  2. or creating a file .env in the root of your clone with VAR=value lines

The variables are:

  • KARP5_INSTANCE_PATH - the path where your configs are. If you have cloned this repo you can use /path/to/karp-backend/.
  • KARP5_ELASTICSEARCH_URL - the URL to Elasticsearch, typically localhost:9200.
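For example, assuming a clone at /path/to/karp-backend and a local Elasticsearch (both values are placeholders for your own setup):

```shell
# Point Karp at the directory that holds your configs
export KARP5_INSTANCE_PATH=/path/to/karp-backend/
# Point Karp at your Elasticsearch instance
export KARP5_ELASTICSEARCH_URL=localhost:9200
```

The same two lines, without export, can go in a .env file in the root of the clone instead.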

Copy config.json.example to config.json and make your changes. You will also need to add configuration for your lexicons. Read more here.

Tests

TODO: add more tests! Run the tests with make test.

Test that karp-backend is working by starting it with make run or python run.py.

Known bugs

Counts from the statistics call may not be accurate when performing subaggregations (multiple buckets) on big indices unless the query restricts the search space. Using breadth_first mode does not (always) help.

Possible workarounds:

  • use composite aggregation instead, but this does not work with filtering.
  • set a bigger shard_size (27 000 works for saldo), but this might break your ES cluster.
  • have smaller indices (one lexicon per index) but this does not help for big lexicons or statistics over many lexicons.
  • don't allow subaggregations deeper than 2. Changing the size won't help.
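As a sketch of the first workaround, a composite aggregation pages through buckets instead of computing them all at once. The statistics query's terms bucket would be replaced by a search body like the following (the bucket name by_pos and field pos are hypothetical; this is plain Elasticsearch DSL, not a Karp call):

```json
{
  "size": 0,
  "aggs": {
    "by_pos": {
      "composite": {
        "size": 1000,
        "sources": [
          { "pos": { "terms": { "field": "pos" } } }
        ]
      }
    }
  }
}
```

Each response carries an after_key that is fed back into the next request to fetch the next page of buckets, which is why counts stay accurate; the trade-off, as noted above, is that composite aggregations do not combine with the filtering that the statistics call needs.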

Elasticsearch

If saving stops working because of Database Exception: Error during update. Message: TransportError(403, u'cluster_block_exception', u'blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];')., you need to unlock the relevant ES index.

This is how you do it:

Repeat for every combination of host and port that is relevant for you; you only need to do it once per cluster.

  • Check if any index is locked: curl <host>:<port>/_all/_settings/index.blocks*
    • If all indices are open, Elasticsearch answers with {}
    • else it answers with {<index>: { "settings": { "index": { "blocks": {"read_only_allow_delete": "true"} } } }, ... }
  • To unlock all locked indices on a host and port:
    • curl -X PUT <host>:<port>/_all/_settings -H 'Content-Type: application/json' -d '{"index.blocks.read_only_allow_delete": null}'

Project details


Download files

Download the file for your platform.

Source Distribution

karp-backend-5-5.29.0.tar.gz (2.4 MB)

Uploaded Source

Built Distribution

karp_backend_5-5.29.0-py3-none-any.whl (1.0 MB)

Uploaded Python 3

File details

Details for the file karp-backend-5-5.29.0.tar.gz.

File metadata

  • Download URL: karp-backend-5-5.29.0.tar.gz
  • Upload date:
  • Size: 2.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/4.0.2 CPython/3.11.8

File hashes

Hashes for karp-backend-5-5.29.0.tar.gz
Algorithm Hash digest
SHA256 fd4816665f816d2d9e9143cc46a4f2d7edf332df12fa9eb88fdcf981c749fedd
MD5 db4bee7016235d37bc31a2bd55e5ca59
BLAKE2b-256 e73ec4c1b868e0ea39be0f94a8c658072123b989be72e87a7c762d6af8502335

See more details on using hashes here.

File details

Details for the file karp_backend_5-5.29.0-py3-none-any.whl.

File metadata

File hashes

Hashes for karp_backend_5-5.29.0-py3-none-any.whl
Algorithm Hash digest
SHA256 736127bcdbb4b15c597cdee5ad003e89baf7a3be699e76af022b9486daa35c39
MD5 7cfd626ce38e47ac7c21af9611edcf4c
BLAKE2b-256 ca12bedfb53e8259e7254b22f1118e9553b41eab8e0ea4adb715a3d8f51ce537

See more details on using hashes here.
