Kahi plugin to insert and update works from OpenAlex

Project description

Kahi works plugin

Kahi uses this plugin to insert or update works information from the OpenAlex database.

Description

This plugin reads OpenAlex data stored in a MongoDB database and inserts or updates the research products in CoLav's database format.

Installation

Download the repository from GitHub, go to the folder that contains setup.py, and run

pip3 install .

Alternatively, install the published package by running

pip3 install kahi_openalex_works

Dependencies

Software dependencies are installed automatically with the plugin. For the data dependencies, you need a copy of the OpenAlex dump containing the works collection of interest (take a subset, since the full database is huge). The dump can be downloaded from the OpenAlex data dump website and must be imported into a MongoDB database. The C++ library libhunspell-dev must also be installed on your system. On Ubuntu, install it with

$ sudo apt install libhunspell-dev
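
As a rough sketch of the data-dependency step, the snippet below reads one gzip-compressed JSON Lines file from the OpenAlex dump and groups the records into batches suitable for a bulk insert. The helper names iter_works and batched are hypothetical, the one-JSON-object-per-line layout is an assumption about the dump files, and the MongoDB call in the trailing comment assumes pymongo together with the openalex database and works collection names used in the Usage section.

```python
import gzip
import json
from itertools import islice


def iter_works(path):
    """Yield one work (a dict) per non-empty line of a gzip-compressed JSON Lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def batched(records, size=1000):
    """Group an iterable of records into lists of at most `size` items."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical usage, assuming pymongo is installed and MongoDB runs locally:
#
#   from pymongo import MongoClient
#   works = MongoClient("mongodb://localhost:27017")["openalex"]["works"]
#   for batch in batched(iter_works("part_000.gz")):
#       works.insert_many(batch)
```

Reading the file lazily and inserting in batches keeps memory use bounded, which matters because even a subset of the dump can be large.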

Similarity support

Similarity matching is mandatory for processing works without a DOI, so an Elasticsearch server must be running. The plugin uses the server to find the most similar works in the database. To deploy it, follow the instructions at https://github.com/colav/Chia/tree/main/elasticsaerch.

Docker and docker-compose are required to deploy the server.

If you only want to process works with a DOI, you can skip this step and remove es_index, es_url, es_user and es_password from the YAML file.

The openalex_works/doi entry, however, is mandatory in the YAML file.
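
The DOI-only setup described above can be sketched on a workflow that is already loaded as a Python dictionary. The function doi_only_workflow is a hypothetical helper, not part of Kahi, and dropping every task other than openalex_works/doi is one reading of the instructions rather than documented behavior.

```python
import copy

# Elasticsearch settings that are only needed for similarity matching.
ES_KEYS = ("es_index", "es_url", "es_user", "es_password")


def doi_only_workflow(workflow):
    """Return a new workflow with only the DOI task and no Elasticsearch keys."""
    if "openalex_works/doi" not in workflow:
        raise KeyError("the workflow must contain an 'openalex_works/doi' entry")
    # Deep-copy so the caller's configuration is left untouched.
    task = copy.deepcopy(workflow["openalex_works/doi"])
    for key in ES_KEYS:
        task.pop(key, None)
    return {"openalex_works/doi": task}
```

The returned dictionary can then be written back to the YAML file with whatever YAML library you already use.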

Usage

To use this plugin, Kahi must be installed on your system. Then construct a YAML file such as

config:
  database_url: localhost:27017
  database_name: kahi
  log_database: kahi_log
  log_collection: log
workflow:
  openalex_works/doi:
    database_url: localhost:27017
    database_name: openalex
    collection_name: works
    num_jobs: 20
    es_index: kahi_es
    es_url: http://localhost:9200
    es_user: elastic
    es_password: colav
    verbose: 5
  openalex_works:
    database_url: localhost:27017
    database_name: openalex
    collection_name: works
    num_jobs: 20
    es_index: kahi_es
    es_url: http://localhost:9200
    es_user: elastic
    es_password: colav
    verbose: 5

WARNING: This process could take several hours.
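
Because a run can take hours, it may be worth checking the workflow section before starting. The following is a minimal sketch that works on the workflow mapping once it has been loaded into a dictionary. The function check_workflow is hypothetical, and the list of required keys is an assumption inferred from the example above, not something the plugin documents.

```python
# Keys assumed to be required for every task, inferred from the example file.
REQUIRED_KEYS = ("database_url", "database_name", "collection_name")
# Elasticsearch settings: expected to be either all present or all absent.
ES_KEYS = ("es_index", "es_url", "es_user", "es_password")


def check_workflow(workflow):
    """Return a list of human-readable problems found in the workflow mapping."""
    problems = []
    if "openalex_works/doi" not in workflow:
        problems.append("missing task: openalex_works/doi")
    for name, task in workflow.items():
        for key in REQUIRED_KEYS:
            if key not in task:
                problems.append(f"{name}: missing {key}")
        missing_es = [key for key in ES_KEYS if key not in task]
        if missing_es and len(missing_es) != len(ES_KEYS):
            problems.append(
                f"{name}: incomplete Elasticsearch settings, missing "
                + ", ".join(missing_es)
            )
    return problems
```

An empty list means none of the assumed checks failed; it does not guarantee that the databases or the Elasticsearch server are reachable.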

License

BSD-3-Clause License

Links

http://colav.udea.edu.co/

Download files

Source Distribution

kahi_openalex_works-0.1.8.tar.gz (10.7 kB view details)

Uploaded Source

Built Distribution

Kahi_openalex_works-0.1.8-py3-none-any.whl (11.4 kB view details)

Uploaded Python 3

File details

Details for the file kahi_openalex_works-0.1.8.tar.gz.

File metadata

  • Download URL: kahi_openalex_works-0.1.8.tar.gz
  • Size: 10.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.21

File hashes

Hashes for kahi_openalex_works-0.1.8.tar.gz
Algorithm Hash digest
SHA256 1d9b986280856dce93dadd03cc91c4b9bb9f0598c0a2fc1f851f9142db1666b1
MD5 55b6b13a766baf2a83ab48ac3ffbbbff
BLAKE2b-256 adc9d57c5788364b6c660117fe1b842b163a57ea3244e4ccf0206d6bc8d7d99c

File details

Details for the file Kahi_openalex_works-0.1.8-py3-none-any.whl.

File hashes

Hashes for Kahi_openalex_works-0.1.8-py3-none-any.whl
Algorithm Hash digest
SHA256 222da1b62d588f6037d4e8f4608108df04cde8f27e9fa78a868adde30fcf8d71
MD5 0abc174e2317193fba4d0a72353b3078
BLAKE2b-256 e6729f995f8e2a98888ff19d802103f5d635e842a741495e9be761216fc1a397
