Kahi plugin to insert and update works from OpenAlex

Project description

Kahi works plugin

Kahi uses this plugin to insert or update works information from the OpenAlex database.

Description

This plugin reads OpenAlex data from a MongoDB database and inserts or updates the research products in CoLav's database format.

Installation

Download the repository from GitHub, go to the folder where setup.py is located, and run

pip3 install .

Alternatively, install the published package by running

pip3 install kahi_openalex_works

Dependencies

Software dependencies are installed automatically when you install the plugin.

For the data dependencies, you need a copy of the OpenAlex dump containing the works collection of interest, imported into a MongoDB database. The dump can be downloaded from the OpenAlex data dump website; take a subset, since the full database is huge.

The C++ library libhunspell-dev must also be installed on your system. On Ubuntu, install it with

$ sudo apt install libhunspell-dev
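
One way to take a subset of the dump before importing it is sketched below. It assumes the works files are gzipped JSON Lines (one work per line); the function name take_subset, the file names, and the limit are hypothetical and not part of the plugin.

```python
import gzip
import json


def take_subset(src_gz, dst_jsonl, limit=1000, keep=None):
    """Copy up to `limit` works from a gzipped JSON Lines file to a plain one.

    `keep` is an optional predicate on the parsed work (hypothetical helper);
    when it is None, every work is kept until the limit is reached.
    Returns the number of works written.
    """
    written = 0
    with gzip.open(src_gz, "rt", encoding="utf-8") as src, \
            open(dst_jsonl, "w", encoding="utf-8") as dst:
        for line in src:
            if written >= limit:
                break
            line = line.strip()
            if not line:
                continue  # skip blank lines
            work = json.loads(line)
            if keep is not None and not keep(work):
                continue
            dst.write(json.dumps(work) + "\n")
            written += 1
    return written
```

The resulting file can then be loaded with a tool such as mongoimport, for example `mongoimport --db openalex --collection works --file subset.jsonl`, where the database and collection names match the ones used in the YAML file in the Usage section.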

Similarity support

To process works without a DOI, similarity matching is mandatory, so an Elasticsearch server must be running. The plugin uses the server to find the most similar works in the database. To deploy it, follow the instructions at https://github.com/colav/Chia/tree/main/elasticsaerch.

Docker and docker-compose are required to deploy the server.

If you only want to process works with a DOI, you can skip this step and remove es_index, es_url, es_user and es_password from the YAML file.

However, the openalex_works/doi entry is still mandatory in the YAML file.
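
The DOI-only setup described above can be sketched as a small helper that works on the workflow section once it has been loaded into a Python dictionary. The function name strip_similarity_settings is hypothetical; the key names are the ones listed in this README.

```python
import copy

# Elasticsearch keys named in this README
ES_KEYS = ("es_index", "es_url", "es_user", "es_password")
DOI_STEP = "openalex_works/doi"


def strip_similarity_settings(workflow):
    """Return a copy of `workflow` without the Elasticsearch keys.

    Raises KeyError when the mandatory openalex_works/doi step is missing.
    The input dictionary is left untouched.
    """
    if DOI_STEP not in workflow:
        raise KeyError(f"{DOI_STEP} is mandatory in the workflow")
    cleaned = copy.deepcopy(workflow)
    for step in cleaned.values():
        for key in ES_KEYS:
            step.pop(key, None)  # ignore keys that are already absent
    return cleaned
```

The helper only edits the settings; whether to keep other steps in a DOI-only run is left to the user.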

Usage

To use this plugin, you must have Kahi installed on your system and write a YAML file such as

config:
  database_url: localhost:27017
  database_name: kahi
  log_database: kahi_log
  log_collection: log
workflow:
  openalex_works/doi:
    database_url: localhost:27017
    database_name: openalex
    collection_name: works
    num_jobs: 20
    es_index: kahi_es
    es_url: http://localhost:9200
    es_user: elastic
    es_password: colav
    verbose: 5
  openalex_works:
    database_url: localhost:27017
    database_name: openalex
    collection_name: works
    num_jobs: 20
    es_index: kahi_es
    es_url: http://localhost:9200
    es_user: elastic
    es_password: colav
    verbose: 5

WARNING: This process could take several hours.
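
Because a run can be long, it may help to check the workflow section before starting. The sketch below is not part of Kahi or the plugin: check_workflow is a hypothetical helper, and treating database_url, database_name and collection_name as required is an assumption based on the example above. It takes the workflow section as a Python dictionary, as a YAML parser would return it.

```python
# Assumed to be required in every step (based on the example, not on the plugin source)
REQUIRED_KEYS = ("database_url", "database_name", "collection_name")
# Elasticsearch keys named in this README
ES_KEYS = ("es_index", "es_url", "es_user", "es_password")


def check_workflow(workflow):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if "openalex_works/doi" not in workflow:
        problems.append("missing step: openalex_works/doi")
    for name, step in workflow.items():
        for key in REQUIRED_KEYS:
            if key not in step:
                problems.append(f"{name}: missing {key}")
        # Elasticsearch settings should be either complete or absent
        missing_es = [k for k in ES_KEYS if k not in step]
        if missing_es and len(missing_es) != len(ES_KEYS):
            problems.append(
                f"{name}: incomplete Elasticsearch settings, missing "
                + ", ".join(missing_es)
            )
    return problems
```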

License

BSD-3-Clause License

Links

http://colav.udea.edu.co/

Download files

Source Distribution

kahi_openalex_works-0.1.7b0.tar.gz (9.8 kB)

Uploaded Source

Built Distribution

Kahi_openalex_works-0.1.7b0-py3-none-any.whl (10.7 kB)

Uploaded Python 3

File details

Details for the file kahi_openalex_works-0.1.7b0.tar.gz.

File metadata

  • Download URL: kahi_openalex_works-0.1.7b0.tar.gz
  • Size: 9.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.19

File hashes

Hashes for kahi_openalex_works-0.1.7b0.tar.gz
Algorithm    Hash digest
SHA256       6904d93e55d3bda38712787de031d9687557a8eb12368216b8a7b8d6c018c623
MD5          4039bb81073330f0d2282330b02e85ba
BLAKE2b-256  cdeeedae62f5f33c0f75981f47867b3c5bf23e5b07033045f2a53616ea8bfb4f

File details

Details for the file Kahi_openalex_works-0.1.7b0-py3-none-any.whl.

File hashes

Hashes for Kahi_openalex_works-0.1.7b0-py3-none-any.whl
Algorithm    Hash digest
SHA256       812c2806748006c327512d3a1614859008bc280a1552c5b40f480dcb74996b5d
MD5          0eb020dca9782acac229daaa93c103d5
BLAKE2b-256  7726b2da30eb43a5232761cc0f242f4aa9cfdccced7e923ac3b8f903268ad576
