Kahi plugin to insert and update works from openalex
Kahi works plugin
Kahi uses this plugin to insert or update works information from the OpenAlex database.
Description
This plugin reads from a MongoDB database that holds OpenAlex data and inserts or updates the corresponding research products in CoLav's database format.
Installation
You can download the repository from GitHub. Go to the folder where setup.py is located and run
pip3 install .
Alternatively, you can install the published package by running
pip3 install kahi_openalex_works
Dependencies
Software dependencies are installed automatically when you install the plugin. For the data dependencies, you need a copy of the OpenAlex dump containing the works collection of interest (take a subset, since the full database is huge). The dump can be downloaded from the OpenAlex data dump website and must be imported into a MongoDB database.
The C++ library libhunspell-dev must also be installed on your system. On Ubuntu, you can install it by typing
$ sudo apt install libhunspell-dev
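The import step for the data dependency could look roughly like the sketch below. It assumes the dump subset is a gzip-compressed JSON Lines file (one work per line) and that pymongo is installed; the function names, the batch size and the file name in the usage note are hypothetical, and the database and collection names simply mirror the example configuration in the Usage section.

```python
import gzip
import json
from typing import Iterable, Iterator, List


def read_works(path: str) -> Iterator[dict]:
    """Yield one work per non-empty line of a gzip-compressed JSON Lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def batched(items: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group items into lists of at most `size` elements."""
    batch: List[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def load_into_mongo(path: str,
                    database_url: str = "localhost:27017",
                    database_name: str = "openalex",
                    collection_name: str = "works",
                    batch_size: int = 1000) -> int:
    """Insert every work found in `path` and return how many were inserted."""
    # Imported here so the parsing helpers above work without pymongo.
    from pymongo import MongoClient

    client = MongoClient(f"mongodb://{database_url}")
    collection = client[database_name][collection_name]
    inserted = 0
    for batch in batched(read_works(path), batch_size):
        collection.insert_many(batch)
        inserted += len(batch)
    return inserted
```

For example, `load_into_mongo("works_subset.gz")` would load a hypothetical subset file into the `works` collection of the `openalex` database on a local MongoDB server. Inserting in batches keeps memory use bounded, which matters given the size of the dump.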
Similarity support
To process works without a DOI, similarity matching is mandatory, so an Elasticsearch server must be running. The plugin uses the server to find the most similar works in the database. To deploy it, please read https://github.com/colav/Chia/tree/main/elasticsaerch and follow the instructions.
Docker and docker-compose are required to deploy the server.
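Before starting a long run, it can help to confirm that the similarity server answers. The following is a minimal sketch using only the Python standard library; it assumes the server accepts HTTP basic authentication on its root URL, and the function names are hypothetical rather than part of the plugin. The URL and credentials are the ones from the example configuration in the Usage section.

```python
import base64
import urllib.error
import urllib.request


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP basic authentication header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def es_is_reachable(es_url: str, es_user: str, es_password: str,
                    timeout: float = 5.0) -> bool:
    """Return True only if the server answers the root URL with HTTP 200."""
    request = urllib.request.Request(
        es_url,
        headers={"Authorization": basic_auth_header(es_user, es_password)},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error such as 401.
        return False


if __name__ == "__main__":
    print(es_is_reachable("http://localhost:9200", "elastic", "colav"))
```

A result of `False` does not say why the check failed; wrong credentials and a stopped container look the same here, so the Docker logs remain the place to look for details.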
If you only want to process works with a DOI, you can skip this step and remove es_index, es_url, es_user and es_password from the YAML file. However, the openalex_works/doi entry is still mandatory in the YAML file.
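The DOI-only rule can be expressed as a small helper that works on the workflow section once it has been loaded into a Python dictionary (for instance with a YAML parser, which is left to the caller). This is a sketch under assumptions: the function name is hypothetical, and stripping the Elasticsearch keys from every step present is one reading of the rule, not confirmed plugin behavior.

```python
from typing import Dict

# Settings that are only needed when similarity matching is used.
ES_KEYS = ("es_index", "es_url", "es_user", "es_password")


def strip_similarity_settings(workflow: Dict[str, dict]) -> Dict[str, dict]:
    """Return a copy of `workflow` without the Elasticsearch settings.

    Raises ValueError if the mandatory openalex_works/doi step is missing.
    """
    if "openalex_works/doi" not in workflow:
        raise ValueError("the workflow must contain an openalex_works/doi entry")
    return {
        step: {key: value for key, value in settings.items() if key not in ES_KEYS}
        for step, settings in workflow.items()
    }


if __name__ == "__main__":
    workflow = {
        "openalex_works/doi": {
            "database_url": "localhost:27017",
            "database_name": "openalex",
            "collection_name": "works",
            "num_jobs": 20,
            "es_index": "kahi_es",
            "es_url": "http://localhost:9200",
            "es_user": "elastic",
            "es_password": "colav",
            "verbose": 5,
        }
    }
    print(strip_similarity_settings(workflow))
```

The helper builds new dictionaries instead of editing in place, so the original configuration stays available if similarity matching is enabled again later.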
Usage
To use this plugin, you must have Kahi installed on your system and write a YAML file such as
config:
  database_url: localhost:27017
  database_name: kahi
  log_database: kahi_log
  log_collection: log
workflow:
  openalex_works/doi:
    database_url: localhost:27017
    database_name: openalex
    collection_name: works
    num_jobs: 20
    es_index: kahi_es
    es_url: http://localhost:9200
    es_user: elastic
    es_password: colav
    verbose: 5
  openalex_works:
    database_url: localhost:27017
    database_name: openalex
    collection_name: works
    num_jobs: 20
    es_index: kahi_es
    es_url: http://localhost:9200
    es_user: elastic
    es_password: colav
    verbose: 5
WARNING: This process could take several hours.
License
BSD-3-Clause License