DOI-Extractor-OEG
Description
DOI-Extractor-OEG is a tool that extracts the title and DOI of every Ontology Engineering Group (OEG) publication.
The data comes from two main sources:
- https://portalcientifico.upm.es/es/ipublic/entity/16247, which lists all papers from OEG.
- ExistingPapers/Papers.csv, which holds data already extracted from some OEG papers.
The results are written to the Outputs folder, which includes:
- dois.txt, containing all the DOIs from the two sources.
- name-doi.csv, containing the title and DOI of every paper found, plus the OpenAlex primary location attribute.
Project Structure
DOI-Extractor-OEG
├───doiExtractor
| ├───ExistingPapers
| | ├───name_doi_papers.csv
| | └───Papers.csv
| ├───Outputs
| | ├───dois.csv
| | └───name_doi.csv
| ├───__init__.py
| ├───doiExtractor.py
| ├───main.py
| └───openAlex.py
├───.gitignore
├───LICENSE.txt
├───README.MD
└───setup.py
doiExtractor.py
- Contains the functions that extract the name and DOI from portalcientifico.upm.es.
openAlex.py
- Contains the functions that extract the primary location from OpenAlex.
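As an illustration of the OpenAlex step, here is a minimal sketch of how a primary location could be looked up for one DOI. It is not the code in openAlex.py: the function names are hypothetical, and it assumes that the value of interest is the display name of the primary location's source. It relies on the public OpenAlex endpoint that accepts a DOI URL as a work identifier.

```python
import json
import urllib.parse
import urllib.request

OPENALEX_WORKS = "https://api.openalex.org/works/"


def openalex_url(doi: str) -> str:
    """Build the OpenAlex lookup URL for a bare DOI such as '10.1000/xyz'."""
    quoted = urllib.parse.quote(doi.strip(), safe="/")
    return OPENALEX_WORKS + "https://doi.org/" + quoted


def primary_location_name(work: dict):
    """Return the display name of the primary location's source, or None.

    Assumption: the source name is the part of `primary_location` we want;
    the real tool may keep a different field.
    """
    location = work.get("primary_location") or {}
    source = location.get("source") or {}
    return source.get("display_name")


def fetch_primary_location(doi: str):
    """Query OpenAlex over HTTP (requires network access)."""
    with urllib.request.urlopen(openalex_url(doi), timeout=30) as response:
        return primary_location_name(json.load(response))
```

Keeping the URL building and the JSON parsing separate from the HTTP call makes both easy to check without network access.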
Installation
- Clone the repository:
git clone https://github.com/ptorija/DOI-Extractor-OEG.git
- Change to the DOI-Extractor-OEG directory:
cd DOI-Extractor-OEG
- Create a virtual environment:
python -m venv .env
- Activate the virtual environment:
source .env/bin/activate (Linux) or .env\Scripts\activate (Windows)
- Install the package dependencies:
pip install -e .
Usage
The tool can be used from the command line with the following argument:
--start
- Starts the DOI extraction.
The script extracts DOIs from the specified webpage and then merges them with the ones from ExistingPapers.
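The merge step can be pictured with the following sketch. It is an assumption about the behaviour, not the tool's actual code: the helper names are hypothetical, and it assumes that DOIs are compared case-insensitively after stripping common prefixes, with scraped entries listed first.

```python
def normalize_doi(doi: str) -> str:
    """Lower-case a DOI and strip common URL or 'doi:' prefixes."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def merge_dois(scraped, existing):
    """Merge two DOI collections, dropping blanks and duplicates.

    Order is preserved: scraped DOIs first, then any new existing ones.
    """
    seen = set()
    merged = []
    for doi in list(scraped) + list(existing):
        key = normalize_doi(doi)
        if key and key not in seen:
            seen.add(key)
            merged.append(key)
    return merged
```

Normalizing before comparing avoids listing the same paper twice when one source stores a full https://doi.org/ URL and the other stores the bare DOI.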
Options:
--url <path>
- Specify the webpage of the group whose DOIs you want to extract. Default: Ontology Engineering Group
--output <path>
- Specify the path for the output files. Default: Outputs/
Example
DataExtractorOEG --start
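For readers who want to see how the documented flags fit together, here is a hedged sketch of an equivalent argument parser. It is not taken from main.py: the function name is hypothetical, and it assumes that the default URL is the OEG page listed in the Description.

```python
import argparse

# Assumption: the default group page is the OEG URL from the Description.
DEFAULT_URL = "https://portalcientifico.upm.es/es/ipublic/entity/16247"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented --start, --url and --output flags."""
    parser = argparse.ArgumentParser(prog="DataExtractorOEG")
    parser.add_argument("--start", action="store_true",
                        help="start the DOI extraction")
    parser.add_argument("--url", default=DEFAULT_URL,
                        help="webpage of the group to extract DOIs from")
    parser.add_argument("--output", default="Outputs/",
                        help="path for the output files")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.start, args.url, args.output)
```

With such a parser, passing --start together with --output results/ would keep the default group page and write the files to results/.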