
DOI-Extractor-OEG

Description

DOI-Extractor-OEG is a tool for extracting the title and DOI of every paper published by the Ontology Engineering Group (OEG).

The data is extracted from two main sources:

  1. https://portalcientifico.upm.es/es/ipublic/entity/16247 , which lists all papers from OEG.

  2. ExistingPapers/Papers.csv, which holds data already extracted from some OEG papers.


The results are written to the Outputs folder, which includes:
  • A dois.txt containing all the DOIs from the two sources

  • A name-doi.csv containing the title and DOI of every paper found, plus its OpenAlex primary location attribute
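The merge of the two sources can be illustrated with a minimal sketch. This is not the package's actual code: the function names (`normalize_doi`, `merge_papers`, `write_outputs`) and the CSV column headers are hypothetical, and the OpenAlex primary location column is left out for brevity. It assumes each source yields (title, DOI) pairs and that duplicates should be collapsed by DOI, which is case-insensitive.

```python
import csv
from pathlib import Path

# Hypothetical helper names; the real package may organise this differently.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """Lower-case a DOI and strip a leading resolver URL or 'doi:' prefix."""
    doi = doi.strip().lower()
    for prefix in _PREFIXES:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def merge_papers(scraped, existing):
    """Merge two iterables of (title, doi) pairs.

    Returns a dict mapping normalized DOI -> title, keeping the first
    title seen for each DOI (scraped entries take precedence).
    """
    merged = {}
    for title, doi in list(scraped) + list(existing):
        key = normalize_doi(doi)
        if key and key not in merged:
            merged[key] = title.strip()
    return merged


def write_outputs(merged, out_dir):
    """Write dois.txt (one DOI per line) and name-doi.csv (title, doi)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "dois.txt").write_text("\n".join(merged) + "\n", encoding="utf-8")
    with open(out / "name-doi.csv", "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["title", "doi"])  # assumed header names
        for doi, title in merged.items():
            writer.writerow([title, doi])
```

Normalizing before merging means the same paper listed as `https://doi.org/10.1000/ABC` in one source and `10.1000/abc` in the other is written only once.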

Project Structure

DOI-Extractor-OEG
├───doiExtractor
|   ├───ExistingPapers
|   |   ├───name_doi_papers.csv
|   |   └───Papers.csv
|   ├───Outputs
|   |   ├───dois.csv
|   |   └───name_doi.csv
|   ├───__init__.py
|   ├───doiExtractor.py
|   ├───main.py
|   └───openAlex.py
├───.gitignore
├───LICENSE.txt
├───README.MD
└───setup.py

doiExtractor.py - Contains the functions that extract the title and DOI from portalcientifico.upm.es

openAlex.py - Contains the functions that extract the primary location from OpenAlex
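As a rough illustration of the OpenAlex step, the sketch below looks up a work by DOI through the public OpenAlex REST API and reads its `primary_location` field. It is an assumption about how such a lookup could be written, not the contents of openAlex.py; the function names are hypothetical, and the choice of which fields to keep (`landing_page_url`, the source's `display_name`, `is_oa`) is illustrative.

```python
import json
import urllib.parse
import urllib.request

OPENALEX_WORKS = "https://api.openalex.org/works/"


def openalex_url(doi: str) -> str:
    """Build the OpenAlex works URL for a bare DOI such as '10.1000/abc'."""
    return OPENALEX_WORKS + "https://doi.org/" + urllib.parse.quote(doi, safe="/")


def primary_location_summary(work: dict) -> dict:
    """Pick a few fields out of a work's primary_location (which may be null)."""
    location = work.get("primary_location") or {}
    source = location.get("source") or {}
    return {
        "landing_page_url": location.get("landing_page_url"),
        "source_name": source.get("display_name"),
        "is_oa": location.get("is_oa"),
    }


def fetch_primary_location(doi: str, timeout: float = 10.0) -> dict:
    """Query OpenAlex over the network and summarise the primary location."""
    with urllib.request.urlopen(openalex_url(doi), timeout=timeout) as resp:
        work = json.load(resp)
    return primary_location_summary(work)
```

Keeping the URL building and the JSON parsing in separate functions lets both be checked without network access; only `fetch_primary_location` performs a request.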

Installation

  1. Clone the repository: git clone https://github.com/ptorija/DOI-Extractor-OEG.git

  2. Change to the DOI-Extractor-OEG directory: cd DOI-Extractor-OEG

  3. Create a virtual environment: python -m venv .env

  4. Activate the virtual environment: source .env/bin/activate (Linux/macOS) or .env\Scripts\activate (Windows)

  5. Install the package and its dependencies: pip install -e .

Usage

The tool is run from the command line with the following argument:

  • --start - Start the DOI extraction

The script extracts DOIs from the specified webpage and then merges them with the ones from ExistingPapers.

Options:

  • --url <path> - Specify the webpage of the group whose DOIs you want to extract. Default: Ontology Engineering Group
  • --output <path> - Specify the path for the output files. Default: Outputs/

Example

  • DataExtractorOEG --start
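The flags described above could be parsed along the following lines. This is a sketch using the standard library's argparse, not the package's actual entry point: `build_parser` is a hypothetical name, and the default URL is assumed to be the OEG page listed in the Description.

```python
import argparse

# Assumed defaults: the OEG page from the Description and the Outputs/ folder.
DEFAULT_URL = "https://portalcientifico.upm.es/es/ipublic/entity/16247"
DEFAULT_OUTPUT = "Outputs/"


def build_parser() -> argparse.ArgumentParser:
    """Create a parser exposing --start, --url and --output."""
    parser = argparse.ArgumentParser(prog="DataExtractorOEG")
    parser.add_argument(
        "--start",
        action="store_true",
        help="start the DOI extraction",
    )
    parser.add_argument(
        "--url",
        default=DEFAULT_URL,
        help="webpage of the group whose DOIs should be extracted",
    )
    parser.add_argument(
        "--output",
        default=DEFAULT_OUTPUT,
        help="path for the output files",
    )
    return parser
```

With such a parser, `build_parser().parse_args(["--start", "--output", "results/"])` would yield `start=True`, the default `url`, and `output="results/"`.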

Download files

Source Distribution: dataextractoroeg-0.3.1.tar.gz (6.5 kB)

Built Distribution: DataExtractorOEG-0.3.1-py3-none-any.whl (7.8 kB, Python 3)
