Web of Science XML Parser

A parser for Web of Science XML data in Python.

Installation

The package can be installed from PyPI:

pip install wos_parser

Getting Started

The parser reads the *.xml files included in the Web of Science XML dataset. Note: The dataset is distributed as a collection of zipped archives (one per record year), each of which contains zipped versions of the XML files. Unpack these before passing them to the parser.

from wos_parser import WosParser

# Path to an unpacked XML file from the dataset
xml_path = "dataset/2023_CORE/WR_2023_20230111080536_CORE_0001.xml"

wos_parser = WosParser()

# Parse the records contained in the file
records = wos_parser.parse_records(xml_path)
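
The unpacking step described in the note above could be sketched as follows, using only the standard library. This is a minimal sketch and not part of wos_parser: the helper name unpack_nested_zips is hypothetical, and it assumes that both the per-year archives and the inner archives are regular .zip files with a single level of nesting. If your copy of the dataset uses a different compression format (for example gzip), the extraction calls would need to change.

```python
import zipfile
from pathlib import Path


def unpack_nested_zips(archive_path, target_dir):
    """Extract a zip archive and any zip files found inside it.

    Hypothetical helper (not part of wos_parser). Assumes the outer and
    inner archives are both in zip format and come from a trusted source.
    Returns the sorted list of extracted *.xml paths.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # Extract the outer (per-year) archive.
    with zipfile.ZipFile(archive_path) as outer:
        outer.extractall(target)

    # Extract each inner archive into the folder it was found in,
    # then delete the inner archive so only the XML files remain.
    for inner_path in sorted(target.rglob("*.zip")):
        with zipfile.ZipFile(inner_path) as inner:
            inner.extractall(inner_path.parent)
        inner_path.unlink()

    return sorted(target.rglob("*.xml"))
```

Each path returned by this helper could then be passed to wos_parser.parse_records(), as in the example above.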

Generating the documentation

To view the documentation, you currently have to build it locally. To do that, follow these steps:

  1. Clone the package repository.

  2. Install Sphinx.

  3. Install additional dependencies:

    pip install myst_parser pydata_sphinx_theme

  4. Go to the project folder's subdirectory doc/.

  5. Run make html.

  6. Open the file doc/_build/html/index.html in a browser.
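
Steps 4 to 6 above could also be driven from Python instead of make. The following is a rough sketch under stated assumptions, not the project's documented procedure: the function names are hypothetical, and it assumes that Sphinx is installed in the current environment and that the Sphinx configuration file (conf.py) sits directly in doc/. It calls Sphinx through python -m sphinx with the html builder.

```python
import subprocess
import sys
from pathlib import Path


def sphinx_html_command(doc_dir):
    """Build the command line for an HTML build of the docs in doc_dir.

    Hypothetical helper. Assumes conf.py is located directly in doc_dir.
    """
    doc_dir = Path(doc_dir)
    return [
        sys.executable, "-m", "sphinx",
        "-b", "html",                      # use the HTML builder
        str(doc_dir),                      # source directory
        str(doc_dir / "_build" / "html"),  # output directory
    ]


def build_docs(doc_dir="doc"):
    """Run the build and return the path of the generated index page."""
    subprocess.run(sphinx_html_command(doc_dir), check=True)
    return Path(doc_dir) / "_build" / "html" / "index.html"
```

Calling build_docs() from the project folder would, under these assumptions, produce the same doc/_build/html/index.html file that step 6 refers to.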
