grobid2json

Grobid2Json

This package extracts the code that parses GROBID XML into JSON from the s2orc-doc2json project and publishes it as a standalone PyPI package.

✨ Features

  • Convert the TEI XML files produced by GROBID into JSON format.

📦 Installation

pip install grobid2json

🤯 Usage

from pathlib import Path

from bs4 import BeautifulSoup
from grobid2json import convert_xml_to_json

file_path = "test.xml"
with open(file_path, "rb") as f:
    xml_data = f.read()

# BeautifulSoup's "xml" parser requires lxml to be installed.
soup = BeautifulSoup(xml_data, "xml")

# Use the file name up to the first "." as the paper ID.
paper_id = Path(file_path).name.split(".")[0]
paper = convert_xml_to_json(soup, paper_id, "")
json_data = paper.as_json()
print(json_data)
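
The single-file example above extends naturally to a folder of GROBID outputs. The following is a minimal sketch, not part of grobid2json: `paper_id_from_path`, `convert_directory`, and the `convert` adapter parameter are hypothetical names introduced here for illustration. The adapter is passed in as a callable so that the loop itself depends only on the standard library.

```python
import json
from pathlib import Path
from typing import Callable, List


def paper_id_from_path(path: Path) -> str:
    """Return the file name up to the first dot, as in the usage example."""
    return path.name.split(".")[0]


def convert_directory(
    input_dir: Path,
    output_dir: Path,
    convert: Callable[[bytes, str], dict],
) -> List[Path]:
    """Convert every *.xml file in input_dir and write one JSON file per paper.

    `convert` is a hypothetical adapter: it receives the raw XML bytes and
    the paper ID and must return a JSON-serializable dict.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for xml_path in sorted(input_dir.glob("*.xml")):
        paper_id = paper_id_from_path(xml_path)
        data = convert(xml_path.read_bytes(), paper_id)
        out_path = output_dir / f"{paper_id}.json"
        with open(out_path, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False, indent=2)
        written.append(out_path)
    return written


# A possible adapter around grobid2json, mirroring the usage example.
# Assumption: paper.as_json() returns a JSON-serializable dict.
#
# from bs4 import BeautifulSoup
# from grobid2json import convert_xml_to_json
#
# def grobid_adapter(xml_bytes, paper_id):
#     soup = BeautifulSoup(xml_bytes, "xml")
#     return convert_xml_to_json(soup, paper_id, "").as_json()
#
# convert_directory(Path("tei"), Path("json"), grobid_adapter)
```

Two files whose names share the same prefix before the first dot (for example `a.xml` and `a.tei.xml`) would map to the same paper ID and overwrite each other, so a different ID scheme may be needed for such folders.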

🔗 Links

Credits


📝 License

This project is licensed under the Apache License 2.0.

Download files

Source Distribution

grobid2json-0.0.1.tar.gz (17.0 kB)

Uploaded Source

Built Distribution

grobid2json-0.0.1-py3-none-any.whl (17.4 kB)

Uploaded Python 3
