
grobid2json

Project description

Grobid2Json

This package extracts the code that parses Grobid XML into JSON from the s2orc-doc2json project and publishes it as a standalone PyPI package.

✨ Features

  • Convert the TEI XML files produced by Grobid into JSON format.

📦 Installation

pip install grobid2json

🤯 Usage

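from bs4 import BeautifulSoup  # the "xml" parser requires lxml to be installed
from grobid2json import convert_xml_to_json

# Read the XML file produced by Grobid as raw bytes
file_path = "test.xml"
with open(file_path, "rb") as f:
    xml_data = f.read()

# Parse the XML and use the file name up to the first dot as the paper ID
soup = BeautifulSoup(xml_data, "xml")
paper_id = file_path.split("/")[-1].split(".")[0]

# Convert to a paper object and export it as JSON
paper = convert_xml_to_json(soup, paper_id, "")
json_data = paper.as_json()
print(json_data)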
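
To convert a whole folder of Grobid output, the usage example above can be wrapped in a small batch helper. This is only a sketch and not part of grobid2json: the names `paper_id_from_path`, `grobid_converter`, and `convert_directory` are hypothetical, and it assumes that `paper.as_json()` returns something `json.dumps` can serialize.

```python
import json
from pathlib import Path
from typing import Callable


def paper_id_from_path(path: Path) -> str:
    """Derive the paper ID as in the usage example: file name up to the first dot."""
    return path.name.split(".")[0]


def grobid_converter(xml_data: bytes, paper_id: str):
    """Convert one Grobid XML document, mirroring the usage example."""
    # Imported lazily so the batch helper can be exercised without these packages.
    from bs4 import BeautifulSoup
    from grobid2json import convert_xml_to_json

    soup = BeautifulSoup(xml_data, "xml")
    paper = convert_xml_to_json(soup, paper_id, "")
    return paper.as_json()


def convert_directory(input_dir, output_dir, convert: Callable = grobid_converter) -> list:
    """Convert every *.xml file in input_dir and write <paper_id>.json into output_dir."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for xml_path in sorted(Path(input_dir).glob("*.xml")):
        paper_id = paper_id_from_path(xml_path)
        data = convert(xml_path.read_bytes(), paper_id)
        target = out / f"{paper_id}.json"
        # Assumption: the converter result is JSON-serializable.
        target.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
        written.append(target)
    return written
```

Calling `convert_directory("grobid_output", "json_output")` (both directory names are placeholders) would then write one JSON file per XML file. Passing the converter as a parameter keeps the file handling separate from the parsing step, so either part can be swapped or tested on its own.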

🔗 Links

Credits


📝 License

This project is licensed under the Apache License 2.0.

Download files


Source Distribution

grobid2json-0.0.1.tar.gz (17.0 kB)

Uploaded Source

Built Distribution

grobid2json-0.0.1-py3-none-any.whl (17.4 kB)

Uploaded Python 3

File details

Details for the file grobid2json-0.0.1.tar.gz.

File metadata

  • Download URL: grobid2json-0.0.1.tar.gz
  • Upload date:
  • Size: 17.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.9.18

File hashes

Hashes for grobid2json-0.0.1.tar.gz
Algorithm Hash digest
SHA256 792395dcc5b082f863e2f6963eac9ad1b82d2b242c00d294cb830d6f97dae4d4
MD5 01bb985a7797309cb92797df799b6f0f
BLAKE2b-256 ebc4231af5f5a85571b0f1bdadf1669c03459bade846f955528acc295095fe50
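
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the Python standard library; the function names `sha256_of_file` and `matches_expected` are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed above for grobid2json-0.0.1.tar.gz
EXPECTED_SHA256 = "792395dcc5b082f863e2f6963eac9ad1b82d2b242c00d294cb830d6f97dae4d4"


def sha256_of_file(path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's SHA256 digest with the expected hex digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("grobid2json-0.0.1.tar.gz")` should return `True` for an intact copy of the source distribution saved in the current directory.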


File details

Details for the file grobid2json-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: grobid2json-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 17.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.9.18

File hashes

Hashes for grobid2json-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 b7d22662676ec5d96b2ae0d342db524b42bd871bca3198b71e538af90e1c9690
MD5 d1abe5d1d9d5236dd6c4f84ce32e0c5d
BLAKE2b-256 5dcb76e62203e00694819fc961e23e826e58c175ada9432586da7fb1b02f2ef0

