grobid-client

A client library for accessing Grobid

Usage

First, create a client:

from grobid_client import Client

client = Client(base_url="https://cloud.science-miner.com/grobid/api")

Now call your endpoint and use your models:

from pathlib import Path

from grobid_client.api.pdf import process_fulltext_document
from grobid_client.models import Article, ProcessForm
from grobid_client.types import TEI, File

# Use a Path (not a plain string) so that .open() and .name are available
pdf_file = Path("MyPDFFile.pdf")
with pdf_file.open("rb") as fin:
    form = ProcessForm(
        segment_sentences="1",
        input_=File(file_name=pdf_file.name, payload=fin, mime_type="application/pdf"),
    )
    # Reuses the client created above
    r = process_fulltext_document.sync_detailed(client=client, multipart_data=form)
    if r.is_success:
        # Parse the returned TEI XML into an Article model
        article: Article = TEI.parse(r.content, figures=False)
        assert article.title

Things to know:

  1. Every path/method combo becomes a Python module with four functions:

    1. sync: Blocking request that returns parsed data (if successful) or None
    2. sync_detailed: Blocking request that always returns a Response, optionally with parsed set if the request was successful
    3. asyncio: Like sync but async instead of blocking
    4. asyncio_detailed: Like sync_detailed but async instead of blocking
  2. All path/query params and bodies become method arguments.

  3. If your endpoint had any tags on it, the first tag will be used as a module name for the function (pdf above)

  4. Any endpoint which did not have a tag will be in grobid_client.api.default
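
The relationship between the four functions can be illustrated with a toy module. This is only a sketch of the pattern and not the generated code: the Response dataclass, the _fake_request helper, and the doc_id parameter are hypothetical stand-ins, and the coroutine is named asyncio_ here only so that it does not shadow the standard asyncio module used by the sketch.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    """Hypothetical stand-in for a detailed result: status plus optional parsed data."""
    status_code: int
    parsed: Optional[dict]


def _fake_request(doc_id: str) -> Response:
    # Hypothetical transport: "succeeds" for any non-empty id, fails otherwise.
    if doc_id:
        return Response(status_code=200, parsed={"id": doc_id})
    return Response(status_code=400, parsed=None)


def sync_detailed(doc_id: str) -> Response:
    """Blocking call that always returns the full response object."""
    return _fake_request(doc_id)


def sync(doc_id: str) -> Optional[dict]:
    """Blocking call that returns only the parsed data, or None on failure."""
    return sync_detailed(doc_id).parsed


async def asyncio_detailed(doc_id: str) -> Response:
    """Awaitable counterpart of sync_detailed."""
    await asyncio.sleep(0)  # yield to the event loop, as a real network call would
    return _fake_request(doc_id)


async def asyncio_(doc_id: str) -> Optional[dict]:
    """Awaitable counterpart of sync."""
    return (await asyncio_detailed(doc_id)).parsed
```

In this sketch, sync("abc") returns the parsed dictionary, while sync("") returns None; the detailed variants return a Response in both cases, so the caller can inspect the status code on failure.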

Building / publishing this Client

This project uses Poetry to manage dependencies and packaging. Here are the basics:

  1. Update the metadata in pyproject.toml (e.g. authors, version)
  2. If you're using a private repository, configure it with Poetry
    1. poetry config repositories.<your-repository-name> <url-to-your-repository>
    2. poetry config http-basic.<your-repository-name> <username> <password>
  3. Publish the client with poetry publish --build -r <your-repository-name> or, for public PyPI, just poetry publish --build

If you want to install this client into another project without publishing it (e.g. for development) then:

  1. If that project is using Poetry, you can simply do poetry add <path-to-this-client> from that project
  2. If that project is not using Poetry:
    1. Build a wheel with poetry build -f wheel
    2. Install that wheel from the other project with pip install <path-to-wheel>
