Deploy GROBID on AWS EC2
AWS GROBID Deploy
Deploy GROBID on AWS EC2 using Python.
Note: The deployed GROBID service is publicly reachable from the internet. Always tear down the instance when it is not in use; spinning up a new instance is fast and easy.
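One way to make sure teardown always happens is to wrap deployment in a context manager. The following is a minimal sketch, not part of aws_grobid: `temporary_instance` is a hypothetical helper that takes the deploy and terminate steps as plain callables, so it makes no assumptions about the library beyond the two calls shown in the Usage section.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


@contextmanager
def temporary_instance(
    deploy: Callable[[], T],
    terminate: Callable[[T], None],
) -> Iterator[T]:
    """Yield a deployed instance and always terminate it afterwards.

    Hypothetical helper for illustration; not provided by aws_grobid.
    """
    details = deploy()
    try:
        yield details
    finally:
        # Runs on normal exit and when the body raises.
        terminate(details)


# Possible use with the calls from the Usage section (left commented out
# because it launches a real EC2 instance):
#
# import aws_grobid
# with temporary_instance(
#     deploy=lambda: aws_grobid.deploy_and_wait_for_ready(
#         deployment_config=aws_grobid.DeploymentConfigs.grobid_lite,
#     ),
#     terminate=lambda d: aws_grobid.terminate_instance(
#         region=d.region,
#         instance_id=d.instance_id,
#     ),
# ) as instance_details:
#     ...  # call the GROBID service here
```

Note that if the deploy step itself raises, there is nothing to terminate and the helper does not attempt cleanup.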
Usage
import aws_grobid
# There are a few different pre-canned configurations available:
# Base GROBID service w/ CRF only models
# aws_grobid.GROBIDDeploymentConfigs.grobid_lite
# Base GROBID service w/ Deep Learning models
# aws_grobid.GROBIDDeploymentConfigs.grobid_full
# Software Mentions annotation service w/ Deep Learning models
# aws_grobid.GROBIDDeploymentConfigs.software_mentions
# Create a new GROBID instance and wait for it to be ready
# This generally takes about 6 minutes
# Instance is automatically torn down if the
# GROBID service is not available within 7 minutes
instance_details = aws_grobid.deploy_and_wait_for_ready(
    deployment_config=aws_grobid.DeploymentConfigs.grobid_lite,
)
# You can also specify the instance type, region, tags, etc.
# instance_details = aws_grobid.deploy_and_wait_for_ready(
#     deployment_config=aws_grobid.DeploymentConfigs.grobid_full,
#     instance_type='c5.4xlarge',
#     region='us-east-1',
#     tags={'awsApplication': 'arn:...'},
#     timeout=300,  # 5 minutes
# )
# Use the instance to process a PDF file
# The API URL is available from:
# instance_details.api_url
# ...
# Tear down the instance when done
aws_grobid.terminate_instance(
    region=instance_details.region,
    instance_id=instance_details.instance_id,
)
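The usage example leaves the PDF processing step elided. As a rough sketch of what that step could look like, the code below posts a PDF to GROBID's `/api/processFulltextDocument` endpoint using only the standard library. It assumes that `instance_details.api_url` is the service base URL without a trailing `/api` path, which this README does not confirm; `build_multipart` and `process_fulltext` are hypothetical helper names, not part of aws_grobid.

```python
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body.

    Returns the encoded body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def process_fulltext(api_url: str, pdf_path: str, timeout: float = 120.0) -> str:
    """POST a PDF to GROBID and return the TEI XML response as text.

    Assumption: api_url is the base URL of the service (no "/api" suffix).
    """
    pdf = Path(pdf_path)
    # GROBID expects the PDF in a form field named "input".
    body, content_type = build_multipart("input", pdf.name, pdf.read_bytes())
    request = urllib.request.Request(
        url=api_url.rstrip("/") + "/api/processFulltextDocument",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (needs a running instance and a local PDF):
# tei_xml = process_fulltext(instance_details.api_url, "paper.pdf")
```

The generous default timeout is deliberate, since the first request to a fresh instance can be slow.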
When providing an instance type that has GPUs available, we automatically pass the GPU flag to the GROBID service. This allows GROBID to utilize the GPU for processing, which can significantly speed up the extraction of information from documents.
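How the library decides that an instance type has GPUs is not described here. As an illustration only, one plausible check inspects the `GpuInfo` field that the EC2 `describe_instance_types` API returns for GPU-backed types. `has_gpu` is a hypothetical helper and may differ from what aws_grobid actually does.

```python
from typing import Any, Mapping


def has_gpu(instance_type_info: Mapping[str, Any]) -> bool:
    """Return True if an EC2 instance type description lists at least one GPU.

    Expects one entry from the "InstanceTypes" list of a
    describe_instance_types response. Hypothetical helper, not part of
    aws_grobid.
    """
    gpus = instance_type_info.get("GpuInfo", {}).get("Gpus", [])
    return any(gpu.get("Count", 0) > 0 for gpu in gpus)


# Possible use with boto3 (commented out because it needs AWS credentials):
#
# import boto3
# ec2 = boto3.client("ec2", region_name="us-east-1")
# response = ec2.describe_instance_types(InstanceTypes=["g4dn.xlarge"])
# print(has_gpu(response["InstanceTypes"][0]))
```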
Note: The first time you make a call to the GROBID service, it may take a minute or so to warm up the service. Subsequent calls will be much faster.
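Because the first request can be slow, it may help to retry it a few times before treating a timeout as a real failure. The sketch below is one way to do that under stated assumptions: `call_with_retries` is a hypothetical helper, and it retries only on `OSError`, which covers the timeout and connection errors raised by `urllib` and `socket`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on OSError with a fixed delay between attempts.

    The last failure is re-raised once all attempts are used up.
    Hypothetical helper, not part of aws_grobid.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except OSError:
            if attempt == attempts:
                raise
            # Give the service time to finish warming up.
            sleep(delay)
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so the wait can be replaced in tests; in normal use the default `time.sleep` is fine.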
We also automatically pick up environment variables defined in a .env file. This is useful for setting AWS_PROFILE, or AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
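Before deploying, it can be useful to confirm which of these variables are actually visible to the process. The following is a small sketch using only the standard library; `describe_aws_credentials` is a hypothetical helper, and it only reports which variables are set without validating them against AWS.

```python
import os
from typing import Mapping, Optional


def describe_aws_credentials(env: Optional[Mapping[str, str]] = None) -> dict[str, bool]:
    """Report which AWS credential settings are present in the environment.

    Checks only for non-empty values; it does not contact AWS.
    Hypothetical helper, not part of aws_grobid.
    """
    env = os.environ if env is None else env
    return {
        "profile": bool(env.get("AWS_PROFILE")),
        "access_keys": bool(
            env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY")
        ),
    }


# A .env file for the profile-based setup might contain a single line
# (placeholder value):
#   AWS_PROFILE=my-profile
```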
File details
Details for the file aws_grobid-0.1.0.tar.gz.
File metadata
- Download URL: aws_grobid-0.1.0.tar.gz
- Upload date:
- Size: 19.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1e849801ecf26524f9224ff29368eed2d5cd96a18a43beebeac8737963c56709 |
| MD5 | e93a5fa81006da5f8fd3b3183db84d30 |
| BLAKE2b-256 | b01d372b55902c4e63742f6c503ca3ec01183c79173ea03870d05ca367d5dfaf |
Provenance
The following attestation bundles were made for aws_grobid-0.1.0.tar.gz:
Publisher: ci.yml on evamaxfield/aws-grobid
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: aws_grobid-0.1.0.tar.gz
- Subject digest: 1e849801ecf26524f9224ff29368eed2d5cd96a18a43beebeac8737963c56709
- Sigstore transparency entry: 187980756
- Sigstore integration time:
- Permalink: evamaxfield/aws-grobid@ad9fd6abccb20928b9bb224e52036e023b921c26
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/evamaxfield
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci.yml@ad9fd6abccb20928b9bb224e52036e023b921c26
- Trigger Event: push
File details
Details for the file aws_grobid-0.1.0-py3-none-any.whl.
File metadata
- Download URL: aws_grobid-0.1.0-py3-none-any.whl
- Upload date:
- Size: 16.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d54a38366e736b9d995406159de409a2bad0c709523c6fc5d7a29df57197de48 |
| MD5 | e4ce09d9636d1ad08876f9d7e5442a8e |
| BLAKE2b-256 | 816f58c3da54468c3df72b49670071f1420c08ef01f5c99db9f27a76e9e4f7d5 |
Provenance
The following attestation bundles were made for aws_grobid-0.1.0-py3-none-any.whl:
Publisher: ci.yml on evamaxfield/aws-grobid
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: aws_grobid-0.1.0-py3-none-any.whl
- Subject digest: d54a38366e736b9d995406159de409a2bad0c709523c6fc5d7a29df57197de48
- Sigstore transparency entry: 187980758
- Sigstore integration time:
- Permalink: evamaxfield/aws-grobid@ad9fd6abccb20928b9bb224e52036e023b921c26
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/evamaxfield
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: ci.yml@ad9fd6abccb20928b9bb224e52036e023b921c26
- Trigger Event: push