Pacifica Ingest
Pacifica Ingest Services
This is the Pacifica Ingest Services API.
This service receives, validates and processes data that is provided by a Pacifica Uploader service.
Installing the Service
Prerequisites
To run the code, the following commands are required:
- Docker Compose (the docker-compose command)
The Manual Way
Install the dependencies using the pip command:
pip install -r requirements.txt
Build and install the code using the setup.py script:
python setup.py build
python setup.py install
Running the Service
Start and run a new instance using the docker-compose command:
docker-compose up
Bundle Format
The bundle format is parsed using the tarfile package from the Python standard library.
Both data and metadata are stored in a bundle. Metadata is stored in the metadata.txt file (JSON format). Data is stored in the data/ directory.
To display the contents of a bundle, use the tar command:
tar -tf mybundle.tar
For example, the contents of mybundle.tar are:
data/mywork/project/proposal.doc
data/mywork/experiment/results.csv
data/mywork/experiment/results.doc
metadata.txt
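The layout above can be reproduced with the same tarfile package the service uses for parsing. The following is a minimal sketch, not the project's own tooling: the helper names build_bundle and list_bundle are hypothetical, and the metadata passed in is a placeholder because the required metadata schema is not described in this document.

```python
import io
import json
import tarfile


def build_bundle(bundle_path, metadata, files):
    """Write a tar bundle with files under data/ and a metadata.txt entry.

    `files` maps a path relative to data/ to that file's bytes.
    `metadata` is any JSON-serializable object (schema is an assumption).
    """
    with tarfile.open(bundle_path, "w") as tar:
        for rel_path, payload in files.items():
            info = tarfile.TarInfo(name="data/" + rel_path)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
        # Metadata is stored as JSON in metadata.txt at the bundle root.
        meta_bytes = json.dumps(metadata).encode("utf-8")
        info = tarfile.TarInfo(name="metadata.txt")
        info.size = len(meta_bytes)
        tar.addfile(info, io.BytesIO(meta_bytes))


def list_bundle(bundle_path):
    """Return the member names of a bundle, similar to `tar -tf`."""
    with tarfile.open(bundle_path, "r") as tar:
        return tar.getnames()
```

Calling list_bundle("mybundle.tar") on a bundle built this way returns the data/ paths followed by metadata.txt, matching the listing shown above.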
API Examples
The endpoints that define the ingest process are as follows. These examples assume that you know the IP address and port on which the WSGI service is listening.
Ingest (Single HTTP Request)
Post a bundle (defined above) to the endpoint.
POST /ingest
... tar bundle as body ...
The response will be the job ID information as if you requested it directly.
{
"job_id": 1234,
"state": "OK",
"task": "UPLOADING",
"task_percent": "0.0",
"updated": "2018-01-25 16:54:50",
"created": "2018-01-25 16:54:50",
"exception": ""
}
Failures at this endpoint occur while the bundle is being uploaded. Clients sending data to this endpoint should expect long-running HTTP POST requests that may last longer than they are used to handling.
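Because the upload can run for a long time, a client may prefer to stream the bundle from disk with a generous timeout instead of reading it into memory. The sketch below uses only the Python standard library. The helper names, the base URL passed by the caller, the Content-Type header, and the timeout value are all assumptions; this document does not state which headers the service expects.

```python
import json
import os
import urllib.request


def build_ingest_request(base_url, bundle_path):
    """Prepare (but do not send) a streaming POST of a bundle to /ingest."""
    body = open(bundle_path, "rb")
    headers = {
        # Assumption: the expected content type is not documented here.
        "Content-Type": "application/octet-stream",
        # An explicit length avoids chunked transfer encoding, which
        # urllib would otherwise use when the body is a file object.
        "Content-Length": str(os.path.getsize(bundle_path)),
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/ingest",
        data=body,
        headers=headers,
        method="POST",
    )


def post_bundle(base_url, bundle_path, timeout=3600):
    """Upload the bundle and return the decoded job status document."""
    request = build_ingest_request(base_url, bundle_path)
    try:
        # The timeout applies to each blocking socket operation.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    finally:
        request.data.close()
```

A call such as post_bundle("http://localhost:8080", "mybundle.tar") would return a dictionary shaped like the response above, where the host and port are placeholders for your deployment.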
Move (Single HTTP Request)
Post a metadata document to the endpoint.
POST /move
... content of move-md.json ...
The response will be the job ID information as if you requested it directly.
{
"job_id": 1234,
"state": "OK",
"task": "UPLOADING",
"task_percent": "0.0",
"updated": "2018-01-25 16:54:50",
"created": "2018-01-25 16:54:50",
"exception": ""
}
Get State for Job
Use the job_id field from the HTTP response to an ingest request to query the state of the job.
GET /get_state?job_id=1234
{
"job_id": 1234,
"state": "OK",
"task": "ingest files",
"task_percent": "0.0",
"updated": "2018-01-25 17:00:32",
"created": "2018-01-25 16:54:50",
"exception": ""
}
Errors may occur while the bundle of data is being processed. If that happens, the following will be returned. When consuming this endpoint, plan for failures: consider logging the failure or showing the user a visible message that the ingest failed.
GET /get_state?job_id=1234
{
"job_id": 1234,
"state": "FAILED",
"task": "ingest files",
"task_percent": "0.0",
"updated": "2018-01-25 17:01:02",
"created": "2018-01-25 16:54:50",
"exception": "... some crazy python back trace ..."
}
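A client that plans for failures might poll the state endpoint and raise an error as soon as the job reports FAILED. The following is a rough sketch under stated assumptions: the function names, polling interval, and poll limit are hypothetical, and treating a task_percent of 100 as "done" is an assumption, since this document does not show what a finished job looks like.

```python
import json
import time
import urllib.request


def fetch_state(base_url, job_id):
    """GET /get_state for one job and decode the JSON body."""
    url = "{}/get_state?job_id={}".format(base_url.rstrip("/"), job_id)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def wait_for_job(get_state, job_id, interval=5.0, max_polls=120):
    """Poll until the job fails, reports 100 percent, or polls run out.

    `get_state` is any callable taking a job ID and returning the
    status dictionary, which keeps the loop easy to test.
    """
    for _ in range(max_polls):
        status = get_state(job_id)
        if status["state"] == "FAILED":
            # Surface the server-side exception text to the caller.
            raise RuntimeError(
                "ingest job {} failed: {}".format(job_id, status["exception"])
            )
        # Assumption: 100 percent means no further progress is expected.
        if float(status["task_percent"]) >= 100.0:
            return status
        time.sleep(interval)
    raise TimeoutError("ingest job {} did not finish".format(job_id))
```

For example, wait_for_job(lambda job_id: fetch_state("http://localhost:8080", job_id), 1234) polls a service at a placeholder address. Catching RuntimeError at the call site is a natural place to log the failure or show it to the user.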
CLI Tools
There is an admin tool that consists of subcommands for manipulating ingest processes.
Job Subcommand
The job subcommand allows administrators to directly manipulate the state of a job. In complex computing environments, some jobs may get "stuck" in a state where they have neither failed nor are progressing. This may happen for any number of reasons, but the solution is to manually fail the job.
IngestCMD job \
--job-id 1234 \
--state FAILED \
--task 'ingest files' \
--task-percent 0.0 \
--exception 'Failed by administrator'
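When several stuck jobs need to be failed, the same invocation could be scripted. The sketch below only assembles and runs the command line shown above. The helper names are hypothetical, and it assumes the IngestCMD executable is on the PATH.

```python
import subprocess


def fail_job_command(job_id, task, reason, task_percent=0.0):
    """Build the argument list for failing one job with IngestCMD."""
    return [
        "IngestCMD", "job",
        "--job-id", str(job_id),
        "--state", "FAILED",
        "--task", task,
        "--task-percent", str(task_percent),
        "--exception", reason,
    ]


def fail_job(job_id, task, reason):
    """Run the command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(fail_job_command(job_id, task, reason), check=True)
```

Passing the arguments as a list avoids shell quoting problems with values such as 'ingest files'.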
Contributions
Contributions are accepted on GitHub via the fork and pull request workflow. GitHub has a good help article if you are unfamiliar with this method of contributing.