Pacifica Ingest

Pacifica Ingest Services

This is the Pacifica Ingest Services API.

This service receives, validates and processes data that is provided by a Pacifica Uploader service.

Installing the Service

Prerequisites

To run the code, the commands used in the steps below are required: pip, python, and docker-compose.

The Manual Way

Install the dependencies using the pip command:

pip install -r requirements.txt

Build and install the code using the setup.py script:

python setup.py build
python setup.py install

Running the Service

Start and run a new instance using the docker-compose command:

docker-compose up

Bundle Format

The bundle format is parsed using the tarfile package from the Python standard library.

Both data and metadata are stored in a bundle. Metadata is stored in the metadata.txt file (JSON format). Data is stored in the data/ directory.

Display the contents of a bundle using the tar command:

tar -tf mybundle.tar

For example, the contents of mybundle.tar are:

data/mywork/project/proposal.doc
data/mywork/experiment/results.csv
data/mywork/experiment/results.doc
metadata.txt
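
The layout above can be reproduced with the same tarfile package. The following is a minimal sketch, not the service's own code: the helper names build_bundle and list_bundle are hypothetical, and the metadata document is whatever JSON the caller supplies, since its schema is not described here.

```python
import io
import json
import tarfile


def build_bundle(metadata, files):
    """Return the bytes of a tar bundle (hypothetical helper).

    metadata: any JSON-serialisable object, written to metadata.txt.
    files: mapping of relative path -> bytes, written under data/.
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as bundle:
        for relative_path, payload in files.items():
            info = tarfile.TarInfo(name="data/" + relative_path)
            info.size = len(payload)
            bundle.addfile(info, io.BytesIO(payload))
        # Metadata lives next to the data/ directory, as JSON text.
        meta_bytes = json.dumps(metadata).encode("utf-8")
        info = tarfile.TarInfo(name="metadata.txt")
        info.size = len(meta_bytes)
        bundle.addfile(info, io.BytesIO(meta_bytes))
    return buffer.getvalue()


def list_bundle(bundle_bytes):
    """Return member names, the Python equivalent of `tar -tf`."""
    with tarfile.open(fileobj=io.BytesIO(bundle_bytes), mode="r") as bundle:
        return bundle.getnames()
```

Writing the returned bytes to mybundle.tar produces a file that `tar -tf` can list in the same way.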

API Examples

The endpoints that define the ingest process are as follows. The assumption is that the installer knows the IP address and port the WSGI service is listening on.

Ingest (Single HTTP Request)

Post a bundle (defined above) to the endpoint.

POST /ingest
... tar bundle as body ...

The response is the job state document, the same one returned by requesting the state of that job directly.

{
  "job_id": 1234,
  "state": "OK",
  "task": "UPLOADING",
  "task_percent": "0.0",
  "updated": "2018-01-25 16:54:50",
  "created": "2018-01-25 16:54:50",
  "exception": ""
}

Failures reported by this endpoint occur while the bundle is being uploaded. Clients sending data to this endpoint should expect long-running HTTP POST requests that may last longer than they are used to handling.
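
A client-side sketch of such an upload, using only the Python standard library, might look like the following. The function names, the one-hour timeout, and the Content-Type value are assumptions made for illustration; they are not taken from the service documentation. The bundle is streamed from disk instead of being read into memory.

```python
import json
import os
import urllib.request


def build_ingest_request(base_url, bundle_file):
    """Build a streaming POST /ingest request (hypothetical helper).

    bundle_file must be an open binary file. Content-Length is set
    explicitly so the body is sent as-is rather than chunked.
    """
    size = os.fstat(bundle_file.fileno()).st_size
    return urllib.request.Request(
        base_url.rstrip("/") + "/ingest",
        data=bundle_file,
        method="POST",
        headers={
            # Assumed content type; adjust to what your deployment expects.
            "Content-Type": "application/octet-stream",
            "Content-Length": str(size),
        },
    )


def post_bundle(base_url, bundle_path, timeout=3600):
    """Upload a bundle and return the parsed job state document.

    The generous timeout (an assumption) allows for long uploads.
    """
    with open(bundle_path, "rb") as bundle_file:
        request = build_ingest_request(base_url, bundle_file)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
```

For example, `post_bundle("http://ingest.example.org:8066", "mybundle.tar")["job_id"]` would yield the job ID to poll, where the host and port are placeholders for wherever the WSGI service is listening.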

Move (Single HTTP Request)

Post a metadata document to the endpoint.

POST /move
... content of move-md.json ...

The response is the job state document, the same one returned by requesting the state of that job directly.

{
  "job_id": 1234,
  "state": "OK",
  "task": "UPLOADING",
  "task_percent": "0.0",
  "updated": "2018-01-25 16:54:50",
  "created": "2018-01-25 16:54:50",
  "exception": ""
}

Get State for Job

Use the job_id field from the HTTP response to an ingest request to query the state of the job.

GET /get_state?job_id=1234
{
  "job_id": 1234,
  "state": "OK",
  "task": "ingest files",
  "task_percent": "0.0",
  "updated": "2018-01-25 17:00:32",
  "created": "2018-01-25 16:54:50",
  "exception": ""
}

Errors may occur while the bundle is being processed. If that happens, a response like the following is returned. When consuming this endpoint, plan for failures: consider logging the error or showing the user a visible message that the ingest failed.

GET /get_state?job_id=1234
{
  "job_id": 1234,
  "state": "FAILED",
  "task": "ingest files",
  "task_percent": "0.0",
  "updated": "2018-01-25 17:01:02",
  "created": "2018-01-25 16:54:50",
  "exception": "... some crazy python back trace ..."
}
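
One way to plan for failures on the client side is sketched below. The names IngestFailed, state_url, fetch_state, and raise_if_failed are hypothetical; only the /get_state endpoint and the response fields come from the examples above.

```python
import json
import urllib.parse
import urllib.request


class IngestFailed(RuntimeError):
    """Raised when the service reports state FAILED (hypothetical)."""


def state_url(base_url, job_id):
    """Build the GET /get_state URL for a job."""
    query = urllib.parse.urlencode({"job_id": job_id})
    return base_url.rstrip("/") + "/get_state?" + query


def fetch_state(base_url, job_id, timeout=30):
    """Request and parse the job state document."""
    with urllib.request.urlopen(state_url(base_url, job_id), timeout=timeout) as response:
        return json.load(response)


def raise_if_failed(state):
    """Return the state unchanged, or raise with the server's details."""
    if state.get("state") == "FAILED":
        raise IngestFailed(
            "job {} failed during {!r}: {}".format(
                state.get("job_id"), state.get("task"), state.get("exception")
            )
        )
    return state
```

A caller can wrap `raise_if_failed(fetch_state(base_url, job_id))` in a try block and log the exception text or surface it to the user.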

CLI Tools

There is an admin tool that consists of subcommands for manipulating ingest processes.

Job Subcommand

The job subcommand allows administrators to directly manipulate the state of a job. In complex computing environments, some jobs may get "stuck" in a state where they have neither failed nor are progressing. This can happen for any number of reasons; the solution is to manually fail the job.

IngestCMD job \
    --job-id 1234 \
    --state FAILED \
    --task 'ingest files' \
    --task-percent 0.0 \
    --exception 'Failed by administrator'
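
When several stuck jobs need to be failed, the command can be assembled from a script. This is only a sketch: fail_job_command is a hypothetical helper, and it uses nothing beyond the flags shown in the example above.

```python
import shlex


def fail_job_command(job_id, task, reason, task_percent=0.0):
    """Return the argv list that marks one job as FAILED."""
    return [
        "IngestCMD", "job",
        "--job-id", str(job_id),
        "--state", "FAILED",
        "--task", task,
        "--task-percent", str(task_percent),
        "--exception", reason,
    ]


def as_shell(argv):
    """Render an argv list as a copy-pasteable shell command."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

The list form can be passed directly to `subprocess.run(argv, check=True)`, which avoids shell quoting issues in the task and exception strings.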

Contributions

Contributions are accepted on GitHub via the fork and pull request workflow. GitHub has a good help article if you are unfamiliar with this method of contributing.
