Common helper modules shared by the pipeline services

Project description

LIBDRM Package

Python-based modules representing the core functionality of the SMDRM application.

Currently, there are four core modules: APIs, Elastic, Pipeline, and Schemas.

For more info, go to the source code on GitHub.

Modules

APIs

This module holds information on available APIs to execute operations such as annotation and caching of data points.

It contains a lookup table with the URLs needed to contact them. It also implements functions to check, or wait on, their current status. The latter is particularly useful during SMDRM application microservice startup.
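A minimal sketch of the lookup-table-plus-wait idea, using only the standard library. The service names, URLs, and function names below are hypothetical placeholders, not the actual libdrm API.

```python
import http.client
import time
import urllib.request

# Hypothetical lookup table: service names and URLs are placeholders.
API_LOOKUP = {
    "annotator": "http://localhost:5001/status",
    "cache": "http://localhost:5002/status",
}


def is_up(url, timeout=2.0):
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, malformed reply.
        return False


def wait_for(name, retries=5, delay=1.0, probe=is_up):
    """Poll the API registered under `name` until it responds or retries run out."""
    url = API_LOOKUP[name]
    for attempt in range(retries):
        if probe(url):
            return True
        if attempt < retries - 1:
            time.sleep(delay)
    return False
```

At microservice startup, a call such as `wait_for("annotator", retries=30)` would block until the dependency answers or the retry budget is spent. The `probe` parameter exists only so the polling logic can be exercised without a network.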

Elastic

This module contains a custom Elasticsearch client that performs the following API operations:

  • create index template
  • create/delete index
  • add document
  • add document batch

It also contains the Elasticsearch template mappings definition, which sets the data structure of indexed data points.
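To illustrate the "add document batch" operation: Elasticsearch's `_bulk` endpoint accepts newline-delimited JSON in which each document is preceded by an action line. The sketch below builds such a body with the standard library only. Whether the libdrm client uses `_bulk` is an assumption, and the function name, index name, and document fields are hypothetical.

```python
import json


def build_bulk_body(index, documents):
    """Serialize documents into the NDJSON body expected by the _bulk endpoint."""
    lines = []
    for doc in documents:
        # Action line: index this document into `index` (ID auto-generated).
        lines.append(json.dumps({"index": {"_index": index}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)
```

The resulting string would be sent with the `Content-Type: application/x-ndjson` header.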

Pipeline

The SMDRM Data Processing Pipeline enables the creation of ad hoc data processing pipelines tailored to the task at hand, using a combination of the Template and Bridge design patterns.

It also defines the required fields of the Disaster (data point) model via the DisasterModel class, using Pydantic.
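As a rough illustration of how a Pydantic model enforces required fields, consider the sketch below. The field names are hypothetical; the actual DisasterModel fields are defined in the libdrm source and may differ. It requires the third-party `pydantic` package.

```python
from pydantic import BaseModel, ValidationError


class DisasterModel(BaseModel):
    # Hypothetical fields; the real model in libdrm may differ.
    id: int
    created_at: str
    text: str
    lang: str = "en"  # optional, with a default


def parse_datapoint(raw):
    """Return a validated model, or None if required fields are missing or invalid."""
    try:
        return DisasterModel(**raw)
    except ValidationError:
        return None
```

A data point that lacks `text`, or whose `id` cannot be read as an integer, is rejected before it reaches later steps.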

A pipeline is built from a custom sequence of Steps that process the input data.
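The idea of composing steps can be sketched as follows. This is not the libdrm implementation: the class names and the two toy steps are hypothetical. The base class fixes the control flow in `run` and leaves `process` to subclasses (the template part), while the pipeline only depends on the step interface and delegates to whichever steps it is given.

```python
from abc import ABC, abstractmethod


class Step(ABC):
    """Template: `run` fixes the flow, subclasses fill in `process`."""

    def run(self, data):
        result = self.process(data)
        if result is None:
            raise ValueError(f"{type(self).__name__} returned no data")
        return result

    @abstractmethod
    def process(self, data):
        ...


class Strip(Step):
    """Toy step: trim whitespace around each item."""

    def process(self, data):
        return [item.strip() for item in data]


class DropEmpty(Step):
    """Toy step: discard empty items."""

    def process(self, data):
        return [item for item in data if item]


class Pipeline:
    """Runs whichever steps it is given, in order."""

    def __init__(self, steps):
        self.steps = list(steps)

    def execute(self, data):
        for step in self.steps:
            data = step.run(data)
        return data
```

A task-specific pipeline is then just a different list of steps, for example `Pipeline([Strip(), DropEmpty()])`.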

The most important steps include:

  • read zip files
  • read JSON files (inside the zip file)
  • validate/parse JSON file content
  • read JSON file content in batches (useful for annotation)
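The steps listed above can be approximated with standard-library generators. This sketch assumes the JSON files inside the zip are newline-delimited (one record per line), which the description does not state; the function names are hypothetical.

```python
import io
import json
import zipfile
from itertools import islice


def iter_json_lines(zip_source):
    """Yield parsed records from every .json member of a zip archive.

    Assumes newline-delimited JSON; blank and invalid lines are skipped.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.endswith(".json"):
                continue
            with archive.open(name) as member:
                for raw in io.TextIOWrapper(member, encoding="utf-8"):
                    line = raw.strip()
                    if not line:
                        continue
                    try:
                        yield json.loads(line)
                    except json.JSONDecodeError:
                        continue


def in_batches(records, size):
    """Group an iterable of records into lists of at most `size` items."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Because both functions are lazy, `in_batches(iter_json_lines(path), 100)` reads the archive incrementally and hands fixed-size batches to a downstream consumer such as an annotator.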

Schemas

This module uses Marshmallow data model schemas to validate the uploaded zip file and its metadata. It ensures the validity of user input, preventing invalid data from entering the data processing pipeline.

Publish

For more info, check the publish.yml GitHub Action.

Download files

Source Distribution

libdrm-0.1.2.tar.gz (13.5 kB)