HCA Ingest Service neo4j graph validator package

HCA Ingest Service Graph Validation Suite

What this is useful for within the scope of the HCA:

  1. Enables data wranglers to visually analyze the relationships inside a submission to look for inconsistencies.
  2. Provides an automated graph validator, for which tests can be created using the analysis from step 1, and which can be run fully containerized.

Features

The suite is divided into two separate, extensible parts:

  • hydrators enable users to import data into a graph database and populate it. They are not called importers because import is a reserved keyword in Python, and "from importers import importer" is a bit confusing.

  • actions provide different tools to work with the generated graph. The first and most important is running a series of tests to validate the constraints data wranglers want to impose on submissions. Another action is generating reports and extracting statistics from the graph to send to the submitters. Further actions can be implemented to extend the suite.
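The split between hydrators and actions can be pictured with a small sketch. Everything named here (Graph, EdgeListHydrator, orphan_nodes) is hypothetical and is not the package's actual API; an in-memory structure stands in for the Neo4j database, and the node names are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """In-memory stand-in for the graph database (assumption, not the real backend)."""
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (source, target) pairs


class EdgeListHydrator:
    """Hypothetical hydrator: populates the graph from (source, target) pairs."""

    def __init__(self, pairs, isolated=()):
        self.pairs = list(pairs)
        self.isolated = list(isolated)

    def hydrate(self, graph):
        for source, target in self.pairs:
            graph.nodes.update((source, target))
            graph.edges.add((source, target))
        # Nodes that were submitted without any relationship.
        graph.nodes.update(self.isolated)


def orphan_nodes(graph):
    """Hypothetical test action: return nodes that take part in no relationship."""
    linked = {node for edge in graph.edges for node in edge}
    return sorted(graph.nodes - linked)


graph = Graph()
EdgeListHydrator(
    pairs=[("donor_1", "specimen_1"), ("specimen_1", "file_1")],
    isolated=["file_2"],
).hydrate(graph)
print(orphan_nodes(graph))  # prints ['file_2']
```

A new hydrator only needs to know how to fill the graph, and a new action only needs to know how to read it, which is what keeps the two parts independently extensible.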

Functionality

The functionality planned so far is as follows (WIP items are not yet fully implemented):

  • Hydrators:

    • Ingest Service Spreadsheet.
    • Ingest Service API Submission.
    • BioSamples API (WIP).
  • Actions:

    • Opening an interactive visualizer to query the graph.
    • Running tests on the graph.
    • Generating reports for the graph (WIP).

Installation

The graph validator suite requires Docker to be running on the host machine.
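One way to check this requirement before installing is sketched below. It is not part of the suite; the helper name docker_is_running is made up, and it assumes that the standard "docker info" command exits with a non-zero status when the daemon cannot be reached.

```python
import shutil
import subprocess


def docker_is_running(executable: str = "docker", timeout: float = 15.0) -> bool:
    """Return True if the Docker CLI is installed and can reach a running daemon."""
    if shutil.which(executable) is None:
        return False  # the CLI is not on PATH
    try:
        # Assumption: `docker info` fails when the daemon is unreachable.
        result = subprocess.run(
            [executable, "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=False,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("Docker running:", docker_is_running())
```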

From the git repo

git clone git@github.com:HumanCellAtlas/ingest-graph-validator.git
cd ingest-graph-validator
pip install .

From PyPI

A Python package has been published on PyPI (https://pypi.org/project/ingest-graph-validator).

pip install ingest-graph-validator

Usage

Basic usage for data wranglers

ingest-graph-validator init
ingest-graph-validator hydrate xls <spreadsheet filename>

After the hydrator is done loading the data, point a browser to http://localhost:7474 to take a look at the graph.
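If the page does not load straight away, the backend may still be starting. The sketch below polls the URL until it answers; it is a generic helper and not part of the suite, and the function name wait_for_http and its defaults are assumptions.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Return True once `url` answers an HTTP request, False after `timeout` seconds."""
    # Ignore any configured proxies: the visualizer runs on the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status.
            return True
        except OSError:
            # Connection refused or timed out: the server is not up yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    print("Visualizer reachable:", wait_for_http("http://localhost:7474", timeout=5.0))
```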

More help

The suite uses a CLI similar to git. Running a command without specifying anything else will show help for that command. At each level, the commands have different arguments and options. Running any subcommand with -h or --help will give you more information about it.

The root level commands are:

  • ingest-graph-validator init starts the database backend and enables a frontend visualizer to query the database, at http://localhost:7474 by default.

  • ingest-graph-validator hydrate shows the list of available hydrators.

  • ingest-graph-validator actions shows the list of available actions.

  • ingest-graph-validator shutdown stops the backend.
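The commands above can also be driven from a script. The sketch below only uses the init and hydrate xls commands documented in this README; the function names and the spreadsheet filename are hypothetical, and the example prints the commands rather than running them.

```python
import subprocess

CLI = "ingest-graph-validator"


def wrangler_workflow(spreadsheet: str) -> list:
    """Commands for the basic session: start the backend, then load a spreadsheet."""
    return [
        [CLI, "init"],
        [CLI, "hydrate", "xls", spreadsheet],
    ]


def run_all(commands: list) -> None:
    """Run each command in order, raising CalledProcessError at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # "my_submission.xlsx" is a placeholder filename.
    for argv in wrangler_workflow("my_submission.xlsx"):
        print(" ".join(argv))
```

Passing each command as a list of arguments avoids shell quoting problems when the spreadsheet path contains spaces.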

Containerized execution

WIP

Credits

This package was created with Cookiecutter.

Download files

Files for ingest-graph-validator, version 0.2.0

Filename                                            Size      File type   Python version
ingest_graph_validator-0.2.0-py2.py3-none-any.whl   16.9 kB   Wheel       py2.py3
ingest-graph-validator-0.2.0.tar.gz                 13.7 kB   Source      None
