Synapse flat file validation and processing pipeline

Synapse Genie

Introduction

This package can deploy an AACR GENIE-like project on Synapse and perform validation and processing of files.

Installation

Dependencies:

  • Python 3.6 or higher
  • synapseclient (pip install synapseclient)
  • Python pandas (pip install pandas)

pip install synapsegenie
synapsegenie -v

Usage

Creating your own registry

Please view the example registry to learn how to use synapsegenie. synapsegenie allows a user to create a registry package containing a list of file formats. Each file format class should extend synapsegenie.example_filetype_format.FileTypeFormat. Learn more about creating Python packages here. Once your registry package is installed, you can use the synapsegenie command line client.
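
As a rough illustration of what a registry of file formats might look like, here is a minimal, self-contained sketch. It does not import synapsegenie: the FileTypeFormat class below is a local stand-in, and the names file_type, required_columns, validate, REGISTRY, and ClinicalFormat are all hypothetical. The real base class may expect different attributes and methods, so consult the example registry before writing your own.

```python
import csv
import io


class FileTypeFormat:
    """Local stand-in, NOT the real synapsegenie base class."""

    file_type = None  # hypothetical attribute naming the format

    def validate(self, rows):
        """Return a list of error messages (empty if the rows are valid)."""
        raise NotImplementedError


class ClinicalFormat(FileTypeFormat):
    """Hypothetical format that requires SAMPLE_ID and CENTER columns."""

    file_type = "clinical"
    required_columns = ("SAMPLE_ID", "CENTER")

    def validate(self, rows):
        if not rows:
            return ["file is empty"]
        missing = [c for c in self.required_columns if c not in rows[0]]
        if missing:
            return ["missing column: " + c for c in missing]
        errors = []
        for number, row in enumerate(rows, start=1):
            for column in self.required_columns:
                if not row[column]:
                    errors.append("row %d: %s is blank" % (number, column))
        return errors


# In this sketch, the "registry" is simply the list of format classes
# that a package exposes.
REGISTRY = [ClinicalFormat]

# Validate a small tab-delimited flat file held in memory.
text = "SAMPLE_ID\tCENTER\nS1\tAAA\nS2\t\n"
rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
errors = ClinicalFormat().validate(rows)
print(errors)
```

The point of the sketch is the shape: one class per file format, each carrying its own validation rules, collected in one importable package.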

synapsegenie Synapse project

A synapsegenie Synapse project must exist for you to fully utilize this package. The bootstrap-infra command creates this infrastructure in Synapse. To use an existing Synapse project, pass the --project_id parameter; otherwise, pass the --project_name parameter to create a new Synapse project.

synapsegenie bootstrap-infra --format_registry_packages example_registry \
                             --project_name "My Project Name" \
                             --centers AAA BBB CCC

If you decide to add centers at a later date, you can re-run this command with the full list of centers, and the new centers will be added.

synapsegenie bootstrap-infra --format_registry_packages example_registry \
                             --project_id syn12345 \
                             --centers AAA BBB CCC DDD
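
If you script the bootstrap step, a small helper can assemble the command from the flags shown above. This is only a sketch: the function name bootstrap_infra_args is hypothetical, and the check that exactly one of project_id or project_name is given is an assumption of this example, not documented behavior of the CLI.

```python
import shlex


def bootstrap_infra_args(registry, centers, project_id=None, project_name=None):
    """Build the argument list for `synapsegenie bootstrap-infra`."""
    # Assumption of this sketch: exactly one of the two project options is used.
    if (project_id is None) == (project_name is None):
        raise ValueError("Pass exactly one of project_id or project_name")
    args = ["synapsegenie", "bootstrap-infra",
            "--format_registry_packages", registry]
    if project_id is not None:
        args += ["--project_id", project_id]
    else:
        args += ["--project_name", project_name]
    args += ["--centers"] + list(centers)
    return args


# Re-run against an existing project with one extra center (DDD).
cmd = bootstrap_infra_args("example_registry", ["AAA", "BBB", "CCC", "DDD"],
                           project_id="syn12345")
# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once synapsegenie and your registry package are installed.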

File Validator

The synapsegenie package also provides a command to run the validator locally on your files. Please view the help to see how to run the validator.

synapsegenie validate-single-file -h

# Run bootstrap-infra first to create a Synapse project
synapsegenie validate-single-file /path/to/file center_name \
             --format_registry_packages example_registry \
             --project_id syn12345

Validation/Processing

synapsegenie will validate and process all the files uploaded by centers. Every valid file will be processed and uploaded into Synapse tables.

synapsegenie process -h

# only validate
synapsegenie process --format_registry_packages example_registry \
                     --project_id syn12345 \
                     --only_validate

# validate + process
synapsegenie process --format_registry_packages example_registry \
                     --project_id syn12345
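
The difference between the two modes can be sketched in plain Python. The code below is not synapsegenie's implementation: run_pipeline, check_not_empty, and store_as_table are hypothetical names, and an in-memory dictionary stands in for Synapse tables. It only illustrates the flow described above, in which every file is validated and only valid files are processed.

```python
def run_pipeline(files, validate, process, only_validate=False):
    """Validate every file; process the valid ones unless only_validate is set."""
    report = {"valid": [], "invalid": [], "processed": []}
    for name, content in files.items():
        errors = validate(content)
        if errors:
            report["invalid"].append(name)
            continue
        report["valid"].append(name)
        if not only_validate:
            process(name, content)
            report["processed"].append(name)
    return report


def check_not_empty(content):
    """Toy validator: a file is valid if it has any non-blank content."""
    return [] if content.strip() else ["file is empty"]


tables = {}  # stand-in for Synapse tables


def store_as_table(name, content):
    """Toy processing step: store the file's lines under its name."""
    tables[name] = content.splitlines()


files = {"AAA_data.txt": "id\n1\n", "BBB_data.txt": ""}

dry = run_pipeline(files, check_not_empty, store_as_table, only_validate=True)
full = run_pipeline(files, check_not_empty, store_as_table)
print(dry)
print(full)
```

With only_validate set, the report lists valid and invalid files but nothing is stored; without it, only the valid file reaches the table stand-in.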

Contributing

To learn how to contribute, please read the contributing guide.

Download files

Source Distribution

synapsegenie-0.0.2.tar.gz (36.5 kB)

Uploaded Source

Built Distribution

synapsegenie-0.0.2-py3-none-any.whl (42.2 kB)

Uploaded Python 3

File details

Details for the file synapsegenie-0.0.2.tar.gz.

File metadata

  • Download URL: synapsegenie-0.0.2.tar.gz
  • Upload date:
  • Size: 36.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.10.4

File hashes

Hashes for synapsegenie-0.0.2.tar.gz
Algorithm    Hash digest
SHA256       6aa0992903ef0479fd5e3034daec43e4e68b7493c487e5d7e81e27d4180b9121
MD5          4802bba7bf0b9d55fa29edb8caa314a0
BLAKE2b-256  4ef9f134f5a6599b955c5f862793eee91129b4f3f8def821d981c1354f83cf3b

File details

Details for the file synapsegenie-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: synapsegenie-0.0.2-py3-none-any.whl
  • Upload date:
  • Size: 42.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.0 CPython/3.10.4

File hashes

Hashes for synapsegenie-0.0.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       17601cc8efb5c26a55d7a1c08ed5d0c6a4dee270433ce43ed859a33b074cc6bc
MD5          b849e80c7e0156104db737fb0023bec0
BLAKE2b-256  e195f3b920c92256d1a50c587b712ac94e38bce798ab9b47bd4ec2066be1790c
