Synapse flat file validation and processing pipeline
Synapse Genie
Introduction
This package can deploy an AACR GENIE-like project on Synapse and perform validation and processing of files.
Installation
Dependencies:
- Python 3.6 or higher
- synapseclient (pip install synapseclient)
- Python pandas (pip install pandas)
pip install synapsegenie
synapsegenie -v
Usage
Creating your own registry
Please view the example registry to learn how to utilize synapsegenie. synapsegenie allows a user to create a registry package with a list of file formats. Each of these file format classes should extend synapsegenie.example_filetype_format.FileTypeFormat. Learn more about creating Python packages here. Once you have installed your registry package, you can use the synapsegenie command line client.
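The shape of a registry package can be sketched as below. This is a minimal illustration, not the library's actual interface: the `FileTypeFormat` class here is a local stand-in for `synapsegenie.example_filetype_format.FileTypeFormat`, and the names `file_type`, `required_columns`, `validate`, `ClinicalFormat`, and `FORMAT_CLASSES` are all hypothetical. Check the example registry for the real attributes and hooks before writing your own classes.

```python
import csv
import io


class FileTypeFormat:
    """Local stand-in for synapsegenie.example_filetype_format.FileTypeFormat.

    Assumption: the real base class has a different, richer interface.
    """

    file_type = None          # hypothetical: short name of the format
    required_columns = ()     # hypothetical: columns every file must have

    def validate(self, text):
        """Return a list of error messages; an empty list means valid."""
        reader = csv.DictReader(io.StringIO(text), delimiter="\t")
        header = reader.fieldnames or []
        return [
            f"{self.file_type}: missing required column {column}"
            for column in self.required_columns
            if column not in header
        ]


class ClinicalFormat(FileTypeFormat):
    """Hypothetical tab-delimited format with two required columns."""

    file_type = "clinical"
    required_columns = ("SAMPLE_ID", "PATIENT_ID")


# Hypothetical: the list of file format classes a registry package exposes.
FORMAT_CLASSES = [ClinicalFormat]
```

Each format is a small subclass that declares what it expects, so adding a new file type to the registry means adding one class to the list.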
synapsegenie Synapse project
A synapsegenie Synapse project must exist for you to fully utilize this package. There is now a command to create this infrastructure in Synapse. If you already have an existing Synapse project you would like to use, please use the --project_id parameter; otherwise, please use the --project_name parameter to create a new Synapse project.
synapsegenie bootstrap-infra --format_registry_packages example_registry \
    --project_name "My Project Name" \
    --centers AAA BBB CCC
If you decide to add centers at a later date, you can re-run this command with the full list of centers and the new ones will be added.
synapsegenie bootstrap-infra --format_registry_packages example_registry \
    --project_id syn12345 \
    --centers AAA BBB CCC DDD
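If you script the bootstrap step, the choice between --project_id and --project_name can be enforced before the command is run. The following is a minimal sketch: `build_bootstrap_command` is a hypothetical helper, not part of synapsegenie, and it only assembles the flags shown above for a single registry package.

```python
import shlex


def build_bootstrap_command(registry_package, centers,
                            project_id=None, project_name=None):
    """Build the argument list for `synapsegenie bootstrap-infra`.

    Exactly one of project_id (reuse an existing project) or
    project_name (create a new project) must be given.
    """
    if (project_id is None) == (project_name is None):
        raise ValueError("Pass exactly one of project_id or project_name")
    command = ["synapsegenie", "bootstrap-infra",
               "--format_registry_packages", registry_package]
    if project_id is not None:
        command += ["--project_id", project_id]
    else:
        command += ["--project_name", project_name]
    command += ["--centers", *centers]
    return command


command = build_bootstrap_command(
    "example_registry", ["AAA", "BBB", "CCC"], project_name="My Project Name"
)
# Quote each argument so the printed line can be pasted into a shell.
printable = " ".join(shlex.quote(argument) for argument in command)
print(printable)
```

The returned list can be passed to `subprocess.run(command, check=True)` once synapsegenie and your registry package are installed.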
File Validator
The synapsegenie package also has a command to run the validator locally on your files. Please view the help to see how to run the validator.
synapsegenie validate-single-file -h
# Run bootstrap-infra first to create a Synapse project
synapsegenie validate-single-file /path/to/file center_name \
    --format_registry_packages example_registry \
    --project_id syn12345
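Because validate-single-file takes one file at a time, checking several files locally means issuing one command per file. A minimal sketch of that loop follows; `build_validate_commands` is a hypothetical helper, and the file paths are placeholders.

```python
def build_validate_commands(file_paths, center, registry_package, project_id):
    """Return one `synapsegenie validate-single-file` argument list per file."""
    return [
        ["synapsegenie", "validate-single-file", str(path), center,
         "--format_registry_packages", registry_package,
         "--project_id", project_id]
        for path in file_paths
    ]


commands = build_validate_commands(
    ["/path/to/file_a", "/path/to/file_b"],  # placeholder paths
    "AAA", "example_registry", "syn12345",
)
for command in commands:
    # Print only; to run each one, import subprocess and call
    # subprocess.run(command, check=True) here instead.
    print(" ".join(command))
```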
Validation/Processing
synapsegenie will validate and process all the files uploaded by centers. Every valid file will be processed and uploaded into Synapse tables.
synapsegenie process -h
# only validate
synapsegenie process --format_registry_packages example_registry \
    --project_id syn12345 \
    --only_validate
# validate + process
synapsegenie process --format_registry_packages example_registry \
--project_id syn12345
Contributing
To learn how to contribute, please read the contributing guide.