Processing and validation for GENIE

AACR Project GENIE

Introduction

This repository documents code used to gather, QC, standardize, and analyze data uploaded by institutes participating in AACR's Project GENIE (Genomics, Evidence, Neoplasia, Information, Exchange).

Dependencies

You will need the following tools or packages to reproduce these results:

File Validator

pip install aacrgenie
genie -v

This installs everything you need to run the validator locally on all of your files, including the Synapse client. See the help output for how to run the validator.

genie validate -h
genie validate data_clinical_supp_SAGE.txt SAGE
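
To validate several files for one center, the command above can be driven from a short script. The following is a minimal sketch and not part of the aacrgenie package: `build_validate_command` and `validate_files` are hypothetical helper names, and the only CLI usage assumed is the `genie validate <file> <center>` form shown above.

```python
import subprocess
from typing import Dict, List


def build_validate_command(filepath: str, center: str) -> List[str]:
    """Return the argument list for `genie validate <file> <center>`."""
    if not filepath or not center:
        raise ValueError("both a file path and a center name are required")
    return ["genie", "validate", filepath, center]


def validate_files(filepaths: List[str], center: str) -> Dict[str, int]:
    """Run the validator once per file and map file path -> exit code.

    Assumes the `genie` executable (installed by `pip install aacrgenie`)
    is on PATH. The validator's messages go to the terminal as usual.
    """
    results = {}
    for path in filepaths:
        completed = subprocess.run(build_validate_command(path, center))
        results[path] = completed.returncode
    return results
```

Calling `validate_files(["data_clinical_supp_SAGE.txt"], "SAGE")` runs the same command as the example above. This README does not say how the exit code relates to validation errors, so read the validator's printed output rather than relying on the exit code alone.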

Development

Versioning

  1. Update the version in genie/version.py based on semantic versioning. Use the suffix -dev for development branch versions.
  2. When releasing, remove the -dev from the version.
  3. Add a tag and release named the same as the version.
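
The -dev convention in the steps above can be checked mechanically before tagging. This is a minimal sketch under stated assumptions: it accepts only MAJOR.MINOR.PATCH with an optional -dev suffix, the function names are hypothetical, and it works on a version string you pass in rather than reading genie/version.py (whose layout is not shown here).

```python
import re

# MAJOR.MINOR.PATCH with an optional "-dev" suffix. Any other suffix is
# rejected on purpose; this is an assumption, not a rule from the project.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(-dev)?")


def _parse(version: str) -> "re.Match[str]":
    match = _VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH[-dev] version: {version!r}")
    return match


def is_dev_version(version: str) -> bool:
    """True if the version carries the -dev suffix."""
    return _parse(version).group(4) is not None


def release_version(version: str) -> str:
    """Return the version with any -dev suffix removed (step 2 above)."""
    major, minor, patch, _ = _parse(version).groups()
    return f"{major}.{minor}.{patch}"
```

The string returned by `release_version` is also the name to use for the tag and release in step 3.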

SAGE BIONETWORKS USE ONLY

Batch Processing instructions

  1. Check the Docker Hub builds for any failures
  2. Log into AWS Batch
  3. Run genie-job-mainprocess
  4. Run genie-job-mafprocess (make sure to add the --createdMafDatabase flag)
  5. Run genie-job-release (Make sure to update release version and number)

Processing on EC2

  1. Input to database: python input_to_database.py -h
  2. Create GENIE files. Example releases:

a. Release 4.1-consortium and 4.0-public

python database_to_staging.py Jan-2018 ~/cbioportal/ 4.1-consortium --skipMutationsInCis
python consortium_to_public.py Jul-2018 ~/cbioportal/ 4.0-public

b. Release 5.1-consortium and 5.0-public

python database_to_staging.py Jul-2018 ~/cbioportal/ 5.1-consortium
python consortium_to_public.py Jan-2019 ~/cbioportal/ 5.0-public
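
Each release pair above follows the same shape: one database_to_staging.py call for the consortium release, then one consortium_to_public.py call for the public release. The sketch below only assembles those argument lists so that the date, path, and release name are not retyped by hand. The function names are hypothetical, and the only script arguments used are the ones that appear in the examples above.

```python
from typing import List


def staging_command(
    process_date: str,
    cbioportal_path: str,
    release: str,
    skip_mutations_in_cis: bool = False,
) -> List[str]:
    """Argument list for the consortium step (database_to_staging.py)."""
    cmd = ["python", "database_to_staging.py", process_date, cbioportal_path, release]
    if skip_mutations_in_cis:
        # Flag taken from the 4.1-consortium example above.
        cmd.append("--skipMutationsInCis")
    return cmd


def public_command(process_date: str, cbioportal_path: str, release: str) -> List[str]:
    """Argument list for the public step (consortium_to_public.py)."""
    return ["python", "consortium_to_public.py", process_date, cbioportal_path, release]


# Reproduce the commands for example (b).
for command in (
    staging_command("Jul-2018", "~/cbioportal/", "5.1-consortium"),
    public_command("Jan-2019", "~/cbioportal/", "5.0-public"),
):
    print(" ".join(command))
```

If you pass these lists to `subprocess.run` instead of pasting the printed lines into a shell, expand the `~` first with `os.path.expanduser`, since no shell is involved to do it for you.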

Instructions to setup batch

  1. Build an AMI that can run batch jobs. Start from this page, follow the instructions, and specify your Docker image. It is important at this stage that you time the building of your AMI, or your AMI will not be able to start batch jobs. After doing so, start an instance with the AMI and run these two commands:
sudo stop ecs
sudo rm -rf /var/lib/ecs/data/ecs_agent_data.json
  2. Rebuild the AMI above, specifying the size of the image, and put in the instance whatever you want to bind.

Adding GENIE sites

  1. Invite users to GENIE participant Team
  2. Create the CENTER (input/staging) folder and set up its ACLs
  3. Update Center Mapping table https://www.synapse.org/#!Synapse:syn10061452/tables/
  4. Add center to distribution tables: https://www.synapse.org/#!Synapse:syn10627220/tables/, https://www.synapse.org/#!Synapse:syn7268822/tables/
  5. Add users to their GENIE folder
