Skip to main content

GWAS summary statistics file validator

Project description

Summary Statistics TSV file Validator

A file validator for validating GWAS summary statistics TSV files prior to and post harmonisation using pandas_schema. The purpose is to validate files before their conversion to HDF5.

Installation

  • pip install ss-validate

Running the validator

To run the validator on a file:

  • ss-validate -f <file_to_validate.tsv> --logfile <logfile_name>

Information and errors are logged to the console and errors logged to the file specified. A console output might look like:

(INFO): Filename is good!
(INFO): Validating file...
(ERROR): Length of row 7 is: 16 instead of 15
(ERROR): Please fix the table. Some rows have different numbers of columns to the header
(INFO): Rows with different numbers of columns to the header are not validated
(ERROR): {row: 1, column: "p_value"}: "-99" was not in the range [0, 1)

The errors from the output tell us that row seven has too many columns and row one does not have a valid pvalue. If these rows are not fixed, they will later be dropped and not converted to HDF5.

Addional options

  • --drop-bad-lines : bool, default False

    Drops the the lines with errors from the file and writes it to a new file called <file_to_validate.tsv.valid>

  • --stage : {'standard', 'harmonised', 'curated'}, default 'standard'

    The stage the file is in. It is either standard format ('standard'), harmonised ('harmonised') or pre-standard in the custom curated format ('curated')

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ss-validate-0.2.0.tar.gz (6.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ss_validate-0.2.0-py3-none-any.whl (12.1 kB view details)

Uploaded Python 3

File details

Details for the file ss-validate-0.2.0.tar.gz.

File metadata

  • Download URL: ss-validate-0.2.0.tar.gz
  • Upload date:
  • Size: 6.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.14.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.35.0 CPython/3.6.5

File hashes

Hashes for ss-validate-0.2.0.tar.gz
Algorithm Hash digest
SHA256 e0878a38cdbea72b903a8ea6af99124c557fa85a08663ad0ca5d493f0d074e24
MD5 7269e569996fc8362be5f9b34e205882
BLAKE2b-256 a24680ef3c026154683f438049fa055b770526fde8d4b01779b3914874e7a8bc

See more details on using hashes here.

File details

Details for the file ss_validate-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: ss_validate-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 12.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.14.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.35.0 CPython/3.6.5

File hashes

Hashes for ss_validate-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 b17dfebb39b72934f0ddfe10c4fe2d88258c5a286fa0322bbaa1260fc22712f7
MD5 9165135b7303ff6d523658e5e77d52ff
BLAKE2b-256 7f08cee528bda2c78ac907d52a1f8f6fd4b5eb9a824482e7b61ee19119885b1f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page