GWAS-SSF Tools
A basic toolkit for reading and formatting GWAS sumstats files from the GWAS Catalog.
There are two commands, `read` and `format`.
`read` is for:
- Previewing a data file: no options
- Extracting the field headers: `-h`
- Extracting all the metadata: `-M`
- Extracting specific field, value pairs from the metadata: `-m <field name>`
`format` is for:
- Converting a minimally formatted sumstats data file to the standard format: `-s`. This is not guaranteed to return a valid standard file, because mandatory data fields could be missing in the input. It simply does the following:
  - Renames `variant_id` -> `rsid`
  - Reorders the fields
  - Converts `NA` missing values to `#NA`
  - It is memory efficient and will take approx. 30s per 1 million records
- Generating metadata for a data file: `-m`
  - Read metadata in from an existing file: `--meta-in <file>`
  - Create metadata from the GWAS Catalog (internal use, requires authenticated API): `-g`
  - Edit/add values in the metadata: `-e` with `--<FIELD>=<VALUE>`
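The `-s` conversion steps (rename, reorder, replace `NA`) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the function name `minimal_to_standard` and the column order in `PREFERRED_ORDER` are hypothetical, chosen only to show the idea. Rows are streamed one at a time, which is one way to keep memory use low on large files.

```python
import csv

# Hypothetical column order, for illustration only; the order the tool
# actually applies is not stated in this README.
PREFERRED_ORDER = [
    "chromosome",
    "base_pair_location",
    "effect_allele",
    "other_allele",
    "beta",
    "standard_error",
    "p_value",
    "rsid",
]


def minimal_to_standard(src, dst):
    """Stream a tab-separated sumstats file from src to dst, row by row."""
    reader = csv.DictReader(src, delimiter="\t")
    fieldnames = reader.fieldnames or []

    # Step 1: rename variant_id -> rsid in the header.
    renamed = ["rsid" if name == "variant_id" else name for name in fieldnames]

    # Step 2: reorder, known columns first, then any others in input order.
    ordered = [name for name in PREFERRED_ORDER if name in renamed]
    ordered += [name for name in renamed if name not in PREFERRED_ORDER]

    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    writer.writerow(ordered)

    for row in reader:
        if "variant_id" in row:
            row["rsid"] = row.pop("variant_id")
        # Step 3: convert NA missing values to #NA.
        writer.writerow(["#NA" if row[name] == "NA" else row[name] for name in ordered])
```

Both arguments are open text file objects, so the same function works with plain files, `gzip.open(..., "rt")` handles, or in-memory buffers.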
## Usage
$ gwas-ssf [OPTIONS] COMMAND [ARGS]...
Options:
- `--help`: Show this message and exit.
Commands:
- `format`: Format a sumstats file and...
- `read`: Read a sumstats file
gwas-ssf format
Format a sumstats file and create a new one. Add/edit metadata.
Usage:
$ gwas-ssf format [OPTIONS] FILENAME
Arguments:
- `FILENAME`: Input sumstats file. Must be TSV or CSV and may be gzipped [required]
Options:
- `-o, --ss-out PATH`: Output sumstats file
- `-s, --minimal2standard`: Try to convert a valid, minimally formatted file to the standard format. This assumes the file at least has `p_value` combined with rsid in `variant_id` field or `chromosome` and `base_pair_location`. Validity of the new file is not guaranteed because mandatory data could be missing from the original file. [default: False]
- `-m, --generate-metadata`: Create the metadata file [default: False]
- `--meta-out PATH`: Specify the metadata output file
- `--meta-in PATH`: Specify a metadata file to read in
- `-e, --meta-edit`: Enable metadata edit mode. Then provide params to edit in the `--<FIELD>=<VALUE>` format e.g. `--GWASID=GCST123456` to edit/add that value [default: False]
- `-g, --meta-gwas`: Populate metadata from GWAS Catalog [default: False]
- `-c, --custom-header-map`: Provide a custom header mapping using the `--<FROM>:<TO>` format e.g. `--chr:chromosome` [default: False]
- `--help`: Show this message and exit.
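The `--<FIELD>=<VALUE>` and `--<FROM>:<TO>` argument formats used by `-e` and `-c` can be illustrated with a small parser. This is a sketch only: the helper name `parse_pairs` is hypothetical and it makes no claim about how the tool itself parses these arguments.

```python
def parse_pairs(args, separator):
    """Turn arguments like '--KEY<separator>VALUE' into a dict.

    Hypothetical helper for illustration; not part of the gwas-ssf API.
    """
    pairs = {}
    for arg in args:
        if not arg.startswith("--") or separator not in arg:
            raise ValueError(f"expected --<KEY>{separator}<VALUE>, got {arg!r}")
        # Split on the first separator only, so values may contain it.
        key, value = arg[2:].split(separator, 1)
        if not key:
            raise ValueError(f"missing key in {arg!r}")
        pairs[key] = value
    return pairs


# Metadata edits in the --<FIELD>=<VALUE> format.
meta_edits = parse_pairs(["--GWASID=GCST123456"], "=")

# Custom header mapping in the --<FROM>:<TO> format.
header_map = parse_pairs(["--chr:chromosome"], ":")
```

Here `meta_edits` maps `GWASID` to `GCST123456` and `header_map` maps `chr` to `chromosome`, matching the two examples given in the options above.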
gwas-ssf read
Read (preview) a sumstats file
Usage:
$ gwas-ssf read [OPTIONS] FILENAME
Arguments:
- `FILENAME`: Input sumstats file [required]
Options:
- `-h, --get-header`: Just return the headers of the file [default: False]
- `--meta-in PATH`: Specify a metadata file to read in, defaulting to -meta.yaml
- `-M, --get-all-metadata`: Return all metadata [default: False]
- `-m, --get-metadata TEXT`: Get metadata for the specified fields e.g. `-m genomeAssembly -m isHarmonised`
- `--help`: Show this message and exit.
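For readers who want to do something similar to `read -h` or the default preview from their own scripts, the following sketch uses only the Python standard library. The function names `open_text`, `get_header` and `preview` are hypothetical and are not part of this package; gzip detection by magic bytes is an assumption about one reasonable approach, not a description of what the tool does.

```python
import csv
import gzip
from itertools import islice


def open_text(path):
    """Open a plain or gzipped file in text mode (gzip detected by magic bytes)."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"
    if is_gzip:
        return gzip.open(path, "rt", newline="")
    return open(path, "r", newline="")


def get_header(path, delimiter="\t"):
    """Return the field names from the first line, or [] for an empty file."""
    with open_text(path) as handle:
        return next(csv.reader(handle, delimiter=delimiter), [])


def preview(path, n=5, delimiter="\t"):
    """Return the header plus the first n data rows without reading the whole file."""
    with open_text(path) as handle:
        return list(islice(csv.reader(handle, delimiter=delimiter), n + 1))
```

Pass `delimiter=","` for CSV input.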
TODO:
- Installation/distribution docs
- Transformation features
- update GWAS API