Seashell — Genomic data, compressed and queryable

Seashell CLI

Command-line tool for querying and managing genomic data on Seashell.

Install

pip install seashell-cli

Quick Start

seashell

You'll be prompted for your API key (from your institution admin), username, and password. After login, you're in an interactive shell. Unless shown as a shell command, every example below is typed directly at the seashell> prompt, so you can copy any line and paste it.

LIST PATIENTS
FIND VARIANTS WHERE patient=NA12878 AND gene=BRCA1
EXPORT PATIENT NA12878 FORMAT CRAM

Single Query Mode

For one-off queries from a shell script, pass the query as a string argument:

seashell "FIND PATIENTS WHERE gene=BRCA1 AND significance=pathogenic"
seashell "COUNT VARIANTS WHERE patient=NA12878"
seashell --format json "LIST PATIENTS"
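
The same single-query form can be driven from a Python script. The sketch below is a minimal illustration and not part of seashell-cli: build_argv and run_query are hypothetical helper names, and it assumes that the CLI exits non-zero on failure and that --format goes before the query, as in the example above. How credentials are supplied in non-interactive mode is not covered here.

```python
import shlex
import subprocess
from typing import List, Optional


def build_argv(query: str, output_format: Optional[str] = None) -> List[str]:
    """Build the argument list for one single-query invocation."""
    argv = ["seashell"]
    if output_format is not None:
        # Assumption: --format precedes the query, as in the README example.
        argv += ["--format", output_format]
    argv.append(query)  # the whole query is passed as ONE argument
    return argv


def run_query(query: str, output_format: Optional[str] = None) -> str:
    """Run the query and return raw stdout.

    Assumption: the CLI signals failure with a non-zero exit code,
    in which case subprocess raises CalledProcessError.
    """
    result = subprocess.run(
        build_argv(query, output_format),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Print the equivalent shell command instead of running it.
    print(shlex.join(build_argv("LIST PATIENTS", output_format="json")))
```

Passing the arguments as a list avoids shell quoting problems, since the query contains spaces and must reach the CLI as a single argument.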

Commands

Variant queries

Command                                                     Description
LIST PATIENTS                                               List all patients in your institution
LIST GENES                                                  List all gene symbols
LIST ANNOTATIONS                                            Loaded reference DB versions (gnomAD / dbSNP / constraint)
COUNT VARIANTS WHERE patient=NA12878                        Count a patient's variants (sub-millisecond)
COUNT PATIENTS WHERE gene=BRCA1                             Count patients matching criteria
FIND VARIANTS WHERE patient=NA12878 AND gene=BRCA1          Variants in a gene for one patient
FIND PATIENTS WHERE gene=BRCA1 AND significance=pathogenic  Carriers of a variant
FIND SIMILAR TO patient=NA12878                             Genetically similar patients
COMPARE PATIENTS NA12878 AND HG00096                        Jaccard similarity between two patients
DIFF PATIENTS NA12878 AND HG00096                           Exact variant differences
PCA PATIENTS                                                Principal component analysis
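
COMPARE reports a Jaccard similarity and DIFF reports exact variant differences. The sketch below illustrates both ideas on toy data with plain Python sets. It shows the textbook definitions only and is not Seashell's implementation: the variant tuples are made up, the function names are hypothetical, and how Seashell defines a patient's variant set (for example, whether genotype is considered) is not stated here.

```python
from typing import Set, Tuple

# A variant is identified here as (chrom, pos, ref, alt); this key is an assumption.
Variant = Tuple[str, int, str, str]


def jaccard(a: Set[Variant], b: Set[Variant]) -> float:
    """Jaccard similarity |A intersect B| / |A union B|; 0.0 if both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def diff(a: Set[Variant], b: Set[Variant]) -> Tuple[Set[Variant], Set[Variant]]:
    """Return (variants only in a, variants only in b)."""
    return a - b, b - a


# Toy data, not real genotypes.
patient_a = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")}
patient_b = {("chr2", 200, "C", "T"), ("chr3", 300, "G", "A"), ("chr4", 400, "T", "C")}

print(jaccard(patient_a, patient_b))  # 2 shared out of 4 distinct variants -> 0.5
only_a, only_b = diff(patient_a, patient_b)
print(sorted(only_a), sorted(only_b))
```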

Annotation queries (new in 0.1.7)

Filter variants by population frequency, predicted consequence, dbSNP membership, and gene constraint. Every variant is automatically joined against gnomAD v4.1, dbSNP, and gnomAD constraint at query time. Counts at the canonical thresholds (the rows marked as fast path below) return in under a millisecond.

Command                                                                                   Description
COUNT VARIANTS WHERE patient=NA12878 AND gnomad_af<0.001                                  Rare variants (sub-millisecond fast path)
COUNT VARIANTS WHERE patient=NA12878 AND gnomad_af<0.0001                                 Ultra-rare variants (sub-millisecond fast path)
COUNT VARIANTS WHERE patient=NA12878 AND lof=true                                         Loss-of-function variants (sub-millisecond fast path)
COUNT VARIANTS WHERE patient=NA12878 AND consequence=missense_variant                     Missense variants (sub-millisecond fast path)
COUNT VARIANTS WHERE patient=NA12878 AND novel=true                                       Variants not in dbSNP
FIND VARIANTS WHERE patient=NA12878 AND gnomad_af<0.001 AND consequence=missense_variant  Rare missense, the canonical rare-disease query
FIND VARIANTS WHERE patient=NA12878 AND lof=true AND loeuf<0.35                           High-impact LoF in constrained genes
FIND VARIANTS WHERE patient=NA12878 AND rsid=rs334                                        Lookup by dbSNP rsID

Filter keys: gnomad_af, gnomad_popmax, consequence, lof, impact, rsid, novel, pli, loeuf. All numeric filters support the operators <, >, <=, >=, =, and !=. When annotations are loaded, result rows include gene, consequence, hgvs_c, hgvs_p, transcript, gnomad_af, gnomad_popmax, rsid, pli, and loeuf. See https://seashell.bio/docs (Developer → Annotation queries) for the full reference.
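
When queries are generated by a script, it can help to assemble the WHERE clause from the filter keys and operators listed above and reject typos before anything is sent. The sketch below is a hypothetical helper, not part of seashell-cli: where_clause is an invented name, values are passed as strings so they appear in the query exactly as written, and the helper does not check which keys are numeric.

```python
from typing import Iterable, Tuple

# Filter keys and comparison operators as listed in this README.
FILTER_KEYS = {
    "gnomad_af", "gnomad_popmax", "consequence", "lof", "impact",
    "rsid", "novel", "pli", "loeuf",
}
OPERATORS = {"<", ">", "<=", ">=", "=", "!="}


def where_clause(patient: str, conditions: Iterable[Tuple[str, str, str]]) -> str:
    """Join a patient filter and (key, operator, value) conditions with AND."""
    parts = [f"patient={patient}"]
    for key, op, value in conditions:
        if key not in FILTER_KEYS:
            raise ValueError(f"unknown filter key: {key!r}")
        if op not in OPERATORS:
            raise ValueError(f"unknown operator: {op!r}")
        parts.append(f"{key}{op}{value}")
    return " AND ".join(parts)


# Rebuilds the rare-missense query from the table above.
query = "FIND VARIANTS WHERE " + where_clause(
    "NA12878",
    [("gnomad_af", "<", "0.001"), ("consequence", "=", "missense_variant")],
)
print(query)
```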

Sequencing QC

Command                                   Description
COVERAGE PATIENT id REGION chr:start-end  Per-base read depth for a region: mean, min/max, and % above 10x/20x/30x
QC PATIENT id                             Combined read-stats summary (mapped/unmapped/duplicates/properly paired, mean MAPQ, mean insert size)
FLAGSTAT PATIENT id                       Read-flag summary report; output format matches samtools flagstat
INSERT_SIZE PATIENT id                    Insert-size distribution: pair count, mean, std, median, mode, MAD, percentiles
CYCLE_QUALITY PATIENT id                  Per-cycle base-quality decay across the read length
PILEUP PATIENT id POSITION chr:pos        Per-base pileup at one position
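
To make the COVERAGE summary concrete, the sketch below computes the same kind of figures (mean, min/max, and percentage of bases at each depth threshold) from a list of per-base depths. It is an illustration on made-up numbers, not Seashell's implementation: summarize_depth is a hypothetical name, and treating "above 10x" as depth >= 10 is an assumption.

```python
from statistics import mean
from typing import Dict, Sequence


def summarize_depth(
    depths: Sequence[int], thresholds: Sequence[int] = (10, 20, 30)
) -> Dict[str, float]:
    """Summarize per-base read depth over a region."""
    if not depths:
        raise ValueError("region has no positions")
    summary: Dict[str, float] = {
        "mean": mean(depths),
        "min": min(depths),
        "max": max(depths),
    }
    for t in thresholds:
        # Assumption: a base counts toward the threshold when depth >= t.
        covered = sum(1 for d in depths if d >= t)
        summary[f"pct_ge_{t}x"] = 100.0 * covered / len(depths)
    return summary


# Toy depths for a 10-base region.
depths = [5, 12, 18, 25, 31, 40, 22, 9, 15, 33]
print(summarize_depth(depths))
```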

Family genetics & cohort QC

Command                                Description
SEXCHECK PATIENT id                    Infer biological sex (XX / XY / XXY / X0 / XYY) from chrX and chrY normalized coverage
KINSHIP COHORT [UNEXPECTED] [LIMIT N]  Pairwise relatedness pre-screen across the cohort
KINSHIP TRIO mom dad child             Validate a declared family trio with sample-swap warnings
CONTAMINATION PATIENT id               Sample-swap and contamination pre-screen (practical LoD ~3-5%)
ANCESTRY PATIENT id                    Predict super-population (EUR/EAS/SAS/AFR/AMR) against the bundled 1000 Genomes panel
MENDELIAN TRIO mom dad child           De novo + inheritance partition for a declared trio
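
The idea behind a de novo versus inherited partition can be shown with a per-site Mendelian check: a child's genotype is consistent when one allele can come from each parent. The sketch below is a simplified illustration for autosomal diploid sites on toy genotypes, not Seashell's implementation; the function names and the three category labels are invented for the example.

```python
from typing import Tuple

Genotype = Tuple[str, str]  # two alleles at one autosomal site


def is_mendelian_consistent(mom: Genotype, dad: Genotype, child: Genotype) -> bool:
    """True if one child allele can come from mom and the other from dad."""
    a, b = child
    return (a in mom and b in dad) or (b in mom and a in dad)


def classify_site(mom: Genotype, dad: Genotype, child: Genotype) -> str:
    """Label one site as consistent, de_novo_candidate, or mendelian_error."""
    if is_mendelian_consistent(mom, dad, child):
        return "consistent"
    parental_alleles = set(mom) | set(dad)
    if any(allele not in parental_alleles for allele in child):
        # The child carries an allele that neither parent has.
        return "de_novo_candidate"
    # All alleles exist in the parents, but not in a transmissible combination.
    return "mendelian_error"


# Arguments follow the mom, dad, child order used by the TRIO commands.
print(classify_site(("A", "A"), ("A", "G"), ("A", "G")))  # consistent
print(classify_site(("A", "A"), ("A", "A"), ("A", "G")))  # de_novo_candidate
print(classify_site(("A", "A"), ("A", "G"), ("G", "G")))  # mendelian_error
```

In real data, apparent inconsistencies are often genotyping errors or sample swaps rather than true de novo events, which is why the output is labeled a candidate.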

Data management

Command                                                     Description
UPLOAD PATIENT id CRAM s3://path.cram VCF s3://path.vcf.gz  Upload pre-aligned CRAM/BAM
UPLOAD PATIENT id FASTQ s3://R1.fastq.gz s3://R2.fastq.gz   Upload raw FASTQ
UPLOAD BATCH s3://manifest.json                             Batch upload from manifest
EXPORT PATIENT id FORMAT CRAM                               Export as CRAM (or BAM)
DELETE PATIENT id                                           Remove a patient (admin only)
help                                                        Show all commands

Requirements

  • Python 3.8+
  • A Seashell API key (contact your institution admin)

Documentation

https://seashell.bio/docs
