Seashell — Genomic data, compressed and queryable
Seashell CLI
Command-line tool for querying and managing genomic data on Seashell.
Install
pip install seashell-cli
Quick Start
seashell
You'll be prompted for your API key (from your institution admin), username, and password. After login, you're in an interactive shell. Every example below is typed directly at the seashell> prompt — copy any line and paste it.
LIST PATIENTS
FIND VARIANTS WHERE patient=NA12878 AND gene=BRCA1
EXPORT PATIENT NA12878 FORMAT CRAM
Single Query Mode
For one-off queries from a shell script, pass the query as a string argument:
seashell "FIND PATIENTS WHERE gene=BRCA1 AND significance=pathogenic"
seashell "COUNT VARIANTS WHERE patient=NA12878"
seashell --format json "LIST PATIENTS"
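When single-query mode is driven from Python rather than a shell script, the same invocations can be wrapped with `subprocess`. The following is a minimal sketch, not part of the Seashell CLI itself. It assumes the `seashell` executable is on `PATH`, that it can authenticate without an interactive prompt in this mode, and that it exits with a non-zero status on failure. The helper names `build_argv`, `run_query`, and `run_json_query` are hypothetical, and because the JSON output schema is not described here, the parsed value is returned as-is.

```python
import json
import subprocess
from typing import Any, List, Optional


def build_argv(query: str, fmt: Optional[str] = None) -> List[str]:
    """Build the argument list for one single-query invocation."""
    argv = ["seashell"]
    if fmt is not None:
        # --format is the only flag shown in the examples above.
        argv += ["--format", fmt]
    # The whole query travels as ONE argument, so no shell quoting is needed.
    argv.append(query)
    return argv


def run_query(query: str, fmt: Optional[str] = None) -> str:
    """Run one query and return its stdout as text.

    Assumption: a failed query produces a non-zero exit status, in which
    case subprocess raises CalledProcessError.
    """
    result = subprocess.run(
        build_argv(query, fmt),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def run_json_query(query: str) -> Any:
    """Run a query with --format json and parse the output.

    The structure of the parsed value is not documented here, so callers
    should inspect it before relying on specific keys.
    """
    return json.loads(run_query(query, fmt="json"))
```

Passing the argument list directly (rather than a single shell string) avoids quoting problems with queries that contain spaces or `<` and `>` operators.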
Commands
Variant queries
| Command | Description |
|---|---|
| LIST PATIENTS | List all patients in your institution |
| LIST GENES | List all gene symbols |
| LIST ANNOTATIONS | Loaded reference DB versions (gnomAD / dbSNP / constraint) |
| COUNT VARIANTS WHERE patient=NA12878 | Count a patient's variants (sub-millisecond) |
| COUNT PATIENTS WHERE gene=BRCA1 | Count patients matching criteria |
| FIND VARIANTS WHERE patient=NA12878 AND gene=BRCA1 | Variants in a gene for one patient |
| FIND PATIENTS WHERE gene=BRCA1 AND significance=pathogenic | Carriers of a variant |
| FIND SIMILAR TO patient=NA12878 | Genetically similar patients |
| COMPARE PATIENTS NA12878 AND HG00096 | Jaccard similarity between two patients |
| DIFF PATIENTS NA12878 AND HG00096 | Exact variant differences |
| PCA PATIENTS | Principal component analysis |
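To make the COMPARE and DIFF rows concrete, here is a small sketch of Jaccard similarity and set difference over two variant sets. It illustrates the textbook definitions only; how Seashell represents variants or computes these results internally is not described here. The `(chrom, pos, ref, alt)` key and the function names are assumptions chosen for the example.

```python
from typing import Set, Tuple

# Assumed variant key for illustration: (chromosome, position, ref, alt).
Variant = Tuple[str, int, str, str]


def jaccard(a: Set[Variant], b: Set[Variant]) -> float:
    """Jaccard similarity: |A intersect B| / |A union B|."""
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets score 0.0.
        return 0.0
    return len(a & b) / len(union)


def diff(a: Set[Variant], b: Set[Variant]) -> Tuple[Set[Variant], Set[Variant]]:
    """Return (variants only in a, variants only in b)."""
    return a - b, b - a


# Toy data, not real genotypes.
patient_a = {("chr1", 100, "A", "T"), ("chr2", 200, "C", "G"), ("chr17", 500, "G", "A")}
patient_b = {("chr1", 100, "A", "T"), ("chr2", 200, "C", "G"), ("chr3", 300, "T", "C")}

similarity = jaccard(patient_a, patient_b)  # 2 shared out of 4 distinct variants
only_a, only_b = diff(patient_a, patient_b)
```

A similarity of 1.0 means the two sets are identical, and 0.0 means they share nothing.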
Annotation queries (new in 0.1.7)
Filter variants by population frequency, predicted consequence, dbSNP membership, and gene constraint. At query time, every variant is joined automatically against gnomAD v4.1, dbSNP, and gnomAD constraint data. Queries at the canonical thresholds (the rows marked "fast path" below) return in under a millisecond.
| Command | Description |
|---|---|
| COUNT VARIANTS WHERE patient=NA12878 AND gnomad_af<0.001 | Rare variants (sub-millisecond fast path) |
| COUNT VARIANTS WHERE patient=NA12878 AND gnomad_af<0.0001 | Ultra-rare variants (sub-millisecond fast path) |
| COUNT VARIANTS WHERE patient=NA12878 AND lof=true | Loss-of-function variants (sub-millisecond fast path) |
| COUNT VARIANTS WHERE patient=NA12878 AND consequence=missense_variant | Missense variants (sub-millisecond fast path) |
| COUNT VARIANTS WHERE patient=NA12878 AND novel=true | Variants not in dbSNP |
| FIND VARIANTS WHERE patient=NA12878 AND gnomad_af<0.001 AND consequence=missense_variant | Rare missense: the canonical rare-disease query |
| FIND VARIANTS WHERE patient=NA12878 AND lof=true AND loeuf<0.35 | High-impact LoF in constrained genes |
| FIND VARIANTS WHERE patient=NA12878 AND rsid=rs334 | Lookup by dbSNP rsID |
Filter keys: gnomad_af, gnomad_popmax, consequence, lof, impact, rsid, novel, pli, loeuf. All numeric filters support < > <= >= = !=. Result rows include gene, consequence, hgvs_c, hgvs_p, transcript, gnomad_af, gnomad_popmax, rsid, pli, and loeuf when annotations are loaded. See https://seashell.bio/docs (Developer → Annotation queries) for full reference.
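When queries are generated by a script, it can help to validate filter keys and operators before sending them. The sketch below assembles a `FIND VARIANTS` or `COUNT VARIANTS` string from the filter keys and operators listed above. It is an illustration, not part of the CLI: `build_variant_query` is a hypothetical helper, the split into numeric and non-numeric keys is inferred from the key names, and restricting non-numeric keys to `=` is a conservative assumption based on the examples.

```python
from typing import Iterable, Tuple, Union

FILTER_KEYS = {
    "gnomad_af", "gnomad_popmax", "consequence", "lof",
    "impact", "rsid", "novel", "pli", "loeuf",
}
OPERATORS = {"<", ">", "<=", ">=", "=", "!="}
# Assumption: these are the numeric keys (inferred from their names).
NUMERIC_KEYS = {"gnomad_af", "gnomad_popmax", "pli", "loeuf"}

Value = Union[str, bool, int, float]
Filter = Tuple[str, str, Value]


def format_value(value: Value) -> str:
    """Render booleans as true/false, matching the examples (lof=true)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    # Note: very small floats render in scientific notation (1e-05);
    # pass such thresholds as strings to control the exact text.
    return str(value)


def build_variant_query(verb: str, patient: str, filters: Iterable[Filter]) -> str:
    """Build '<verb> VARIANTS WHERE patient=... AND key<op>value ...'."""
    if verb not in {"FIND", "COUNT"}:
        raise ValueError(f"unsupported verb: {verb}")
    clauses = [f"patient={patient}"]
    for key, op, value in filters:
        if key not in FILTER_KEYS:
            raise ValueError(f"unknown filter key: {key}")
        if op not in OPERATORS:
            raise ValueError(f"unknown operator: {op}")
        # Conservative assumption: only '=' is used with non-numeric keys.
        if key not in NUMERIC_KEYS and op != "=":
            raise ValueError(f"operator {op} not allowed for {key} in this sketch")
        clauses.append(f"{key}{op}{format_value(value)}")
    return f"{verb} VARIANTS WHERE " + " AND ".join(clauses)


rare_missense = build_variant_query(
    "FIND", "NA12878",
    [("gnomad_af", "<", 0.001), ("consequence", "=", "missense_variant")],
)
```

The resulting string can be pasted at the seashell> prompt or passed as the single-query argument.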
Sequencing QC
| Command | Description |
|---|---|
| COVERAGE PATIENT id REGION chr:start-end | Per-base read depth for a region: mean, min/max, and % above 10x/20x/30x |
| QC PATIENT id | Combined read-stats summary (mapped/unmapped/duplicates/properly paired, mean MAPQ, mean insert size) |
| FLAGSTAT PATIENT id | Read-flag summary report; output format matches samtools flagstat |
| INSERT_SIZE PATIENT id | Insert-size distribution: pair count, mean, std, median, mode, MAD, percentiles |
| CYCLE_QUALITY PATIENT id | Per-cycle base-quality decay across the read length |
| PILEUP PATIENT id POSITION chr:pos | Per-base pileup at one position |
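For readers unfamiliar with the coverage summary, the sketch below computes the same kinds of statistics (mean, min/max, and the percentage of bases at each depth threshold) from a list of per-base depths. It is a generic illustration, not Seashell's implementation. The function name and output keys are hypothetical, and "above 10x" is interpreted here as depth greater than or equal to 10, which is an assumption.

```python
from typing import Dict, Sequence


def summarize_depth(
    depths: Sequence[int],
    thresholds: Sequence[int] = (10, 20, 30),
) -> Dict[str, float]:
    """Summarize per-base read depth over a region."""
    if not depths:
        raise ValueError("region has no positions")
    n = len(depths)
    summary: Dict[str, float] = {
        "mean": sum(depths) / n,
        "min": float(min(depths)),
        "max": float(max(depths)),
    }
    for t in thresholds:
        # Assumption: a base counts toward the threshold when depth >= t.
        covered = sum(1 for d in depths if d >= t)
        summary[f"pct_ge_{t}x"] = 100.0 * covered / n
    return summary


# Toy depths for a 10-base region.
example = summarize_depth([5, 12, 25, 31, 40, 8, 22, 30, 15, 19])
```

For the toy region, 8 of 10 bases reach 10x, 5 reach 20x, and 3 reach 30x.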
Family genetics & cohort QC
| Command | Description |
|---|---|
| SEXCHECK PATIENT id | Infer biological sex (XX / XY / XXY / X0 / XYY) from chrX and chrY normalized coverage |
| KINSHIP COHORT [UNEXPECTED] [LIMIT N] | Pairwise relatedness pre-screen across the cohort |
| KINSHIP TRIO mom dad child | Validate a declared family trio with sample-swap warnings |
| CONTAMINATION PATIENT id | Sample-swap and contamination pre-screen (practical LoD ~3-5%) |
| ANCESTRY PATIENT id | Predict super-population (EUR/EAS/SAS/AFR/AMR) against the bundled 1000 Genomes panel |
| MENDELIAN TRIO mom dad child | De novo + inheritance partition for a declared trio |
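The idea behind a trio partition can be sketched with plain set operations: each of the child's variants is classified by which parents also carry it. This is a deliberately simplified illustration and not how MENDELIAN TRIO is implemented. It looks only at presence or absence, ignores genotype, coverage, and call quality, and the function name, variant strings, and bucket names are hypothetical.

```python
from typing import Dict, Set


def partition_child_variants(
    mom: Set[str], dad: Set[str], child: Set[str]
) -> Dict[str, Set[str]]:
    """Partition the child's variants by which parents carry them."""
    return {
        "from_both": child & mom & dad,
        "maternal_only": (child & mom) - dad,
        "paternal_only": (child & dad) - mom,
        # Seen in the child but in neither parent: candidates only, since
        # a missed call in a parent looks identical in this sketch.
        "de_novo_candidates": child - mom - dad,
    }


# Toy variant identifiers, not real calls.
mom = {"chr1:100:A>T", "chr2:200:C>G"}
dad = {"chr2:200:C>G", "chr3:300:T>C"}
child = {"chr1:100:A>T", "chr2:200:C>G", "chr3:300:T>C", "chr4:400:G>A"}

buckets = partition_child_variants(mom, dad, child)
```

The four buckets are disjoint and together cover every variant in the child's set.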
Data management
| Command | Description |
|---|---|
| UPLOAD PATIENT id CRAM s3://path.cram VCF s3://path.vcf.gz | Upload pre-aligned CRAM/BAM |
| UPLOAD PATIENT id FASTQ s3://R1.fastq.gz s3://R2.fastq.gz | Upload raw FASTQ |
| UPLOAD BATCH s3://manifest.json | Batch upload from manifest |
| EXPORT PATIENT id FORMAT CRAM | Export as CRAM (or BAM) |
| DELETE PATIENT id | Remove a patient (admin only) |
| help | Show all commands |
Requirements
- Python 3.8+
- A Seashell API key (contact your institution admin)
Documentation