An SQL-based solution for large-scale genomic analysis

Project description

pysequila is the Python entry point to SeQuiLa, an ANSI-SQL compliant solution for efficient processing of sequencing reads and querying of genomic intervals, built on top of Apache Spark. Range joins, depth-of-coverage and pileup computations are the bread and butter of NGS analysis, but the high volume of data makes them run very slowly or even fail to complete.
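To illustrate why range joins become expensive at scale, here is a minimal plain-Python sketch of a nested-loop interval join. It is not SeQuiLa's implementation or API: the `Interval` layout, `overlaps`, and `naive_range_join` are hypothetical names chosen for this example, and closed coordinates are an assumption.

```python
from typing import List, Tuple

# Hypothetical layout for this sketch: (contig, start, end), closed coordinates.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True when both intervals are on the same contig and share a position."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def naive_range_join(
    reads: List[Interval], targets: List[Interval]
) -> List[Tuple[Interval, Interval]]:
    """Compare every read with every target: len(reads) * len(targets) checks."""
    return [(r, t) for r in reads for t in targets if overlaps(r, t)]


reads = [("chr1", 100, 150), ("chr1", 140, 190), ("chr2", 10, 60)]
targets = [("chr1", 145, 160), ("chr2", 100, 200)]
pairs = naive_range_join(reads, targets)
print(pairs)
```

The number of comparisons grows with the product of the two input sizes, which is why a quadratic join of this kind stops being practical on whole-genome read sets.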

Requirements

  • Python 3.7

Features

  • custom data sources for bioinformatics file formats (BAM, CRAM, VCF)

  • depth of coverage calculations

  • pileup calculations

  • reads filtering

  • efficient range joins

  • other utility functions
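As a conceptual illustration of what a depth-of-coverage calculation produces, the sketch below counts how many reads cover each position on a single contig. It is plain Python rather than pysequila code; the function name `depth_of_coverage` and the closed `(start, end)` read representation are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def depth_of_coverage(reads: List[Tuple[int, int]]) -> Dict[int, int]:
    """Per-position depth for reads given as closed (start, end) intervals."""
    # Record +1 where a read starts and -1 just past where it ends.
    deltas: Dict[int, int] = defaultdict(int)
    for start, end in reads:
        deltas[start] += 1
        deltas[end + 1] -= 1

    # Sweep the change points in order, keeping a running depth.
    depth: Dict[int, int] = {}
    current = 0
    positions = sorted(deltas)
    for pos, next_pos in zip(positions, positions[1:]):
        current += deltas[pos]
        if current > 0:
            for p in range(pos, next_pos):
                depth[p] = current
    return depth


coverage = depth_of_coverage([(1, 4), (3, 6)])
print(coverage)
```

Positions 3 and 4 are covered by both reads, so their depth is 2; every other covered position has depth 1.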

Setup

$ python -m pip install --user pysequila
or
(venv)$ python -m pip install pysequila

Usage

$ python
>>> from pyspark.sql import SparkSession
>>> app_name = 'pysequila-example'  # placeholder; use any application name
>>> spark = SparkSession \
...     .builder \
...     .appName(app_name) \
...     .getOrCreate()
>>> from sequila import SequilaSession
>>> ss = SequilaSession(spark)
>>> ss.sql("SELECT * FROM coverage('reads', 'NA12878')")

ChangeLog

0.1.0 (2020-09-16)

  • Initial release.

Download files

Source Distribution

pysequila-0.1.2.tar.gz (4.9 kB)

Uploaded Source

Built Distribution

pysequila-0.1.2-py2.py3-none-any.whl (3.1 kB)

Uploaded Python 2, Python 3

File details

Details for the file pysequila-0.1.2.tar.gz.

File metadata

  • Download URL: pysequila-0.1.2.tar.gz
  • Upload date:
  • Size: 4.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.7.9

File hashes

Hashes for pysequila-0.1.2.tar.gz
Algorithm Hash digest
SHA256 e861ecaadc88f5cd9b640e8cc37c2266b5ae076cd3dd51030ea5730f35952572
MD5 d400dc474f7e056e29cbfb072633a4db
BLAKE2b-256 0ee02ad593488f401121bfb3f789ffceb0bc3f453af4568b04956e6fed9f9ede

File details

Details for the file pysequila-0.1.2-py2.py3-none-any.whl.

File metadata

  • Download URL: pysequila-0.1.2-py2.py3-none-any.whl
  • Upload date:
  • Size: 3.1 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.7.9

File hashes

Hashes for pysequila-0.1.2-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 ea938b0a3dd6396054429adb32428bd1840f5a304daeb9359df7f91c307c6e64
MD5 e8e02fa8273d82a5d324e9bb6e2bd99b
BLAKE2b-256 3fa8289489ff76e8e9b67c31ac792caad8ccf5e781cdf6e21cd917a80cee0833
