Pipelines for genomics analysis

SeqNado Pipeline

A Snakemake-based pipeline for processing ChIP-seq, ATAC-seq and RNA-seq data, as well as short-read WGS data for SNP calling.

Installation

  1. Create a basic conda environment (with pip, to install Python packages) and activate it.

        conda create -n seqnado pip
        conda activate seqnado
    
  2. Install the pipeline. Three options:

    a) Install the package from pip (recommended)

        pip install seqnado
    

    b) Clone the repository and install directly.

        git clone https://github.com/alsmith151/SeqNado.git
        cd SeqNado
        pip install .
    

    c) Install from GitHub directly

        pip install git+https://github.com/alsmith151/SeqNado.git
    
  3. If you intend to use a cluster (e.g. SLURM), add the path to the DRMAA library to your .bashrc:

        # Access to the DRMAA library: https://en.wikipedia.org/wiki/DRMAA
        echo "export DRMAA_LIBRARY_PATH=/<full-path>/libdrmaa.so" >> ~/.bashrc
    
        # For CBRG users the command to use is:
        echo "export DRMAA_LIBRARY_PATH=/usr/lib64/libdrmaa.so" >> ~/.bashrc
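
Because the export is only appended to ~/.bashrc, it takes effect in new shells (or after running `source ~/.bashrc`). The following is a minimal sketch for confirming that the variable is set and points at a readable file before submitting jobs. `check_drmaa` is a hypothetical helper written for this example; it is not part of SeqNado.

```shell
# check_drmaa: hypothetical helper, not a SeqNado command.
# Reports whether DRMAA_LIBRARY_PATH is set and points at a readable file.
check_drmaa() {
    local lib="${DRMAA_LIBRARY_PATH:-}"
    if [ -z "$lib" ]; then
        echo "DRMAA_LIBRARY_PATH is not set"
        return 1
    fi
    if [ ! -r "$lib" ]; then
        echo "DRMAA_LIBRARY_PATH points to a missing or unreadable file: $lib"
        return 1
    fi
    echo "DRMAA library found: $lib"
}
```

Calling `check_drmaa` in a fresh shell returns a non-zero status if the library cannot be found, so it can be used as a guard in a submission script.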
    

Running the pipeline

  1. Set up the project directory

    In the parent directory of the desired working directory, run one of the following commands:

        seqnado-config atac # ATAC-seq samples
        seqnado-config chip # ChIP-seq/ChIPMentation
        seqnado-config rna # RNA-seq - Not fully tested
        seqnado-config snp # SNP calling - Not fully tested
    
    

    This walks you through a series of questions and creates a new project directory, a config file and a sample sheet for you to edit.

    Change into the newly created directory and inspect the config file.

  2. Copy or link FASTQ files into the fastq directory

    Copy:
    cp PATH_TO_FASTQ/example_R1.fastq.gz fastq/

    Symlink (be sure to use the absolute path of the source file):
    ln -s /ABSOLUTE_PATH_TO_FASTQ/example_R1.fastq.gz fastq/
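
When there are many files, linking them one at a time is error-prone. The sketch below links every `*.fastq.gz` file from a source directory into `fastq/`, resolving the source to an absolute path first so that the links stay valid. `link_fastqs` is a hypothetical helper, and the `.fastq.gz` suffix and the default destination name are assumptions taken from the examples in this README.

```shell
# link_fastqs SRC_DIR [DEST_DIR]: hypothetical helper, not a SeqNado command.
# Symlinks every *.fastq.gz file in SRC_DIR into DEST_DIR (default: fastq).
link_fastqs() {
    local src dest fq
    src=$(cd "$1" && pwd) || return 1   # resolve SRC_DIR to an absolute path
    dest="${2:-fastq}"
    mkdir -p "$dest"
    for fq in "$src"/*.fastq.gz; do
        [ -e "$fq" ] || continue        # the glob matched nothing
        ln -s "$fq" "$dest/"
    done
}
```

For example, `link_fastqs ../raw_data` run from inside the project directory links everything from `../raw_data` into `fastq/`.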

  3. Set up the sample sheet

    There are two options for preparing a sample sheet:

    a) Using seqnado-design

        seqnado-design atac fastq/* # ATAC-seq samples
        seqnado-design chip fastq/* # ChIP-seq/ChIPMentation
        seqnado-design rna fastq/* # RNA-seq - Not fully tested
        seqnado-design snp fastq/* # SNP calling - Not fully tested
    
    

    If sample names follow the conventions below, a sample sheet will be generated for your samples:

     For ChIP-seq:
    
     * samplename1_Antibody_R1.fastq.gz
     * samplename1_Antibody_R2.fastq.gz
     * samplename1_Input_1.fastq
     * samplename1_Input_2.fastq
    
     For ATAC-seq:
    
     * sample-name-1_R1.fastq.gz
     * sample-name-1_R2.fastq.gz
     * sample-name-1_1.fastq
     * sample-name-1_2.fastq
    
     For RNA-seq:
    
     * sample-name-1_R1.fastq.gz
     * sample-name-1_R2.fastq.gz
     * sample-name-1_1.fastq
     * sample-name-1_2.fastq  
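
To check file names against these conventions before running seqnado-design, something like the sketch below can help. It only mirrors the patterns listed above (a `_R1`/`_R2` or `_1`/`_2` suffix followed by `.fastq` or `.fastq.gz`); it is not how SeqNado itself parses names, and `parse_fastq_name` is a hypothetical helper.

```shell
# parse_fastq_name FILE: hypothetical helper, not SeqNado's own parser.
# Prints "<sample> <read>" when FILE follows the naming convention.
parse_fastq_name() {
    local name stem mate sample
    name=$(basename "$1")
    stem=${name%.gz}                      # drop an optional .gz suffix
    case "$stem" in
        *.fastq) stem=${stem%.fastq} ;;
        *) echo "unrecognised: $name" >&2; return 1 ;;
    esac
    case "$stem" in
        *_R1|*_1) mate=1 ;;
        *_R2|*_2) mate=2 ;;
        *) echo "unrecognised: $name" >&2; return 1 ;;
    esac
    sample=${stem%_*}                     # strip the trailing _R1/_R2/_1/_2
    [ -n "$sample" ] || { echo "unrecognised: $name" >&2; return 1; }
    echo "$sample $mate"
}
```

For `samplename1_Antibody_R1.fastq.gz` this prints `samplename1_Antibody 1`; for a ChIP-seq file the antibody would then be the last underscore-separated field of the sample part.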
    

    b) Using a custom sample sheet.

    This is useful when IP samples cannot easily be paired with their Input control samples automatically.

    • For ChIP-seq samples you will need to create a csv or tsv file with the following columns:

      sample       antibody  fq1                               fq2                               control
      SAMPLE-NAME  ANTIBODY  SAMPLE-NAME_ANTIBODY_R1.fastq.gz  SAMPLE-NAME_ANTIBODY_R2.fastq.gz  CONTROL_SAMPLE_Input

    • For ATAC-seq, RNA-seq or SNP calling samples you will need to create a csv or tsv file with the following columns:

      sample       fq1                      fq2
      SAMPLE-NAME  SAMPLE-NAME_R1.fastq.gz  SAMPLE-NAME_R2.fastq.gz
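
As an illustration of the three-column layout, the sketch below writes a CSV for paired `_R1`/`_R2` files found in a directory. `make_sample_sheet` is a hypothetical helper; writing bare file names (rather than paths) in the fq1 and fq2 columns simply mirrors the example row above and is an assumption about what the pipeline expects.

```shell
# make_sample_sheet [FASTQ_DIR]: hypothetical helper, not a SeqNado command.
# Prints a CSV with columns sample,fq1,fq2 for paired *_R1/_R2.fastq.gz files.
make_sample_sheet() {
    local dir="${1:-fastq}" fq1 fq2 sample
    echo "sample,fq1,fq2"
    for fq1 in "$dir"/*_R1.fastq.gz; do
        [ -e "$fq1" ] || continue                 # the glob matched nothing
        fq2="${fq1%_R1.fastq.gz}_R2.fastq.gz"
        if [ ! -e "$fq2" ]; then
            echo "skipping $fq1: no R2 mate found" >&2
            continue
        fi
        sample=$(basename "$fq1" _R1.fastq.gz)
        echo "$sample,$(basename "$fq1"),$(basename "$fq2")"
    done
}
```

Redirect the output to a file of your choice, for example `make_sample_sheet fastq > samples.csv` (the file name here is illustrative), and review it by hand before running the pipeline.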
  4. Running the pipeline

    All FASTQ files present in the directory are processed by the pipeline in parallel, and the original FASTQ files are not modified. If new FASTQ files are added after the pipeline has already been run, only the new files are processed.

    After copying/linking FASTQ files into the working directory and configuring the copy of config_[assay].yml in the working directory for the current experiment, the pipeline can be run with:

    seqnado atac # ATAC-seq samples
    seqnado chip # ChIP-seq/ChIPMentation
    seqnado rna # RNA-seq - Not fully tested
    seqnado snp # SNP calling - Not fully tested
    
    • To visualise which tasks the pipeline will perform before running it:
      seqnado atac -c 1 --preset ss --dag | dot -Tpng > dag.png

    • To use all default settings (this runs on the login node only):
      seqnado atac -c NUMBER_OF_CORES

    • To use the cluster (recommended):
      seqnado atac -c NUMBER_OF_CORES --preset ss

    • To keep the pipeline running if your network connection drops:
      nohup seqnado atac make &
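
The invocations above differ only in the assay name, the core count and whether `--preset ss` is passed. A small wrapper that assembles the command line can make this explicit. `seqnado_cmd` is a hypothetical helper that only prints the command; it uses nothing beyond the flags shown in this README.

```shell
# seqnado_cmd ASSAY CORES [cluster]: hypothetical helper, not part of SeqNado.
# Prints the SeqNado command line for the given assay and core count.
seqnado_cmd() {
    local assay="$1" cores="$2" mode="${3:-local}"
    case "$assay" in
        atac|chip|rna|snp) ;;
        *) echo "unknown assay: $assay" >&2; return 1 ;;
    esac
    if [ "$mode" = "cluster" ]; then
        echo "seqnado $assay -c $cores --preset ss"   # submit jobs to the cluster
    else
        echo "seqnado $assay -c $cores"               # run on the current node
    fi
}
```

Printing the command first makes it easy to review before launching; it could then be started in the background with, for example, `nohup $(seqnado_cmd atac 8 cluster) > seqnado.log 2>&1 &`, where the log file name is an arbitrary choice.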

    Your processed data can be found in ./seqnado_output
