Pipelines for genomics analysis

SeqNado Pipeline

A Snakemake-based pipeline for processing ChIP-seq, ATAC-seq, RNA-seq and short-read WGS data (for SNP calling).

Installation

Create a basic conda environment (including pip, which is needed to install Python packages) and activate it:
```
    conda create -n seqnado pip
    conda activate seqnado
```

Install the pipeline. Three options:

a) Install the package from pip (recommended)

    pip install seqnado

b) Clone the repository and install directly.

    git clone https://github.com/alsmith151/SeqNado.git
    cd SeqNado
    pip install .

c) Install from GitHub directly

    pip install git+https://github.com/alsmith151/SeqNado.git

If you intend to use a cluster (e.g. SLURM), add the path to the DRMAA library to your .bashrc:

    # Access to the DRMAA library: https://en.wikipedia.org/wiki/DRMAA
    echo "export DRMAA_LIBRARY_PATH=/<full-path>/libdrmaa.so" >> ~/.bashrc

    # For CBRG users the command to use is:
    echo "export DRMAA_LIBRARY_PATH=/usr/lib64/libdrmaa.so" >> ~/.bashrc
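
After editing .bashrc and opening a new shell, it can be worth confirming that the variable is set and points at a real file. The following is a minimal Python sketch of such a check; the helper name `check_drmaa_library_path` is hypothetical and not part of SeqNado, and the only assumption is that `DRMAA_LIBRARY_PATH` should name an existing file.

```python
import os
from pathlib import Path


def check_drmaa_library_path(environ=None):
    """Return (ok, message) describing the DRMAA_LIBRARY_PATH setting.

    Hypothetical helper for illustration only; not part of SeqNado.
    """
    # Allow a custom mapping to be passed in so the check is easy to test.
    environ = os.environ if environ is None else environ
    value = environ.get("DRMAA_LIBRARY_PATH")
    if not value:
        return False, "DRMAA_LIBRARY_PATH is not set"
    if not Path(value).is_file():
        return False, f"DRMAA_LIBRARY_PATH points to a missing file: {value}"
    return True, f"DRMAA library found at {value}"


if __name__ == "__main__":
    ok, message = check_drmaa_library_path()
    print(message)
```

If the message reports that the variable is not set, the export line has probably not been picked up yet by the current shell.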

Running the pipeline

Setup project directory

In the parent directory of the desired working directory, run one of the following commands:
```
    seqnado-config atac # ATAC-seq samples
    seqnado-config chip # ChIP-seq/ChIPMentation
    seqnado-config rna # RNA-seq - Not fully tested
    seqnado-config snp # snp calling - Not fully tested
```
This will lead you through a series of questions and then create a new project directory, a config file and a sample sheet for you to edit.

cd into the newly made directory and inspect the config file.

Copy or link FASTQ files into the fastq directory

Copy:

    cp PATH_TO_FASTQ/example_R1.fastq.gz fastq/

Symlink (be sure to use the absolute path for symlinks):

    ln -s /ABSOLUTE_PATH_TO_FASTQ/example_R1.fastq.gz fastq/
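
When many files need linking, the same step can be scripted. The sketch below is one possible approach rather than anything shipped with SeqNado: `link_fastqs` is a hypothetical helper, and the default `*.fastq.gz` pattern and `fastq` target directory are assumptions based on the examples above. It resolves the source directory first so that every symlink uses an absolute path.

```python
from pathlib import Path


def link_fastqs(source_dir, fastq_dir="fastq", pattern="*.fastq.gz"):
    """Symlink FASTQ files from source_dir into fastq_dir using absolute paths.

    Hypothetical helper for illustration only; not part of SeqNado.
    """
    source = Path(source_dir).resolve()  # absolute path for the link targets
    target_dir = Path(fastq_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    created = []
    for fastq in sorted(source.glob(pattern)):
        link = target_dir / fastq.name
        if link.exists() or link.is_symlink():
            continue  # leave existing files and links untouched
        link.symlink_to(fastq)
        created.append(link)
    return created
```

Calling `link_fastqs("/ABSOLUTE_PATH_TO_FASTQ")` from inside the project directory would link every matching file into `fastq/`; pass `pattern="*.fastq"` for uncompressed files.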

Set-up sample sheet

There are two options for preparing a sample sheet:

a) Using seqnado-design
```
    seqnado-design atac fastq/* # ATAC-seq samples
    seqnado-design chip fastq/* # ChIP-seq/ChIPMentation
    seqnado-design rna fastq/* # RNA-seq - Not fully tested
    seqnado-design snp fastq/* # snp calling - Not fully tested
```
If sample names match the following conventions, a sample sheet will be generated for your samples:
```
 For ChIP-seq:

 * samplename1_Antibody_R1.fastq.gz
 * samplename1_Antibody_R2.fastq.gz
 * samplename1_Input_1.fastq
 * samplename1_Input_2.fastq

 For ATAC-seq:

 * sample-name-1_R1.fastq.gz
 * sample-name-1_R2.fastq.gz
 * sample-name-1_1.fastq
 * sample-name-1_2.fastq

 For RNA-seq:

 * sample-name-1_R1.fastq.gz
 * sample-name-1_R2.fastq.gz
 * sample-name-1_1.fastq
 * sample-name-1_2.fastq  
```
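
To check in advance whether your file names fit these conventions, a small script can help. The following Python sketch uses regular expressions inferred from the example names above; it is not the matching logic used by seqnado-design, which may differ, and `parse_fastq_name` is a hypothetical helper.

```python
import re

# Patterns inferred from the example file names in this README; the real
# seqnado-design matching rules may differ.
CHIP_PATTERN = re.compile(
    r"^(?P<sample>.+)_(?P<antibody>[^_]+)_R?(?P<read>[12])\.fastq(?:\.gz)?$"
)
GENERIC_PATTERN = re.compile(
    r"^(?P<sample>.+)_R?(?P<read>[12])\.fastq(?:\.gz)?$"
)


def parse_fastq_name(filename, assay):
    """Split a FASTQ file name into its parts, or return None if it does not fit.

    Hypothetical helper for illustration only; not part of SeqNado.
    """
    pattern = CHIP_PATTERN if assay == "chip" else GENERIC_PATTERN
    match = pattern.match(filename)
    return match.groupdict() if match else None


for name in ["samplename1_Antibody_R1.fastq.gz", "samplename1_Input_1.fastq"]:
    print(name, parse_fastq_name(name, "chip"))
```

A return value of `None` suggests the file name does not follow the convention, in which case a custom sample sheet may be the simpler route.
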
b) Using a custom sample sheet.

This is useful when it is difficult to match IP samples to the appropriate Input control samples automatically.

- For ChIP-seq samples, create a csv or tsv file with the following columns:

  | sample | antibody | fq1 | fq2 | control |
  | --- | --- | --- | --- | --- |
  | SAMPLE-NAME | ANTIBODY | SAMPLE-NAME_ANTIBODY_R1.fastq.gz | SAMPLE-NAME_ANTIBODY_R2.fastq.gz | CONTROL_SAMPLE_Input |

- For ATAC-seq, RNA-seq or SNP calling samples, create a csv or tsv file with the following columns:

  | sample | fq1 | fq2 |
  | --- | --- | --- |
  | SAMPLE-NAME | SAMPLE-NAME_R1.fastq.gz | SAMPLE-NAME_R2.fastq.gz |
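
A custom ChIP-seq sample sheet with the columns listed above can also be written programmatically. The sketch below uses Python's standard `csv` module; the function name `write_chip_sample_sheet` is hypothetical, and the row values are the placeholders from this README rather than real samples.

```python
import csv
import io

# Column names as listed for ChIP-seq sample sheets in this README.
CHIP_COLUMNS = ["sample", "antibody", "fq1", "fq2", "control"]


def write_chip_sample_sheet(rows, handle):
    """Write ChIP-seq rows (dicts keyed by CHIP_COLUMNS) to handle as CSV.

    Hypothetical helper for illustration only; not part of SeqNado.
    """
    writer = csv.DictWriter(handle, fieldnames=CHIP_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {
        "sample": "SAMPLE-NAME",
        "antibody": "ANTIBODY",
        "fq1": "SAMPLE-NAME_ANTIBODY_R1.fastq.gz",
        "fq2": "SAMPLE-NAME_ANTIBODY_R2.fastq.gz",
        "control": "CONTROL_SAMPLE_Input",
    }
]

# Write to an in-memory buffer here; pass an open file to save to disk.
buffer = io.StringIO()
write_chip_sample_sheet(rows, buffer)
print(buffer.getvalue())
```

For ATAC-seq, RNA-seq or SNP calling, the same approach applies with the column list reduced to `sample`, `fq1` and `fq2`.
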

Running the pipeline

All FASTQ files present in the directory will be processed by the pipeline in parallel, and the original FASTQ files will not be modified. If new FASTQ files are added after the pipeline has already been run, only the new files will be processed.

After copying/linking FASTQ files into the working directory and configuring the copy of config_[assay].yml in the working directory for the current experiment, the pipeline can be run with:
```
seqnado atac # ATAC-seq samples
seqnado chip # ChIP-seq/ChIPMentation
seqnado rna # RNA-seq - Not fully tested
seqnado snp # snp calling - Not fully tested
```
- To visualise which tasks will be performed by the pipeline before running:

      seqnado atac -c 1 --preset ss --dag | dot -Tpng > dag.png

- If using all default settings (this will run on just the login node):

      seqnado atac -c NUMBER_OF_CORES

- If you want to use the cluster (recommended):

      seqnado atac -c NUMBER_OF_CORES --preset ss

- To keep the pipeline running through network disconnections:

      nohup seqnado atac make &

Your processed data can be found in ./seqnado_output

Version 0.1, released Apr 5, 2023
