The bioinformatics tool for Mirror-seq.

What is it
==========

Mirror-seq is an assay invented by `Zymo
Research <http://zymoresearch.com>`__ to detect
`hydroxymethylation <>`__ (hmC) in genomes using `bisulfite
sequencing <>`__. This analysis tool helps biologists analyze the
sequencing data. It takes FASTQ files from a sequencer and generates a
hydroxymethylation ratio for each CpG.
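
To make the idea of a per-CpG ratio concrete, here is a minimal sketch. It
assumes the ratio is the number of reads that report an unconverted C at a
CpG divided by the total number of reads covering it, as is common for
bisulfite-based read-outs. The function name and the sample counts are
hypothetical, and the tool's actual calling logic may differ.

.. code:: python

    def hmc_ratio(c_count, t_count):
        """Return the fraction of reads reporting C, or None without coverage."""
        total = c_count + t_count
        if total == 0:
            return None  # no reads cover this CpG, so no ratio can be reported
        return c_count / float(total)

    # Hypothetical per-CpG counts: (chromosome, position, C reads, T reads)
    cpgs = [("chr1", 10469, 3, 9), ("chr1", 10471, 0, 12), ("chr1", 10484, 0, 0)]
    ratios = [(chrom, pos, hmc_ratio(c, t)) for chrom, pos, c, t in cpgs]

    for chrom, pos, ratio in ratios:
        print(chrom, pos, ratio)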

What's included
===============

We provide two ways to run the analysis. If you are new to the tool or
want a quick analysis, the **Quick Start** below is the complete
solution. It sets up the environment for you and runs a simple workflow:
trimming, alignment, and hydroxymethylation calling. Type in one
command and you will find the results in your workplace within several
hours.

If you are a bioinformatics expert, or simply eager to try different
parameters, we also provide the initial fill-in nucleotide
trimming () and the final hydroxymethylation calling () steps as
standalone programs. You can plug in your favorite QC, adapter trimming,
and alignment software with your own parameters. Please
follow the **Installation** section below for more details.

Quick Start
===========

We created a `Docker <>`__ image that bundles all dependencies, so
scientists can run the analysis on Windows, macOS, or Linux.

Install Docker
--------------

Find your OS and follow the installation instructions for
`Windows <https://docs.docker.com/windows/step_one/>`__,
`macOS <https://docs.docker.com/mac/step_one/>`__, or
`Linux <https://docs.docker.com/linux/step_one/>`__ on Docker's
official website.

Run Mirror-seq
--------------

You need to create a workplace directory (``<YOUR WORKPLACE>``) and put
the following files inside:

- Read 1 and Read 2 FASTQ files.
- Genome index (we provide a `human index <>`__; unzip the file after
  downloading).

::

    docker run -it --rm -v <YOUR WORKPLACE>:/workplace \
        zymoresearch/mirror-seq \
        -1 <READ 1 FILENAME> -2 <READ 2 FILENAME> \
        -g <GENOME INDEX FOLDER NAME> --bed
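
If you run many samples, it can help to assemble the command above in a
script. The following is a minimal sketch that uses only the options
shown in the command. The helper name ``build_command`` and the example
paths and file names are hypothetical.

.. code:: python

    import os
    import shlex

    def build_command(workplace, read1, read2, genome_dir, bed=True):
        """Assemble the ``docker run`` argument list for one sample."""
        cmd = [
            "docker", "run", "-it", "--rm",
            # Mount the workplace directory at /workplace inside the container.
            "-v", "{}:/workplace".format(os.path.abspath(workplace)),
            "zymoresearch/mirror-seq",
            "-1", read1, "-2", read2,
            "-g", genome_dir,
        ]
        if bed:
            cmd.append("--bed")
        return cmd

    # Hypothetical workplace, read files, and genome index folder.
    cmd = build_command("/data/run1", "sample_R1.fastq.gz",
                        "sample_R2.fastq.gz", "hg19")
    print(" ".join(shlex.quote(part) for part in cmd))

Passing the resulting list to ``subprocess.check_call(cmd)`` would launch
the analysis. Printing it first makes it easy to review the command before
a run that takes several hours.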

Notes:
------

Although the analysis tool is easy to run, there are several things you
need to know in order to run it smoothly:

- The alignment step is memory and CPU intensive.
  `Bismark <http://www.bioinformatics.bbsrc.ac.uk/projects/bismark/>`__,
  the aligner used by the tool, suggests at least 5 cores and more than
  16 GB of RAM.
- FASTQ files are usually several GB, even when compressed. During the
  initial trimming step, the tool may need up to 3 times the size of the
  original input in disk space. Please make sure your workplace has
  enough storage space.
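
These requirements can be checked before starting a run. The sketch below
is a hypothetical helper, not part of the tool. It compares the number of
CPU cores with the suggested minimum of 5 and checks that the workplace
has free space of at least 3 times the combined input size. It does not
check RAM, and it needs Python 3 for ``shutil.disk_usage``.

.. code:: python

    import os
    import shutil

    def check_resources(workplace, fastq_paths, min_cores=5, space_factor=3):
        """Return a list of warnings about CPU cores and free disk space."""
        warnings = []
        cores = os.cpu_count() or 1
        if cores < min_cores:
            warnings.append(
                "only {} cores available; at least {} suggested".format(
                    cores, min_cores))
        # Trimming may need up to space_factor times the input size.
        needed = space_factor * sum(os.path.getsize(p) for p in fastq_paths)
        free = shutil.disk_usage(workplace).free
        if free < needed:
            warnings.append(
                "{} bytes free in workplace; about {} needed".format(
                    free, needed))
        return warnings

    # With no input files listed, only the core count is checked.
    for message in check_resources(".", []):
        print("WARNING:", message)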

Installation
============

Dependencies
============

Python (2.7)
------------

- `NumPy <http://www.numpy.org/>`__: 1.7.0
- `pandas <http://pandas.pydata.org/>`__: 0.18.0
- `numexpr <https://github.com/pydata/numexpr>`__: 2.5.2
- `pysam <http://pysam.readthedocs.org/en/latest/api.html>`__: 0.9.0
- `cutadapt <http://cutadapt.readthedocs.org/en/stable/guide.html>`__: 1.9.1
- `PyTables <http://www.pytables.org/>`__: 3.2.2

Bioinformatics software
-----------------------

- `bedToBigBed <http://hgdownload.cse.ucsc.edu/admin/exe/>`__
- `bedSort <http://hgdownload.cse.ucsc.edu/admin/exe/>`__
- `Trim Galore! <http://www.bioinformatics.bbsrc.ac.uk/projects/trim_galore/>`__: 0.3.7
- `bowtie2 <http://bowtie-bio.sourceforge.net/bowtie2/index.shtml>`__: 2.2.6
- `Bismark <http://www.bioinformatics.bbsrc.ac.uk/projects/bismark/>`__: 0.14.5
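
When you install the components manually, a quick check that the external
programs are on your ``PATH`` can save a failed run. This is a minimal
sketch. The executable names in ``REQUIRED_TOOLS`` are assumptions based
on the software listed above and may differ on your system. It checks for
presence only, not versions, and needs Python 3 for ``shutil.which``.

.. code:: python

    import shutil

    # Assumed executable names for the software listed above.
    REQUIRED_TOOLS = ["bedToBigBed", "bedSort", "trim_galore",
                      "bowtie2", "bismark", "cutadapt"]

    def missing_tools(tools):
        """Return the names from ``tools`` that are not found on PATH."""
        return [name for name in tools if shutil.which(name) is None]

    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools found.")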
