CMSIP

Detecting differential 5hmC regions from CMS-IP sequencing data.

Source URL: https://github.com/lijinbio/cmsip

Workflow of CMSIP.

Installation

Dependencies

  • bsmap

bsmap is a component of the MOABS package. See MOABS (https://github.com/sunnyisgalaxy/moabs) for details.

Example configuration file and description

sampleinfo:
  - sampleid: TKO2PE1b2
    group: tko
    filenames:
      - TKO2PE1b2_R1.fastq.gz
  - sampleid: TKO2PE2m
    group: tko
    filenames:
      - TKO2PE2b1_R1.fastq.gz
      - TKO2PE2b1_R2.fastq.gz
  - sampleid: WTPE1b2
    group: wt
    filenames:
      - WTPE1b2_R1.fastq.gz
  - sampleid: WTPE2b2
    group: wt
    filenames:
      - WTPE2b2_R1.fastq.gz
groupinfo:
  group1: tko
  group2: wt
resultdir: result
aligninfo:
  reference: /data/jin/resource/genome/fasta/hg38/hg38.fa.gz
  spikein: /data/jin/resource/genome/fasta/mm10/mm10.fa.gz
  fastqdir: test_data
  statfile: qcstats.txt
  barplotinfo:
    outfile: qcstats_twsn_barplot.pdf
    height: 5
    width: 5
  numthreads: 20
  verbose: True
genomescaninfo:
  readextension: True
  fragsize: 100
  windowfile: result/hg38_w200.bed
  referencename: hg38
  windowsize: 200
  readscount: False
  counttablefile: counttable.txt.gz
  verbose: True
dhmrinfo:
  method: 4
  mindepth: 5
  testfile: test.txt.gz
  qthr: 1.05
  maxdistance: 0
  dhmrfile: dhmr.txt.gz
  numthreads: 20
  verbose: True

sampleinfo

This block stores sample metadata. Each entry gives a sampleid, the group the sample belongs to, and the list of FASTQ filenames for that sample.
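As a rough illustration of how such a block might be consumed, the sketch below groups samples by their group label and resolves each FASTQ filename against fastqdir. The function name `samples_by_group` and the in-memory list are assumptions made for this example; they are not part of CMSIP itself.

```python
import os
from collections import defaultdict

# Hypothetical in-memory form of the sampleinfo block (e.g. after YAML parsing).
sampleinfo = [
    {"sampleid": "TKO2PE1b2", "group": "tko", "filenames": ["TKO2PE1b2_R1.fastq.gz"]},
    {"sampleid": "WTPE1b2", "group": "wt", "filenames": ["WTPE1b2_R1.fastq.gz"]},
    {"sampleid": "WTPE2b2", "group": "wt", "filenames": ["WTPE2b2_R1.fastq.gz"]},
]

def samples_by_group(sampleinfo, fastqdir):
    """Map each group label to a list of (sampleid, [fastq paths])."""
    groups = defaultdict(list)
    for sample in sampleinfo:
        paths = [os.path.join(fastqdir, name) for name in sample["filenames"]]
        groups[sample["group"]].append((sample["sampleid"], paths))
    return dict(groups)

grouped = samples_by_group(sampleinfo, "test_data")
```

Here `grouped["wt"]` holds two samples and `grouped["tko"]` holds one, each paired with its FASTQ paths under test_data.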

groupinfo

This block specifies the comparison of interest. The alternative hypothesis is that the true difference in means between group1 and group2 is less than 0.
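To make the direction of the test concrete, here is a minimal sketch of a two-sample t statistic computed as mean(group1) minus mean(group2). It assumes the Welch (unequal variance) form, which may not be what CMSIP uses internally, and the coverage values are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group1, group2):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n1, n2 = len(group1), len(group2)
    # Squared standard errors of the two group means.
    v1, v2 = variance(group1) / n1, variance(group2) / n2
    t = (mean(group1) - mean(group2)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical normalized coverages of one window in each group.
tko = [2.0, 3.0, 4.0]
wt = [10.0, 12.0, 14.0]
t, df = welch_t(tko, wt)
```

A negative t supports the alternative that group1 has the lower mean. The one-sided p-value is the lower tail of the t distribution at t with df degrees of freedom; if SciPy is available, `scipy.stats.ttest_ind(tko, wt, equal_var=False, alternative="less")` computes it directly.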

aligninfo

Options and data information required for alignment.

  • reference

The FASTA file for the reference genome, such as hg38.fa.gz.

  • spikein

The FASTA file for the spike-in genome, such as mm10.fa.gz.

  • windowfile: hg38_w100.bed

The genome in window bins. This window bin file can be generated by using bedtools. E.g.

bedtools makewindows -g <(fetchChromSizes hg38) -w 100 > hg38_w100.bed

  • windowsize: 100

Window size for creating bins.
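The binning itself is simple to picture. The sketch below tiles each chromosome into fixed-size windows and truncates the last window at the chromosome end, which is meant to mirror the bedtools command above. The function name `make_windows` and the toy chromosome sizes are assumptions for illustration only.

```python
def make_windows(chrom_sizes, window_size):
    """Yield (chrom, start, end) BED intervals tiling each chromosome."""
    for chrom, size in chrom_sizes.items():
        for start in range(0, size, window_size):
            # The last window is shortened so it never runs past the chromosome end.
            yield chrom, start, min(start + window_size, size)

# Hypothetical toy sizes; real sizes would come from e.g. fetchChromSizes.
toy_sizes = {"chrA": 250, "chrB": 100}
windows = list(make_windows(toy_sizes, 100))
# Tab-separated lines in BED order: chrom, start (0-based), end (exclusive).
bed_lines = ["\t".join(map(str, w)) for w in windows]
```

With a window size of 100, chrA (250 bp) yields three windows, the last one covering 200 to 250, and chrB (100 bp) yields a single window.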

  • fastqdir: test_data

Root directory with raw FASTQ files.

  • outdir

Root output directory for temporary and final result files.

  • statfile

QC statistics file. The default is outdir/qcstats.txt. If this file exists, the QC step is skipped and size factors are parsed from the existing QC statistics file. Otherwise, the QC step runs to generate the statistics file.

  • cnttablefile

Region count table file. The default is outdir/meancovtable.txt.gz. If this file exists, the counting step is skipped and the existing count table is used for downstream statistical testing. Otherwise, the counting step runs to generate the count table file.

  • ttestfile

The statistical testing result file. The default is outdir/t.test.txt. If this file exists, no further tasks will run. Otherwise, statistical testing runs on the count table using a t-test.
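The skip-if-exists behavior described for statfile, cnttablefile, and ttestfile can be sketched as below. This is a simplified model under stated assumptions, not CMSIP's actual code: `run_pipeline` and `write_placeholder` are hypothetical names, each step is checked independently, and the real steps are replaced by a stand-in that writes a placeholder file.

```python
import os
import tempfile

def run_pipeline(outdir, steps):
    """Run each (output filename, function) step unless its output exists.

    Returns the filenames of the steps that were actually executed.
    """
    executed = []
    for filename, func in steps:
        path = os.path.join(outdir, filename)
        if os.path.exists(path):
            continue  # reuse the existing file and skip this step
        func(path)
        executed.append(filename)
    return executed

def write_placeholder(path):
    # Stand-in for a real QC, counting, or testing step.
    with open(path, "w") as handle:
        handle.write("placeholder\n")

steps = [
    ("qcstats.txt", write_placeholder),
    ("meancovtable.txt.gz", write_placeholder),
    ("t.test.txt", write_placeholder),
]

with tempfile.TemporaryDirectory() as outdir:
    first_run = run_pipeline(outdir, steps)
    # Deleting only the test result forces just the testing step to rerun.
    os.remove(os.path.join(outdir, "t.test.txt"))
    second_run = run_pipeline(outdir, steps)
```

On the first run all three steps execute; on the second run only the testing step executes, because the QC statistics and the count table are still on disk.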

Download files

Source Distribution

cmsip-0.0.0.6.tar.gz (112.6 kB), source

Built Distribution

cmsip-0.0.0.6-py3-none-any.whl (13.3 kB), Python 3
