A motif discovery package developed by Shao lab in SIBS, CAS
MotifScan
=========
|pypi| |license|
.. |pypi| image:: https://img.shields.io/pypi/v/motifscan.svg
:target: https://pypi.python.org/pypi/motifscan
.. |license| image:: https://img.shields.io/pypi/l/motifscan.svg
:target: https://github.com/shao-lab/MotifScan/blob/master/LICENSE
Introduction
------------
MotifScan is a precise and easy-to-use motif discovery tool that searches for occurrences of given motifs. To find
candidate motif targets in the given DNA sequences, the program scans them with a window of the same length as the
motif and defines the raw motif score of the sequence S in the window as the ratio of the probability of observing S
given the motif's Position Weight Matrix (PWM) M to the probability of observing S given the genome background B.
For each annotated motif, the genome background distribution of the motif score is modeled by randomly sampling the
genome 10^6 times, and the targets of the motif are defined as those candidates whose motif score is higher than the
cutoff. The enrichment of each motif is reported as the ratio of the motif target density in the input regions to
that in random control regions, together with a p-value calculated from the hypergeometric distribution. Note that
MotifScan is not a de novo motif discovery tool.
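The scoring scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MotifScan's actual implementation: the 3-bp count matrix, the zero-order background frequencies, the pseudocount, the cutoff value, and all function names are hypothetical, and the cutoff is passed in directly instead of being derived from 10^6 random genome samples. The way the hypergeometric test is set up here (regions containing at least one target, input versus control) is also an assumption.

```python
import math

# Hypothetical count matrix for a 3-bp motif: one list per base, one entry per position.
COUNTS = {
    "A": [8, 1, 0],
    "C": [1, 1, 0],
    "G": [0, 7, 1],
    "T": [1, 1, 9],
}
# Assumed zero-order genome background B (base frequencies); illustrative values only.
BACKGROUND = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}


def to_pwm(counts, pseudocount=1.0):
    """Convert per-position counts into per-position base probabilities."""
    length = len(counts["A"])
    pwm = []
    for i in range(length):
        total = sum(counts[b][i] + pseudocount for b in "ACGT")
        pwm.append({b: (counts[b][i] + pseudocount) / total for b in "ACGT"})
    return pwm


def raw_score(window, pwm, background):
    """Return P(S | M) / P(S | B) for a window of the same length as the motif."""
    ratio = 1.0
    for pos, base in enumerate(window):
        ratio *= pwm[pos][base] / background[base]
    return ratio


def scan(sequence, pwm, background, cutoff):
    """Slide a motif-length window and keep windows scoring above the cutoff."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = raw_score(sequence[start:start + width], pwm, background)
        if score > cutoff:
            hits.append((start, score))
    return hits


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from a population of N with K successes."""
    upper = min(K, n)
    favorable = sum(
        math.comb(K, i) * math.comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favorable / math.comb(N, n)


PWM = to_pwm(COUNTS)
hits = scan("TTAGTCC", PWM, BACKGROUND, cutoff=5.0)
print(hits)
```

In this sketch, ``hypergeom_sf(k, N, K, n)`` would be called with ``N`` as the total number of input plus control regions, ``K`` as the number of regions containing at least one target, ``n`` as the number of input regions, and ``k`` as the number of input regions containing a target. A production scanner would typically work with log-ratios to avoid floating-point underflow on long motifs.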
Documentation
-------------
To see the full documentation of MotifScan, please refer to: http://bioinfo.sibs.ac.cn/shaolab/motifscan/index.php
Installation
------------
The latest release of MotifScan is available on
`PyPI <https://pypi.python.org/pypi/motifscan>`__:
::
$ pip install motifscan
MotifScan uses `setuptools <https://setuptools.readthedocs.io/en/latest/>`__ for installation from source code.
The source code of MotifScan is hosted on GitHub: https://github.com/shao-lab/MotifScan
You can clone the repository and run the following command in the source directory:
::
$ python setup.py install
Usage
-----
Build genomes
^^^^^^^^^^^^^
Before you use MotifScan, you need to build the prerequisites for the corresponding genome assembly.
::
$ genomecompile [-h] [-v] -G sequences.fa -o output_dir
This command generates a directory containing the compiled genome sequence and related information.
**Note:** You only need to run it once for each genome.
Build motif PWM (Optional)
^^^^^^^^^^^^^^^^^^^^^^^^^^
**Note:** MotifScan provides some preprocessed motif PWM files under data/motif of the package.
If you have motifs that are not included in our pre-compiled motif collection, you need to compile them yourself with the following command.
::
$ motifcompile -M motif_pwm_demo.txt -g hg19_for_motifscan
-M motif raw matrix file
-g a pre-compiled genome directory generated by genomecompile
The motif raw matrix file should follow the format below:
the motif ID and motif name are followed by a position weight matrix, and columns are separated by tabs.
::
>MA0599.1 KLF5
1429 0 0 3477 0 5051 0 0 0 3915
2023 11900 12008 9569 13611 0 13611 13611 13135 5595
7572 0 0 0 0 5182 0 0 0 0
2587 1711 1603 565 0 3378 0 0 476 4101
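As a rough illustration of this layout, the snippet below reads such a file into Python structures. It is a sketch, not the parser that ``motifcompile`` uses: the function name ``parse_raw_motifs`` and the dictionary layout are hypothetical, and the assumption that the four rows correspond to A, C, G and T (as in JASPAR-style matrices) is not stated in this README.

```python
# Demo input copied from the example above; split() accepts tabs or spaces.
DEMO = """>MA0599.1 KLF5
1429 0 0 3477 0 5051 0 0 0 3915
2023 11900 12008 9569 13611 0 13611 13611 13135 5595
7572 0 0 0 0 5182 0 0 0 0
2587 1711 1603 565 0 3378 0 0 476 4101
"""


def parse_raw_motifs(text):
    """Parse '>id name' headers, each followed by rows of numeric columns."""
    motifs = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # Header line: motif ID, then the motif name (may be absent).
            fields = line[1:].split(None, 1)
            current = {
                "id": fields[0],
                "name": fields[1] if len(fields) > 1 else "",
                "rows": [],
            }
            motifs.append(current)
        elif current is not None:
            # Matrix row: one value per motif position.
            current["rows"].append([float(x) for x in line.split()])
    return motifs


motifs = parse_raw_motifs(DEMO)
first = motifs[0]
# Column totals are a quick sanity check: each position should sum to the same count.
column_sums = [sum(col) for col in zip(*first["rows"])]
print(first["id"], first["name"], column_sums)
```

Checking that every column sums to the same total is a cheap way to catch a misaligned or truncated row before compiling the motif.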
Scan Motifs
^^^^^^^^^^^
Search regions (-p), a compiled motif PWM (-m) and a compiled genome (-g) are required for a typical MotifScan task.
We also recommend specifying the gene annotation file (RefSeq) via option -t. With all these files prepared, use the following command to run the program.
::
$ motifscan -p peaks.bed -m motif_pwm_demo.txt -g hg19_for_motifscan
**Note:** Use -h/--help for details of all arguments.
License
-------
`BSD 3-Clause
License <https://github.com/shao-lab/MotifScan/blob/master/LICENSE>`__