
A motif discovery package developed by Shao lab in SIBS, CAS





MotifScan is a precise and easy-to-use motif discovery tool based on given motifs. To search for candidate motif targets
in given DNA sequences, the program scans them with a window of the same length as the motif and defines the raw motif
score of the sequence S in each window as the ratio of the probability of observing S given the motif's
Position Weight Matrix (PWM) M to the probability of observing S given the genome background B. For each annotated motif,
the genome background distribution of the motif score is modeled by randomly sampling the genome 10^6 times, and the
targets of the motif are defined as those candidates whose motif score is higher than the cutoff. The enrichment of each
motif is reported as the ratio of the motif target density in the input regions to that in random control regions,
together with a p-value calculated from the hypergeometric distribution. Note that MotifScan is not a
de novo motif discovery tool.
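
The scoring and enrichment steps described above can be sketched in a few lines of Python. This is only an illustration of the idea, not MotifScan's actual implementation: the function names, the dictionary-based PWM layout, the ``quantile`` parameter used to pick the cutoff, and the way regions are counted for the hypergeometric test (pooling input and control regions and counting those that contain at least one target) are all assumptions made for this example.

```python
import math
import random


def raw_motif_score(window, pwm, background):
    """Likelihood ratio P(S | M) / P(S | B) for a single window S."""
    p_motif = 1.0
    p_background = 1.0
    for position, base in enumerate(window):
        p_motif *= pwm[position][base]    # probability of this base at this motif position
        p_background *= background[base]  # probability of this base under the background
    return p_motif / p_background


def scan(sequence, pwm, background):
    """Score every window of motif length along the sequence."""
    width = len(pwm)
    return [
        (start, raw_motif_score(sequence[start:start + width], pwm, background))
        for start in range(len(sequence) - width + 1)
    ]


def sampled_cutoff(genome, pwm, background, n_samples, quantile, rng):
    """Empirical score cutoff taken from randomly sampled genome windows.

    `quantile` is a hypothetical parameter: the text above does not say how
    the cutoff is chosen from the sampled background distribution.
    """
    width = len(pwm)
    starts = (rng.randrange(len(genome) - width + 1) for _ in range(n_samples))
    scores = sorted(
        raw_motif_score(genome[s:s + width], pwm, background) for s in starts
    )
    return scores[min(int(quantile * n_samples), n_samples - 1)]


def hypergeom_enrichment_pvalue(total, total_hits, drawn, drawn_hits):
    """Upper-tail hypergeometric probability P(X >= drawn_hits).

    Assumed setup: `total` pooled regions, `total_hits` of them contain a
    target, `drawn` of them are input regions, `drawn_hits` of the input
    regions contain a target.
    """
    upper = min(drawn, total_hits)
    tail = sum(
        math.comb(total_hits, k) * math.comb(total - total_hits, drawn - k)
        for k in range(drawn_hits, upper + 1)
    )
    return tail / math.comb(total, drawn)
```

Here ``pwm`` is a list with one dictionary per motif position mapping each base to its probability, and ``background`` is a single dictionary of base probabilities. The sketch assumes sequences contain only A, C, G and T, and passing a seeded ``random.Random`` instance as ``rng`` keeps the sampling reproducible.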


To see the full documentation of MotifScan, please refer to:


The latest release of MotifScan is available on
`PyPI <>`__:


$ pip install motifscan

MotifScan uses `setuptools <>`__ for installation from source code.
The source code of MotifScan is hosted on GitHub:

You can clone the repository and run the following command in the source directory:


$ python setup.py install


Build genomes

Before using MotifScan, you need to build the prerequisite files for the genome assembly you are working with.


$ genomecompile [-h] [-v] -G sequences.fa -o output_dir

This command generates a directory containing the compiled genome sequence and related information.

**Note:** You only need to run it once for each genome.

Build motif PWM (Optional)

**Note:** MotifScan provides some preprocessed motif PWM files under data/motif of the package.

If you have motifs that are not included in our pre-compiled motif collection, you need to compile them yourself using the following command.


$ motifcompile -M motif_pwm_demo.txt -g hg19_for_motifscan

-M motif raw matrix file

-g a pre-compiled genome directory generated by genomecompile

The motif raw matrix file should follow the format below:

The motif ID and motif name are followed by a position weight matrix, and columns are separated by tabs.
>MA0599.1 KLF5
1429 0 0 3477 0 5051 0 0 0 3915
2023 11900 12008 9569 13611 0 13611 13611 13135 5595
7572 0 0 0 0 5182 0 0 0 0
2587 1711 1603 565 0 3378 0 0 476 4101
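
As an illustration of how a file in this format could be read, the sketch below parses ``>id name`` headers followed by four matrix rows and converts the counts into per-position probabilities. It is not the parser used by motifcompile. The function names are hypothetical, the row order A, C, G, T is an assumption, and so is the pseudocount added before normalization.

```python
def parse_motif_matrices(text):
    """Parse '>id name' headers, each followed by four rows of counts."""
    motifs = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split(None, 1)
            if not fields:
                raise ValueError("motif header without an id")
            current = {
                "id": fields[0],
                "name": fields[1] if len(fields) > 1 else "",
                "rows": [],
            }
            motifs.append(current)
        elif current is not None:
            # split() accepts tabs as well as runs of spaces
            current["rows"].append([float(value) for value in line.split()])
    for motif in motifs:
        rows = motif["rows"]
        if len(rows) != 4 or len({len(row) for row in rows}) != 1:
            raise ValueError("motif %s: expected 4 rows of equal length" % motif["id"])
    return motifs


def to_probabilities(rows, pseudocount=1.0, bases="ACGT"):
    """Convert count rows into one {base: probability} dict per position.

    Assumption: rows are ordered A, C, G, T. The pseudocount is also an
    assumption; it keeps every probability above zero.
    """
    width = len(rows[0])
    pwm = []
    for j in range(width):
        column = [row[j] + pseudocount for row in rows]
        total = sum(column)
        pwm.append({base: value / total for base, value in zip(bases, column)})
    return pwm
```

Calling ``parse_motif_matrices`` on the KLF5 example above yields a single motif with ten positions, and ``to_probabilities(motif["rows"])`` turns its counts into probabilities that sum to one at each position.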

Scan Motifs

Search regions (-p), a compiled motif PWM file (-m) and a compiled genome (-g) are required for a typical MotifScan task.
We also recommend specifying the gene annotation file (RefSeq) via option -t. With all these files prepared, use the following command to run the program.


$ motifscan -p peaks.bed -m motif_pwm_demo.txt -g hg19_for_motifscan

**Note:** Use -h/--help for details of all arguments.


`BSD 3-Clause
License <>`__


