Quantum motif search and discovery using Grover's algorithm over genomic FASTA sequences.

MotifQu

MotifQu is a quantum motif search and discovery tool using Grover's algorithm. It provides two main functions:

  1. Motif Search: Find occurrences of a known motif pattern in a genome
  2. Motif Discovery: Discover all significant motifs (k-mers) in a genome using quantum amplitude amplification

Install

pip install MotifQu

Usage

Discover Motifs (New!)

Find all significant k-mers in a genome:

# Discover 6-mers appearing at least 3 times
motifqu discover --fasta genome.fa -k 6 --min-count 3

# Discover 8-mers, show top 20 results
motifqu discover --fasta genome.fa -k 8 --min-count 2 --topk 20

# Ignore reverse complement (count each strand's k-mers separately)
motifqu discover --fasta genome.fa -k 6 --min-count 3 --no-revcomp

Search for Specific Motif

# Exact match
motifqu search --fasta genome.fa --motif GTTGTTGGAGAAG --mismatches 0

# Allow 1 mismatch
motifqu search --fasta genome.fa --motif TATAAA --mismatches 1
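
To make the meaning of `--mismatches` concrete, here is a minimal classical sketch of mismatch-tolerant matching, assuming mismatches are counted as Hamming distance (substitutions only, no insertions or deletions). The function names `hamming` and `find_motif` are hypothetical and are not part of the MotifQu API; this is a reference scan, not the quantum search itself.

```python
from __future__ import annotations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_motif(sequence: str, motif: str, max_mismatches: int = 0) -> list[tuple[int, int]]:
    """Return (start, mismatches) for every window within max_mismatches of motif.

    Starts are 0-based. Hypothetical helper, for illustration only.
    """
    seq, pat = sequence.upper(), motif.upper()
    hits = []
    for start in range(len(seq) - len(pat) + 1):
        distance = hamming(seq[start:start + len(pat)], pat)
        if distance <= max_mismatches:
            hits.append((start, distance))
    return hits


# Toy sequence: one exact TATAAA at offset 2, one 1-mismatch hit (TATATA) at offset 10.
hits_exact = find_motif("GGTATAAACCTATATA", "TATAAA", max_mismatches=0)
hits_fuzzy = find_motif("GGTATAAACCTATATA", "TATAAA", max_mismatches=1)
```

Raising the mismatch budget can only add hits, so the exact-match results are always a subset of the fuzzy results.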

List Known Biological Motifs

motifqu list-motifs

Expand IUPAC Pattern

# Expand E-box pattern (CANNTG)
motifqu expand CANNTG
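
As an illustration of what expanding an IUPAC pattern involves, the sketch below enumerates every concrete DNA sequence that a pattern matches, using the standard IUPAC nucleotide codes. The function `expand_iupac` is a hypothetical name, and the output order or format of `motifqu expand` may differ.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_iupac(pattern: str) -> list:
    """Return all concrete sequences matched by an IUPAC pattern.

    Hypothetical helper, for illustration only.
    """
    choices = [IUPAC[symbol] for symbol in pattern.upper()]
    return ["".join(bases) for bases in product(*choices)]


# The E-box pattern has two N positions, so it expands to 4 * 4 sequences.
ebox = expand_iupac("CANNTG")
```

The number of expansions is the product of the option counts at each position, so patterns with many `N` positions grow as a power of 4.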

Coordinate Output

MotifQu prints both:

  • 1-based inclusive coordinates: contig:start-end
  • 0-based half-open interval: [start,end)

These coordinates are relative to the FASTA sequence provided.
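
The two conventions describe the same span: for a 0-based half-open interval [start,end), the 1-based inclusive form is start+1 to end. A small sketch of the conversion follows; `format_hit` is a hypothetical helper and the exact text MotifQu prints may differ.

```python
def format_hit(contig: str, start0: int, length: int) -> tuple:
    """Format one hit in both coordinate conventions.

    start0 is the 0-based offset of the first base of the hit.
    Hypothetical helper, for illustration only.
    """
    end0 = start0 + length                       # exclusive end of the half-open interval
    one_based = f"{contig}:{start0 + 1}-{end0}"  # 1-based inclusive start and end
    half_open = f"[{start0},{end0})"             # 0-based half-open interval
    return one_based, half_open


# A 6 bp motif starting at 0-based offset 2 on contig "chr1".
one_based, half_open = format_hit("chr1", 2, 6)
```

In both forms the hit length can be recovered: end - start for the half-open interval, and end - start + 1 for the 1-based inclusive range.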

Biological Context

The quantum motif discovery tool is designed for:

  • Transcription Factor Binding Sites (TFBS) - identifying regulatory sequences
  • Repeat elements - finding tandem repeats and microsatellites
  • Conserved sequences - detecting evolutionarily preserved patterns

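The algorithm uses Grover's search to amplify the probability of measuring significant k-mers (those appearing at least the threshold number of times, as set by --min-count). Over the 4^k k-mer search space this gives a quadratic speedup in oracle queries compared with classical enumeration: on the order of sqrt(4^k / M) Grover iterations when M k-mers are significant, rather than on the order of 4^k / M classical checks.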
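
To give a feel for the scale of that speedup, the sketch below applies the textbook Grover analysis to a search space of N = 4^k states with M marked (significant) k-mers. It assumes M is known in advance and that the standard iteration-count formula applies; whether MotifQu chooses its iteration count this way is not stated here, and both function names are hypothetical.

```python
import math


def grover_iterations(k: int, num_marked: int) -> int:
    """Textbook near-optimal number of Grover iterations for 4**k states."""
    n_states = 4 ** k
    if not 0 < num_marked <= n_states:
        raise ValueError("num_marked must be between 1 and 4**k")
    # Rotation angle per iteration is 2*theta, with sin(theta) = sqrt(M / N).
    theta = math.asin(math.sqrt(num_marked / n_states))
    return math.floor(math.pi / (4 * theta))


def success_probability(k: int, num_marked: int, iterations: int) -> float:
    """Probability of measuring a marked state after the given iterations."""
    theta = math.asin(math.sqrt(num_marked / 4 ** k))
    return math.sin((2 * iterations + 1) * theta) ** 2


# One significant 6-mer among 4**6 = 4096 candidates.
iterations_k6 = grover_iterations(6, 1)
prob_k6 = success_probability(6, 1, iterations_k6)
```

For a single marked 6-mer this gives 50 iterations, compared with thousands of checks for an exhaustive classical scan of the 4096 candidates.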

Notes

  • For discovery, k-mer lengths of 4-10 bp are recommended (each base takes 2 qubits, so the 4^k states require 2k qubits)
  • The oracle is built from classical pre-computation of k-mer counts
  • Reverse complement is counted as the same motif by default (biological DNA is double-stranded)
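
The notes above can be sketched classically: count k-mers, optionally merge each k-mer with its reverse complement, and keep those that reach the threshold as the states an oracle would mark. This is a minimal sketch under stated assumptions. The 2-bits-per-base encoding (A=0, C=1, G=2, T=3), the choice of the lexicographically smaller strand as the canonical form, and all function names are hypothetical, not MotifQu's actual implementation.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_BASE_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed encoding, 2 bits per base


def reverse_complement(kmer: str) -> str:
    return kmer.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Pick one representative for a k-mer and its reverse complement."""
    return min(kmer, reverse_complement(kmer))


def count_kmers(sequence: str, k: int, revcomp: bool = True) -> Counter:
    """Count k-mers, skipping windows that contain non-ACGT characters."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer) if revcomp else kmer] += 1
    return counts


def kmer_index(kmer: str) -> int:
    """Map a k-mer to an integer in [0, 4**k), i.e. a 2k-bit basis state."""
    index = 0
    for base in kmer:
        index = (index << 2) | _BASE_BITS[base]
    return index


def marked_states(counts: Counter, min_count: int) -> set:
    """Basis-state indices of k-mers that reach the count threshold."""
    return {kmer_index(kmer) for kmer, n in counts.items() if n >= min_count}


counts = count_kmers("ACGTACGTAC", k=4)
marked = marked_states(counts, min_count=3)
```

In this toy sequence CGTA and its reverse complement TACG are pooled, which is what lifts the canonical k-mer CGTA to the threshold of 3; with `revcomp=False` no 4-mer would reach it.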
