Quantum motif search and discovery using Grover's algorithm over genomic FASTA sequences.

MotifQu

MotifQu is a quantum motif search and discovery tool using Grover's algorithm. It provides two main functions:

  1. Motif Search: Find occurrences of a known motif pattern in a genome
  2. Motif Discovery: Discover all significant motifs (k-mers) in a genome using quantum amplitude amplification

Install

pip install MotifQu

Usage

Discover Motifs (New!)

Find all significant k-mers in a genome:

# Discover 6-mers appearing at least 3 times
motifqu discover --fasta genome.fa -k 6 --min-count 3

# Discover 8-mers, show top 20 results
motifqu discover --fasta genome.fa -k 8 --min-count 2 --topk 20

# Ignore reverse complement
motifqu discover --fasta genome.fa -k 6 --min-count 3 --no-revcomp
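The meaning of -k, --min-count, --topk and --no-revcomp can be illustrated with a small classical sketch. This is not MotifQu's internal code: the function names are hypothetical, and it assumes that a k-mer and its reverse complement are merged under the lexicographically smaller of the two, and that windows containing non-ACGT characters are skipped.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_kmers(seq: str, k: int, revcomp: bool = True) -> Counter:
    """Count every k-mer window in seq (hypothetical helper).

    With revcomp=True, a k-mer and its reverse complement share one
    entry, keyed by the lexicographically smaller string (an assumption).
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) - set("ACGT"):
            continue  # skip windows with N or other ambiguity codes
        if revcomp:
            kmer = min(kmer, reverse_complement(kmer))
        counts[kmer] += 1
    return counts


def significant_kmers(seq, k, min_count, topk=None, revcomp=True):
    """Return (kmer, count) pairs with count >= min_count, most frequent first."""
    counts = count_kmers(seq, k, revcomp)
    hits = [(kmer, n) for kmer, n in counts.most_common() if n >= min_count]
    return hits if topk is None else hits[:topk]
```

For example, significant_kmers("ACGTACGTAC", 4, 3) keeps only CGTA, because CGTA and its reverse complement TACG together occur three times.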

Search for Specific Motif

# Exact match
motifqu search --fasta genome.fa --motif GTTGTTGGAGAAG --mismatches 0

# Allow 1 mismatch
motifqu search --fasta genome.fa --motif TATAAA --mismatches 1
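What --mismatches means can be sketched classically as a Hamming-distance scan. The function names below are hypothetical and the sketch only scans the forward strand; it is meant to show the matching criterion, not how MotifQu implements the search.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_motif(seq: str, motif: str, mismatches: int = 0):
    """Return (start, end, distance) for every window within the mismatch budget.

    start and end form a 0-based half-open interval [start, end).
    """
    seq, motif = seq.upper(), motif.upper()
    m = len(motif)
    hits = []
    for start in range(len(seq) - m + 1):
        d = hamming(seq[start:start + m], motif)
        if d <= mismatches:
            hits.append((start, start + m, d))
    return hits
```

With mismatches=0 this reduces to exact matching; each extra allowed mismatch widens the set of accepted windows.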

List Known Biological Motifs

motifqu list-motifs

Expand IUPAC Pattern

# Expand E-box pattern (CANNTG)
motifqu expand CANNTG
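IUPAC expansion replaces each ambiguity code with the set of bases it stands for and enumerates every combination. A minimal sketch, using the standard IUPAC nucleotide codes and a hypothetical function name (the actual output format of motifqu expand is not shown here):

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_iupac(pattern: str) -> list:
    """Return every concrete ACGT sequence matched by an IUPAC pattern."""
    choices = [IUPAC[code] for code in pattern.upper()]
    return ["".join(bases) for bases in product(*choices)]
```

CANNTG contains two N positions, so it expands to 4 x 4 = 16 concrete sequences, including the canonical E-box CACGTG.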

Coordinate Output

MotifQu prints both:

  • 1-based inclusive coordinates: contig:start-end
  • 0-based half-open interval: [start,end)

These coordinates are relative to the FASTA sequence provided.
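The two conventions describe the same bases and differ only in the start value. A short sketch of the conversion, with hypothetical helper names:

```python
def to_one_based(contig: str, start0: int, end0: int) -> str:
    """Convert a 0-based half-open interval [start0, end0) to contig:start-end."""
    return f"{contig}:{start0 + 1}-{end0}"


def to_half_open(start1: int, end1: int) -> tuple:
    """Convert 1-based inclusive start-end to a 0-based half-open (start, end)."""
    return (start1 - 1, end1)
```

A 6 bp hit covering the third through eighth bases of chr1 is therefore chr1:3-8 in 1-based inclusive form and [2,8) in 0-based half-open form; in both cases the length is end minus start in the half-open form.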

Biological Context

The quantum motif discovery tool is designed for:

  • Transcription Factor Binding Sites (TFBS) - identifying regulatory sequences
  • Repeat elements - finding tandem repeats and microsatellites
  • Conserved sequences - detecting evolutionarily preserved patterns

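The algorithm uses Grover's search to amplify the probability of measuring significant k-mers (those appearing at least the threshold number of times). Over the N = 4^k possible k-mers, Grover's search needs on the order of sqrt(N/M) oracle queries to find one of M significant k-mers, a quadratic speedup over the roughly N/M queries expected from classical enumeration. The speedup applies to the search step only: the oracle itself is built from classically pre-computed k-mer counts (see Notes).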
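The amplification described above can be checked numerically with a tiny classical simulation of Grover's iteration. This is a textbook sketch in plain Python, not MotifQu's circuit: the function names are hypothetical, and the marked indices stand in for whichever k-mers pass the count threshold.

```python
import math


def grover_iterations(n_states: int, n_marked: int) -> int:
    """Textbook iteration count floor(pi / (4 * theta)), sin(theta) = sqrt(M/N)."""
    if not 0 < n_marked <= n_states:
        raise ValueError("need 0 < n_marked <= n_states")
    theta = math.asin(math.sqrt(n_marked / n_states))
    return math.floor(math.pi / (4 * theta))


def simulate_grover(n_states: int, marked, iterations: int) -> float:
    """Simulate Grover amplitudes; return the probability of a marked outcome."""
    marked = set(marked)
    amp = [1 / math.sqrt(n_states)] * n_states  # uniform superposition
    for _ in range(iterations):
        # Oracle: flip the sign of every marked amplitude.
        amp = [-a if i in marked else a for i, a in enumerate(amp)]
        # Diffusion: reflect every amplitude about the mean.
        mean = sum(amp) / n_states
        amp = [2 * mean - a for a in amp]
    return sum(amp[i] ** 2 for i in marked)
```

For k = 3 there are 4^3 = 64 states; with a single marked k-mer, grover_iterations(64, 1) gives 6 iterations, after which the simulated success probability is above 0.99, compared with 1/64 for a uniform random guess.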

Notes

  • For discovery, k-mer lengths of 4-10 bp are recommended (the 4^k possible k-mers are encoded in 2k qubits, so k = 10 needs 20 qubits)
  • The oracle is built from classical pre-computation of k-mer counts
  • Reverse complement is counted as the same motif by default (biological DNA is double-stranded)
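The 2k-qubit figure follows from spending two bits on each base. One possible encoding is sketched below; the mapping A=00, C=01, G=10, T=11 and the function names are assumptions for illustration, since the source does not state which encoding MotifQu uses.

```python
# Assumed 2-bit encoding per base (hypothetical; not confirmed by MotifQu).
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def kmer_to_index(kmer: str) -> int:
    """Map a k-mer to its basis-state index in the range [0, 4**k)."""
    idx = 0
    for base in kmer.upper():
        idx = (idx << 2) | BASE_TO_BITS[base]
    return idx


def index_to_kmer(idx: int, k: int) -> str:
    """Inverse of kmer_to_index for a k-mer of length k."""
    bases = []
    for _ in range(k):
        bases.append(BITS_TO_BASE[idx & 3])  # read the lowest two bits
        idx >>= 2
    return "".join(reversed(bases))


def qubits_needed(k: int) -> int:
    """Two qubits per base, so 4**k states fit in 2*k qubits."""
    return 2 * k
```

Under this encoding, the set of basis states an oracle would mark is simply the set of indices of the k-mers whose pre-computed count meets the threshold.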

Download files

Source Distribution

motifqu-1.0.1.tar.gz (15.7 kB view details)

Uploaded Source

Built Distribution

motifqu-1.0.1-py3-none-any.whl (17.9 kB view details)

Uploaded Python 3

File details

Details for the file motifqu-1.0.1.tar.gz.

File metadata

  • Download URL: motifqu-1.0.1.tar.gz
  • Size: 15.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for motifqu-1.0.1.tar.gz
Algorithm Hash digest
SHA256 c1e5c4cb4a73c87eee737369d30e45ade213d54550de212cbfbac801ce701ed5
MD5 21670716095ed59256bec6f2438e9f33
BLAKE2b-256 36700c4c4f1c3b7dc2ae787be1fb15cb12da20ca447b9a832bd18b6e5176d58a

File details

Details for the file motifqu-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: motifqu-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 17.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for motifqu-1.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 66e0f5242ddc55bea942582b3833366fe127b8957d6a65460e610e0fbb9f3999
MD5 00423becee11895f690619f3ff332ccf
BLAKE2b-256 58b61b10939344980c69b20f54768e64c677639123f88eed88f1c33c06a589a3
