🧬 streq

Python utilities for working with nucleotide sequence strings.

Installation

The easy way

Install the latest release from PyPI:

pip install streq

From source

Clone the repository, cd into it, then run:

pip install -e .

Usage

streq provides Python utility functions for working with nucleotide sequences. Sequences can be upper or lower case, and case is preserved through transformations.

Transformations

Reverse complement.

>>> import streq as sq
>>>
>>> sq.reverse_complement('ATCG')
'CGAT'
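
For readers curious how a case-preserving reverse complement can work, here is a minimal sketch using only the standard library. It is not streq's implementation; the name `reverse_complement_sketch` and the choice to map U to A (DNA output) are assumptions made for illustration.

```python
# Hypothetical helper for illustration; not part of streq.
# Upper and lower case bases are mapped separately so case survives.
# Assumption: U is complemented to A, so the output uses the DNA alphabet.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement_sketch(seq: str) -> str:
    """Reverse the sequence, then complement each base."""
    return seq[::-1].translate(_COMPLEMENT)


print(reverse_complement_sketch("ATCG"))  # CGAT
print(reverse_complement_sketch("ATCg"))  # cGAT
```

Characters outside the translation table (for example N) pass through unchanged in this sketch.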

Convert between RNA and DNA alphabets.

>>> sq.to_rna('ATCG')
'AUCG'
>>> sq.to_dna('AUCG')
'ATCG'
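
Conceptually, the alphabet conversion is a T/U swap. The following sketch shows one way to do it while keeping case; the function names are hypothetical and this is not necessarily how streq does it.

```python
# Hypothetical helpers for illustration; not part of streq.
_DNA_TO_RNA = str.maketrans("Tt", "Uu")
_RNA_TO_DNA = str.maketrans("Uu", "Tt")


def to_rna_sketch(seq: str) -> str:
    """Replace thymine with uracil, preserving case."""
    return seq.translate(_DNA_TO_RNA)


def to_dna_sketch(seq: str) -> str:
    """Replace uracil with thymine, preserving case."""
    return seq.translate(_RNA_TO_DNA)


print(to_rna_sketch("ATCG"))  # AUCG
print(to_dna_sketch("AUCG"))  # ATCG
```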

Slice circular sequences such as plasmids or bacterial genomes.

>>> sq.Circular('ATCG')[-1:3]
'GATC'
>>> sq.reverse_complement(sq.Circular('ATCG'))[-1:3]
'CGAT'
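
The idea behind circular slicing is that indices wrap around the ends of the sequence. A minimal sketch using modular indexing is shown below; `circular_slice` is a hypothetical function, not the `Circular` class itself, and it ignores slice steps.

```python
# Hypothetical helper for illustration; not part of streq.
def circular_slice(seq: str, start: int, stop: int) -> str:
    """Return the bases from start up to (not including) stop, wrapping around."""
    if not seq:
        return ""
    n = len(seq)
    # i % n maps negative and past-the-end indices back onto the sequence.
    return "".join(seq[i % n] for i in range(start, stop))


print(circular_slice("ATCG", -1, 3))  # GATC
print(circular_slice("ATCG", 2, 6))   # CGAT
```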

Case is preserved through transformations.

>>> sq.reverse_complement(sq.Circular('ATCg'))
'cGAT'

Calculations

Get GC and pyrimidine content.

>>> sq.gc_content('AGGG')
0.75
>>> sq.pyrimidine_content('AUGGG')
0.2
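
Both quantities are fractions of bases that belong to a given set. The sketch below reproduces the two results above under stated assumptions: the helper names are hypothetical, an empty sequence returns 0.0, and C, T and U count as pyrimidines. streq's own handling of edge cases may differ.

```python
# Hypothetical helpers for illustration; not part of streq.
def base_fraction(seq: str, bases: str) -> float:
    """Fraction of positions in seq whose base (case-insensitive) is in bases."""
    if not seq:
        return 0.0  # assumption: empty input yields 0.0
    return sum(base in bases for base in seq.upper()) / len(seq)


def gc_content_sketch(seq: str) -> float:
    return base_fraction(seq, "GC")


def pyrimidine_content_sketch(seq: str) -> float:
    return base_fraction(seq, "CTU")


print(gc_content_sketch("AGGG"))           # 0.75
print(pyrimidine_content_sketch("AUGGG"))  # 0.2
```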

Get autocorrelation (rough indicator for secondary structure).

>>> sq.correlation('AACC')
0.0
>>> sq.correlation('AAATTT')
2.3
>>> sq.correlation('AAATTCT')
1.3047619047619046
>>> sq.correlation('AAACTTT')
1.9238095238095236

Wobble base-pairing can be taken into account.

>>> sq.correlation('GGGTTT')
0.0
>>> sq.correlation('GGGTTT', wobble=True)
2.3
>>> sq.correlation('GGGUUU', wobble=True)
2.3

Provide a second sequence to get correlation between sequences.

>>> sq.correlation('AAA', 'TTT')
0.0
>>> sq.correlation('AAA', 'AAA')
3.0

Distances

Calculate Levenshtein (insertion, deletion, substitution) distance.

>>> sq.levenshtein('AAATTT', 'AAATTT')
0
>>> sq.levenshtein('AAATTT', 'ACTTT')
2
>>> sq.levenshtein('AAAG', 'TCGA')
4
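
For reference, Levenshtein distance is usually computed with dynamic programming. The sketch below is the textbook two-row version, with a hypothetical function name; it is not taken from streq's source.

```python
# Hypothetical helper for illustration; not part of streq.
def levenshtein_sketch(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    # previous[j] holds the distance between the prefix of a seen so far and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # match or substitute
            ))
        previous = current
    return previous[-1]


print(levenshtein_sketch("AAATTT", "ACTTT"))  # 2
print(levenshtein_sketch("AAAG", "TCGA"))     # 4
```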

Calculate Hamming (mismatch) distance.

>>> sq.hamming('AAA', 'ATA')
1
>>> sq.hamming('AAA', 'ATT')
2
>>> sq.hamming('AAA', 'TTT')
3
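
Hamming distance counts positions at which two equal-length sequences differ. A minimal sketch follows; the function name is hypothetical, and raising `ValueError` on unequal lengths is an assumption, since the behaviour of `sq.hamming` in that case is not documented here.

```python
# Hypothetical helper for illustration; not part of streq.
def hamming_sketch(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        # Assumption: unequal lengths are treated as an error in this sketch.
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(hamming_sketch("AAA", "ATT"))  # 2
```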

Search

Search sequences using IUPAC symbols and iterate through the results.

>>> for (start, end), match in sq.find_iupac('ARY', 'AATAGCAGTGTGAAC'):
...     print(f"Found ARY at {start}:{end}: {match}")
... 
Found ARY at 0:3: AAT
Found ARY at 3:6: AGC
Found ARY at 6:9: AGT
Found ARY at 12:15: AAC
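
One common way to search with IUPAC ambiguity codes is to expand each symbol into a regular-expression character class. The sketch below does that with the standard `re` module. The function name and return shape mirror the example above but are hypothetical, and unlike a full implementation this version reports only non-overlapping matches.

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


# Hypothetical helper for illustration; not part of streq.
def find_iupac_sketch(pattern: str, seq: str):
    """Yield ((start, end), match) for each non-overlapping match of pattern in seq."""
    regex = "".join(f"[{IUPAC_CODES[symbol]}]" for symbol in pattern.upper())
    for hit in re.finditer(regex, seq, flags=re.IGNORECASE):
        yield (hit.start(), hit.end()), hit.group()


for (start, end), match in find_iupac_sketch("ARY", "AATAGCAGTGTGAAC"):
    print(f"Found ARY at {start}:{end}: {match}")
```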

Find common Type IIS restriction sites.

>>> sq.which_re_sites('AAAGAAG')
()
>>> sq.which_re_sites('AAAGAAGAC')
('BbsI',)
>>> sq.which_re_sites('AAAGAAGACACCTGC')
('BbsI', 'PaqCI')
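
A lookup like this can be approximated by scanning for known recognition sequences. In the sketch below, the enzyme table is an assumption (the set of enzymes streq actually checks is not listed here), the function name is hypothetical, and only the given strand is searched.

```python
# Recognition sequences for a few Type IIS enzymes.
# Assumption: this table is illustrative and may not match streq's own list.
RE_SITES = {
    "BbsI": "GAAGAC",
    "BsaI": "GGTCTC",
    "PaqCI": "CACCTGC",
}


# Hypothetical helper for illustration; not part of streq.
def which_re_sites_sketch(seq: str) -> tuple:
    """Names of enzymes whose recognition sequence occurs in seq (forward strand only)."""
    upper = seq.upper()
    return tuple(name for name, site in RE_SITES.items() if site in upper)


print(which_re_sites_sketch("AAAGAAG"))          # ()
print(which_re_sites_sketch("AAAGAAGACACCTGC"))  # ('BbsI', 'PaqCI')
```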

Documentation

Check the API here.
