simple functions for manipulating sequences and secondary structures in pandas dataframe format
seq_tools
a short python tool for working with sequences in dataframes
how to install
pip install rna_seq_tools
how to use
seq_tools is a Python package that contains a few functions for working with sequences in dataframes. If the input is a single sequence, the results are printed. If the input is a CSV file, a new CSV file is created with the results. The default output file is "output.csv", which can be changed with the -o flag.
$ seq_tools --help
Usage: seq_tools [OPTIONS] COMMAND [ARGS]...
a set scripts to manipulate sequences in csv files
Options:
--help Show this message and exit.
Commands:
add add a sequence to 5' and/or 3'
ec calculate the extinction coefficient for each sequence
edit-distance calculate the edit distance of a library
fold fold rna sequences
mw calculate the molecular weight for each sequence
rc calculate reverse complement for each sequence
to-dna convert rna sequence(s) to dna
to-dna-template convert rna sequence(s) to dna template, includes T7...
to-fasta generate fasta file from csv
to-opool generate oligo pool file from csv
to-rna convert rna sequence(s) to dna
transcribe convert dna sequence(s) to rna
trim trim 5'/3' ends of sequences
add
Adds a sequence to the 5' and/or 3' end of a sequence.
$ seq_tools add -p5 "AAAA" "GGGGUUUUCCCC"
SEQ_TOOLS.get_input_dataframe - INFO - reading sequence GGGGUUUUCCCC
SEQ_TOOLS.handle_output - INFO - output->
name seq
sequence AAAAGGGGUUUUCCCC
Name: 0, dtype: object
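For readers who want to see the operation itself, here is a minimal plain-Python sketch of adding a 5' and/or 3' sequence. The function name `add_flanks`, its `p5`/`p3` parameters, and the row layout are assumptions made for illustration. They are not the package's own API.

```python
def add_flanks(seq: str, p5: str = "", p3: str = "") -> str:
    """Return seq with p5 prepended (5' end) and p3 appended (3' end)."""
    return p5 + seq + p3


# Hypothetical rows shaped like the name/sequence output shown above.
rows = [{"name": "seq", "sequence": "GGGGUUUUCCCC"}]

# Apply the same 5' addition to every row, leaving other fields untouched.
extended = [{**row, "sequence": add_flanks(row["sequence"], p5="AAAA")} for row in rows]

print(extended[0]["sequence"])  # AAAAGGGGUUUUCCCC
```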
ec
Calculate the extinction coefficient for each sequence.
$ seq-tools ec "GGGGUUUUCCCC"
SEQ_TOOLS.get_input_dataframe - INFO - reading sequence GGGGUUUUCCCC
SEQ_TOOLS.handle_ntype - INFO - determining nucleic acid type: RNA
SEQ_TOOLS.handle_output - INFO - output->
name seq
sequence GGGGUUUUCCCC
extinction_coeff 109500
Name: 0, dtype: object
edit-distance
Calculate the edit distance of a library, i.e. how different each sequence is, on average, from the rest of the library.
$ seq-tools edit-distance test/resources/test.csv
SEQ_TOOLS.edit_distance - INFO - edit distance: 17.666666666666668
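One plausible reading of this number is the mean Levenshtein distance over all pairs of sequences in the library. The sketch below implements that reading in plain Python. It is an assumption about the metric, not the package's actual implementation, and the names `levenshtein` and `mean_pairwise_distance` as well as the example library are hypothetical.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mean_pairwise_distance(seqs: list[str]) -> float:
    """Average edit distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(levenshtein(a, b) for a, b in pairs) / len(pairs)


# Hypothetical three-sequence library.
library = ["GGGGUUUUCCCC", "GGGGAAAACCCC", "GGGGUUUUCCCA"]
print(mean_pairwise_distance(library))
```

Averaging each sequence's mean distance to the others gives the same value as averaging over unordered pairs, so the pairwise form is used here for brevity.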
fold
Fold RNA sequences.
$ seq-tools fold "GGGGUUUUCCCC"
SEQ_TOOLS.get_input_dataframe - INFO - reading sequence GGGGUUUUCCCC
SEQ_TOOLS.handle_output - INFO - output->
name seq
sequence GGGGUUUUCCCC
structure ((((....))))
mfe -5.9
ens_defect 0.38
Name: 0, dtype: object
to-dna
Convert all sequences to DNA, i.e. replace U with T.
$ seq_tools to-dna "GGGGUUUUCCCC"
SEQ_TOOLS.get_input_dataframe - INFO - reading sequence GGGGUUUUCCCC
SEQ_TOOLS.to_dna - INFO - converted sequence: GGGGTTTTCCCC
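The conversion shown in the log line above amounts to a character substitution. A minimal sketch, assuming uppercase input and no modified bases, follows. The function names `to_dna` and `to_rna` are illustrative and are not claimed to match the package's internal functions.

```python
def to_dna(seq: str) -> str:
    """Convert an RNA sequence to DNA by replacing U with T."""
    return seq.replace("U", "T")


def to_rna(seq: str) -> str:
    """Convert a DNA sequence to RNA by replacing T with U."""
    return seq.replace("T", "U")


print(to_dna("GGGGUUUUCCCC"))  # GGGGTTTTCCCC
print(to_rna("GGGGTTTTCCCC"))  # GGGGUUUUCCCC
```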
other (non-command-line) usage
structure representation
from seq_tools import SequenceStructure
struct = SequenceStructure("GGGGUUUUCCCC", "((((....))))")
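The second argument is a secondary structure in dot-bracket notation, where matching parentheses mark paired positions and dots mark unpaired ones. The sketch below shows how such a string maps to base pairs. It is a standalone illustration and does not use or describe the `SequenceStructure` API. The helper name `base_pairs` is hypothetical.

```python
def base_pairs(structure: str) -> list[tuple[int, int]]:
    """Return 0-based (i, j) index pairs for a dot-bracket string."""
    stack = []  # positions of '(' still waiting for a partner
    pairs = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


print(base_pairs("((((....))))"))  # [(0, 11), (1, 10), (2, 9), (3, 8)]
```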