simple functions for manipulating sequences and secondary structures in pandas dataframe format
seq_tools
A small Python tool for working with sequences in dataframes.
how to install
pip install rna_seq_tools
how to use
seq_tools is a Python package that contains a few functions for working with sequences in
dataframes. If the input is a single sequence, the results are printed. If the input is a CSV file, a new CSV file is
created with the results. The default output file is "output.csv"; this can be changed with the -o
flag.
$ seq_tools --help
Usage: seq_tools [OPTIONS] COMMAND [ARGS]...
a set scripts to manipulate sequences in csv files
Options:
--help Show this message and exit.
Commands:
add add a sequence to 5' and/or 3'
ec calculate the extinction coefficient for each sequence
edit-distance calculate the edit distance of a library
fold fold rna sequences
mw calculate the molecular weight for each sequence
rc calculate reverse complement for each sequence
to-dna convert rna sequence(s) to dna
to-dna-template convert rna sequence(s) to dna template, includes T7...
to-fasta generate fasta file from csv
to-opool generate oligo pool file from csv
to-rna convert rna sequence(s) to dna
transcribe convert dna sequence(s) to rna
trim trim 5'/3' ends of sequences
add
Adds a sequence to the 5' and/or 3' end of a sequence.
$ seq_tools add -p5 "AAAA" "GGGGUUUUCCCC"
SEQ_TOOLS.get_input_dataframe - INFO - reading sequence GGGGUUUUCCCC
SEQ_TOOLS.handle_output - INFO - output->
name seq
sequence AAAAGGGGUUUUCCCC
Name: 0, dtype: object
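The operation itself is plain string concatenation. The sketch below illustrates it in Python; `add_flanks` is a hypothetical helper written for this example and is not part of the seq_tools API.

```python
def add_flanks(seq, p5="", p3=""):
    """Return seq with p5 prepended to the 5' end and p3 appended to the 3' end."""
    return p5 + seq + p3


# Same input as the command-line example above.
print(add_flanks("GGGGUUUUCCCC", p5="AAAA"))  # AAAAGGGGUUUUCCCC

# Applying the helper to every sequence in a small library (hypothetical data).
library = ["GGGGUUUUCCCC", "GGAAACC"]
print([add_flanks(s, p5="AAAA", p3="UU") for s in library])
```

With sequences held in a pandas column, the same helper could be applied with `Series.apply`.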
ec
Calculate the extinction coefficient for each sequence.
$ seq_tools ec "GGGGUUUUCCCC"
SEQ_TOOLS.get_input_dataframe - INFO - reading sequence GGGGUUUUCCCC
SEQ_TOOLS.handle_ntype - INFO - determining nucleic acid type: RNA
SEQ_TOOLS.handle_output - INFO - output->
name seq
sequence GGGGUUUUCCCC
extinction_coeff 109500
Name: 0, dtype: object
edit-distance
Calculate the edit distance of a library, i.e. how different each sequence is, on average, from the rest of the library.
$ seq_tools edit-distance test/resources/test.csv
SEQ_TOOLS.edit_distance - INFO - edit distance: 17.666666666666668
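To make the averaging concrete, here is a minimal sketch that assumes the metric is the Levenshtein distance averaged over all pairs of sequences. That assumption is not confirmed by this README, and the tool may compute the value differently. The function names and the example library are hypothetical.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between two strings using a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def mean_pairwise_distance(seqs):
    """Average Levenshtein distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(levenshtein(a, b) for a, b in pairs) / len(pairs)


# Hypothetical three-sequence library.
library = ["GGGGUUUUCCCC", "GGGGAAAACCCC", "GGGGUUUUCCCA"]
print(mean_pairwise_distance(library))
```

Averaging over unordered pairs gives the same value as averaging each sequence's mean distance to every other sequence, so either reading of the description leads to the same number under this assumption.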
fold
Fold RNA sequences.
$ seq_tools fold "GGGGUUUUCCCC"
SEQ_TOOLS.get_input_dataframe - INFO - reading sequence GGGGUUUUCCCC
SEQ_TOOLS.handle_output - INFO - output->
name seq
sequence GGGGUUUUCCCC
structure ((((....))))
mfe -5.9
ens_defect 0.38
Name: 0, dtype: object
to-dna
Convert all sequences to DNA, i.e. replace U with T.
$ seq_tools to-dna "GGGGUUUUCCCC"
SEQ_TOOLS.get_input_dataframe - INFO - reading sequence GGGGUUUUCCCC
SEQ_TOOLS.to_dna - INFO - converted sequence: GGGGTTTTCCCC
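The conversion is a single character substitution, and its inverse gives the RNA form. A minimal sketch follows; the function names mirror the command names but are defined here for illustration, and whether seq_tools exposes functions with these names is an assumption.

```python
def to_dna(seq):
    """Convert an RNA sequence to DNA by replacing every U with T."""
    return seq.replace("U", "T")


def to_rna(seq):
    """Convert a DNA sequence to RNA by replacing every T with U."""
    return seq.replace("T", "U")


print(to_dna("GGGGUUUUCCCC"))  # GGGGTTTTCCCC
print(to_rna("GGGGTTTTCCCC"))  # GGGGUUUUCCCC
```

For a dataframe, either function could be applied to the sequence column with `Series.apply`.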
other non-command-line usage
structure representation
from seq_tools import SequenceStructure
struct = SequenceStructure("GGGGUUUUCCCC", "((((....))))")
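The second argument is a dot-bracket string in which matching parentheses mark paired positions and dots mark unpaired ones. The sketch below shows how such a string can be decoded into base pairs. It does not use `SequenceStructure` and says nothing about that class's methods. `base_pairs` is a hypothetical helper.

```python
def base_pairs(structure):
    """Return sorted 0-based (i, j) index pairs encoded by a dot-bracket string."""
    stack = []   # positions of '(' still waiting for a partner
    pairs = []
    for i, char in enumerate(structure):
        if char == "(":
            stack.append(i)
        elif char == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif char != ".":
            raise ValueError(f"unexpected character {char!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


sequence = "GGGGUUUUCCCC"
structure = "((((....))))"
for i, j in base_pairs(structure):
    # Print each paired position together with the two nucleotides involved.
    print(i, j, sequence[i] + sequence[j])
```

For the hairpin above, every pair joins a G in the 5' stem to a C in the 3' stem, and the four U positions remain unpaired.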