simple functions for manipulating sequences and secondary structures in pandas dataframe format
seq_tools
a short python tool for working with sequences in dataframes
how to install
pip install rna_seq_tools
how to use
seq_tools is a Python package that contains a few functions for working with sequences in dataframes. If the input is a single sequence, the results are printed. If the input is a CSV file, a new CSV file is created with the results. The default output file is "output.csv", which can be changed with the -o flag.
$ seq_tools --help
Usage: seq_tools [OPTIONS] COMMAND [ARGS]...
a set scripts to manipulate sequences in csv files
Options:
--help Show this message and exit.
Commands:
add add a sequence to 5' and/or 3'
ec calculate the extinction coefficient for each sequence
edit-distance calculate the edit distance of a library
fold fold rna sequences
mw calculate the molecular weight for each sequence
rc calculate reverse complement for each sequence
to-dna convert rna sequence(s) to dna
to-dna-template convert rna sequence(s) to dna template, includes T7...
to-fasta generate fasta file from csv
to-opool generate oligo pool file from csv
to-rna convert rna sequence(s) to dna
transcribe convert dna sequence(s) to rna
trim trim 5'/3' ends of sequences
add
Adds a sequence to the 5' and/or 3' end of a sequence.
$ seq_tools add -p5 "AAAA" "GGGGUUUUCCCC"
SEQ_TOOLS.get_input_dataframe - INFO - reading sequence GGGGUUUUCCCC
SEQ_TOOLS.handle_output - INFO - output->
name seq
sequence AAAAGGGGUUUUCCCC
Name: 0, dtype: object
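The operation above amounts to string concatenation. A minimal Python sketch, assuming nothing about the package internals (the function name `add_flanks` and its parameters are hypothetical, not part of seq_tools):

```python
def add_flanks(seq: str, p5: str = "", p3: str = "") -> str:
    """Return seq with p5 prepended at the 5' end and p3 appended at the 3' end.

    Hypothetical helper for illustration only; not the seq_tools API.
    """
    return p5 + seq + p3


# Same input as the command-line example: add "AAAA" to the 5' end.
print(add_flanks("GGGGUUUUCCCC", p5="AAAA"))
```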
ec
Calculate the extinction coefficient for each sequence.
$ seq-tools ec "GGGGUUUUCCCC"
SEQ_TOOLS.get_input_dataframe - INFO - reading sequence GGGGUUUUCCCC
SEQ_TOOLS.handle_ntype - INFO - determining nucleic acid type: RNA
SEQ_TOOLS.handle_output - INFO - output->
name seq
sequence GGGGUUUUCCCC
extinction_coeff 109500
Name: 0, dtype: object
edit-distance
Calculate the edit distance of a library, i.e. how different, on average, each sequence is from the rest of the library.
$ seq-tools edit-distance test/resources/test.csv
SEQ_TOOLS.edit_distance - INFO - edit distance: 17.666666666666668
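How the package defines and averages the distance is not stated here. As a rough illustration, the sketch below assumes Levenshtein distance averaged over all unordered pairs of sequences. Both function names are hypothetical and the real implementation may differ.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def mean_pairwise_distance(seqs) -> float:
    """Average Levenshtein distance over all unordered pairs (assumed definition)."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(levenshtein(a, b) for a, b in pairs) / len(pairs)


print(mean_pairwise_distance(["AAAA", "AAAU", "AAUU"]))
```

The quadratic number of pairs makes this sketch suitable only for small libraries.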
fold
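Fold RNA sequences. The output includes the predicted structure in dot-bracket notation, the minimum free energy (mfe), and the ensemble defect (ens_defect).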
$ seq-tools fold "GGGGUUUUCCCC"
SEQ_TOOLS.get_input_dataframe - INFO - reading sequence GGGGUUUUCCCC
SEQ_TOOLS.handle_output - INFO - output->
name seq
sequence GGGGUUUUCCCC
structure ((((....))))
mfe -5.9
ens_defect 0.38
Name: 0, dtype: object
to-dna
Convert all sequences to DNA, i.e. replace U with T.
$ seq_tools to-dna "GGGGUUUUCCCC"
SEQ_TOOLS.get_input_dataframe - INFO - reading sequence GGGGUUUUCCCC
SEQ_TOOLS.to_dna - INFO - converted sequence: GGGGTTTTCCCC
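The conversion shown in the log above is a single character substitution. A minimal sketch, assuming uppercase input (the function names are hypothetical and only mirror the command names):

```python
def to_dna(seq: str) -> str:
    """Convert an RNA sequence to DNA by replacing every U with T."""
    return seq.replace("U", "T")


def to_rna(seq: str) -> str:
    """Convert a DNA sequence to RNA by replacing every T with U."""
    return seq.replace("T", "U")


print(to_dna("GGGGUUUUCCCC"))
```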
other non-command-line usage
structure representation
from seq_tools import SequenceStructure
struct = SequenceStructure("GGGGUUUUCCCC", "((((....))))")
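The second argument is a secondary structure in dot-bracket notation, where matching parentheses mark paired positions and dots mark unpaired ones. The methods of SequenceStructure are not documented here, so the sketch below does not use it. It is a standalone, hypothetical helper showing how such a string can be turned into a table of base pairs.

```python
def pair_table(structure: str) -> dict:
    """Map each paired position (0-based) to its partner in a dot-bracket string.

    Hypothetical helper for illustration; not part of the seq_tools API.
    """
    stack = []  # indices of "(" still waiting for a partner
    pairs = {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            pairs[i] = j
            pairs[j] = i
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


pairs = pair_table("((((....))))")
print(pairs[0], pairs[3])  # partners of the first and fourth positions
```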
Hashes for rna_seq_tools-0.6.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | d0f93f1a1bce0e2433ded8e582d51203bacc86c3cc7eaeb2d5fa5d3a9d65db1d
MD5 | b26d6c385f7acca8cc6b76b6ba7c7793
BLAKE2b-256 | d9f2c621fcfe73b5b02a5cfb40a97be95feb6593843208c6088e143a7d97116b