BioPython-Convert
Interconvert various file formats supported by BioPython.
Supports querying records with JMESPath.
Installation
pip install biopython-convert
or:
conda install biopython-convert
or:
git clone https://github.com/brinkmanlab/BioPython-Convert.git
cd BioPython-Convert
./setup.py install
Use
biopython.convert [-s] [-v] [-i] [-q JMESPath] input_file input_type output_file output_type
    -s Split records into separate files
    -q JMESPath to select records. Must return list of SeqIO records or mappings. Root is list of input SeqIO records.
    -i Print out details of records during conversion
    -v Print version and exit
- Supported formats
abi, abi-trim, ace, cif-atom, cif-seqres, clustal, embl, fasta, fasta-2line, fastq-sanger, fastq, fastq-solexa, fastq-illumina, genbank, gb, ig, imgt, nexus, pdb-seqres, pdb-atom, phd, phylip, pir, seqxml, sff, sff-trim, stockholm, swiss, tab, qual, uniprot-xml, gff3, txt, json, yaml
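When the converter is driven from a script, the synopsis above can be assembled as an argument list. The following is a minimal sketch, assuming the `biopython.convert` executable is installed and on the PATH; the helper name `build_convert_command` and the file names are hypothetical and not part of the package.

```python
import shlex


def build_convert_command(input_file, input_type, output_file, output_type,
                          query=None, split=False, info=False):
    """Assemble an argument list following the usage synopsis (hypothetical helper)."""
    cmd = ["biopython.convert"]
    if split:
        cmd.append("-s")  # split records into separate files
    if info:
        cmd.append("-i")  # print record details during conversion
    if query is not None:
        cmd += ["-q", query]  # JMESPath expression selecting records
    # Positional arguments, in the order given by the synopsis.
    cmd += [input_file, input_type, output_file, output_type]
    return cmd


# Hypothetical file names: keep only the first record of a GenBank file.
cmd = build_convert_command("input.gbk", "genbank", "output.fasta", "fasta", query="[0]")
print(shlex.join(cmd))  # shell-quoted form, safe to paste into a terminal

# To actually run it (requires biopython-convert to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with JMESPath expressions, which often contain brackets, quotes, and pipes.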
JMESPath
The root node for a query is a list of SeqRecord objects. The query can return a list containing a subset of these records, or mappings whose keys correspond to the constructor parameters of a SeqRecord object.
If the output format is txt, json, or yaml, the object returned by the JMESPath query is simply dumped in that format.
- Examples:
Append a new record:
[@, [{'seq': 'AAAA', 'name': 'my_new_record'}]] | []
Filter out any plasmids:
[?!(features[?type=='source'].qualifiers.plasmid)]
Keep only the first record:
[0]
Output taxonomy of each record (txt output):
[*].annotations.taxonomy
Output json object containing id and molecule type:
[*].{id: id, type: annotations.molecule_type}
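To make the semantics of three of the queries above concrete, the sketch below expresses roughly what they compute in plain Python. It is an illustration only: the tool evaluates JMESPath against SeqRecord objects, whereas here plain dictionaries stand in for records, and the sample data is invented.

```python
import json

# Invented sample data: plain dicts standing in for SeqRecord objects.
records = [
    {
        "id": "chromosome_1",
        "annotations": {"molecule_type": "DNA", "taxonomy": ["Bacteria", "Proteobacteria"]},
        "features": [{"type": "source", "qualifiers": {"organism": ["Example organism"]}}],
    },
    {
        "id": "plasmid_1",
        "annotations": {"molecule_type": "DNA", "taxonomy": ["Bacteria", "Proteobacteria"]},
        "features": [{"type": "source", "qualifiers": {"plasmid": ["pExample"]}}],
    },
]


def has_plasmid_source(record):
    """True if any 'source' feature carries a 'plasmid' qualifier."""
    return any(
        f.get("type") == "source" and f.get("qualifiers", {}).get("plasmid") is not None
        for f in record.get("features", [])
    )


# Roughly: [?!(features[?type=='source'].qualifiers.plasmid)]
non_plasmids = [r for r in records if not has_plasmid_source(r)]

# Roughly: [*].annotations.taxonomy
# (a JMESPath projection skips records where the value is missing)
taxonomies = [
    r["annotations"]["taxonomy"]
    for r in records
    if r.get("annotations", {}).get("taxonomy") is not None
]

# Roughly: [*].{id: id, type: annotations.molecule_type}
summary = [
    {"id": r.get("id"), "type": r.get("annotations", {}).get("molecule_type")}
    for r in records
]

print(json.dumps(summary, indent=2))  # the tool's own json formatting may differ
```

The first query returns a subset of the input records, so it suits any sequence output format; the other two return arbitrary structures and are meant for txt, json, or yaml output.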
See CONTRIBUTING.rst for information on contributing to this repo.