A script for parsing GenBank files
read_genbank
A module to read and parse the features of a GenBank file.
To output a tab-separated file:
$ read_genbank.py tests/phiX174.gbk
'phiX174' 'CDS' (('100', '627'),) {'gene': 'G'}
'phiX174' 'CDS' (('636', '1622'),) {'gene': 'H'}
'phiX174' 'CDS' (('1659', '3227'),) {'gene': 'A'}
'phiX174' 'CDS' (('2780', '3142'),) {'gene': 'B'}
'phiX174' 'CDS' (('3142', '3312'),) {'gene': 'K'}
'phiX174' 'CDS' (('3224', '3484'),) {'gene': 'C'}
'phiX174' 'CDS' (('3481', '3939'),) {'gene': 'D'}
'phiX174' 'CDS' (('3659', '3934'),) {'gene': 'E'}
'phiX174' 'CDS' (('3939', '4055'),) {'gene': 'J'}
'phiX174' 'CDS' (('4092', '5375'),) {'gene': 'F'}
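The rows above can be loaded back into Python for downstream processing. The following is a minimal sketch, not part of read_genbank itself: the helper name `parse_feature_line` is hypothetical, and it assumes that the four fields are separated by tabs, that each field is printed as a Python literal, and that the coordinates are plain integers.

```python
import ast


def parse_feature_line(line):
    """Split one output row into (locus, feature type, spans, qualifiers)."""
    # Assumption: fields are tab-separated and each one is a Python literal.
    fields = line.rstrip("\n").split("\t")
    locus, kind, pairs, tags = (ast.literal_eval(field) for field in fields)
    # Coordinates are printed as strings, so convert them to integers.
    spans = [(int(start), int(end)) for start, end in pairs]
    return locus, kind, spans, tags


line = "'phiX174'\t'CDS'\t(('100', '627'),)\t{'gene': 'G'}"
locus, kind, spans, tags = parse_feature_line(line)
print(locus, kind, spans, tags["gene"])
```

Using `ast.literal_eval` rather than `eval` keeps the parsing restricted to literals, which is the safer choice when reading text produced by another program.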
To output the amino-acid translations in FASTA format:
$ read_genbank.py tests/phiX174.gbk -f faa
>phiX174[100..627]
MLQTFISRHNSNFFSDKLVLTSVTPASSAPVLQTPKATSSTLYFDSLTVNAGNGGFLHCIQMDTSVNAANQVVSVGADIAFDADPKFFACLVRFESSSVPTTLPTAYDVYPLNGRHDGGYYTVKDCVTIDVLPRTPGNNVYVGFMVWSNFTATKCRGLVSLNQVIKEIICLQPLK
>phiX174[636..1622]
MFGAIAGGIASALAGGAMSKLFGGGQKAASGGIQGDVLATDNNTVGMGDAGIKSAIQGSNVPNPDEAAPSFVSGAMAKAGKGLLEGTLQAGTSAVSDKLLDLVGLGGKSAADKGKDTRDYLAAAFPELNAWERAGADASSAGMVDAGFENQKELTKMQLDNQKEIAEMQNETQKEIAGIQSATSRQNTKDQVYAQNEMLAYQQKESTARVASIMENTNLSKQQQVSEIMRQMLTQAQTAGQYFTNDQIKEMTRKVSAEVDLVHQQTQNQRYGSSHIGATAKDISNVVTDAASGVVDIFHGIDKAVADTWNNFWKDGKADGIGSNLSRK
>phiX174[1659..3227]
MPPNLGGFFMVRSYYPSECHADYFDFERIEALKPAIEACGISTLSQSPMLGFHKQMDNRIKLLEEILSFRMQGVEFDNGDMYVDGHKAASDVRDEFVSVTEKLMDELAQCYNVLPQLDINNTIDHRPEGDEKWFLENEKTVTQFCRKLAAERPLKDIRDEYNYPKKKGIKDECSRLLEASTMKSRRGFAIQRLMNAMRQAHADGWFIVFDTLTLADDRLEAFYDNPNALRDYFRDIGRMVLAAEGRKANDSHADCYQYFCVPEYGTANGRLHFHAVHFMRTLPTGSVDPNFGRRVRNRRQLNSLQNTWPYGYSMPIAVRYTQDAFSRSGWLWPVDAKGEPLKATSYMAVGFYVAKYVNKKSDMDLAAKGLGAKEWNNSLKTKLSLLPKKLFRIRMSRNFGMKMLTMTNLSTECLIQLTKLGYDATPFNQILKQNAKREMRLRLGKVTVADVLAAQPVTTNLLKFMRASIKMIGVSNLQSFIASMTQKLTLSDISDESKNYLDKAGITTACLRIKSKWTAGGK
>phiX174[2780..3142]
MEQLTKNQAVATSQEAVQNQNEPQLRDENAHNDKSVHGVLNPTYQAGLRRDAVQPDIEAERKKRDEIEAGKSYCSRRFGGATCDDKSAQIYARFDKNDWRIQPAEFYRFHDAEVNTFGYF
>phiX174[3142..3312]
MSRKIILIKQELLLLVYELNRSGLLAENEKIRPILAQLEKLLLCDLSPSTNDSVKN
>phiX174[3224..3484]
MRKFDLSLRSSRSSYFATFRHQLTILSKTDALDEEKWLNMLGTFVKDWFRYESHFVHGRDSLVDILKERGLLSESDAVQPLIGKKS
>phiX174[3481..3939]
MSQVTEQSVRFQTALASIKLIQASAVLDLTEDDFDFLTSNKVWIATDRSRARRCVEACVYGTLDFVGYPRFPAPVEFIAAVIAYYVHPVNIQTACLIMEGAEFTENIINGVERPVKAAELFAFTLRVRAGNTDVLTDAEENVRQKLRAEGVM
>phiX174[3659..3934]
MVRWTLWDTLAFLLLLSLLLPSLLIMFIPSTFKRPVSSWKALNLRKTLLMASSVRLKPLNCSRLPCVYAQETLTFLLTQKKTCVKNYVRKE
>phiX174[3939..4055]
MSKGKKRSGARPGRPQPLRGTKGKRKGARLWYVGGQQF
>phiX174[4092..5375]
MSNIQTGAERMPHDLSHLGFLAGQIGRLITISTTPVIAGDSFEMDAVGALRLSPLRRGLAIDSTVDIFTFYVPHRHVYGEQWIKFMKDGVNATPLPTVNTTGYIDHAAFLGTINPDTNKIPKHLFQGYLNIYNNYFKAPWMPDRTEANPNELNQDDARYGFRCCHLKNIWTAPLPPETELSRQMTTSTTSIDIMGLQAAYANLHTDQERDYFMQRYHDVISSFGGKTSYDADNRPLLVMRSNLWASGYDVDGTDQTSLGQFSGRVQQTYKHSVPRFFVPEHGTMFTLALVRFPPTATKEIQYLNAKGALTYTDIAGDPVLYGNLPPREISMKDVFRSGDSSKKFKIAEGQWYRYAPSYVSPAYHLLEGFPFIQEPPSGDLQERVLIRHHDYDQCFQSVQLLQWNSQVKFNVTVYRNLPTTRDSIMTS
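The FASTA headers above carry the locus name and the feature coordinates, so the output can be read back and sanity-checked. The sketch below is an illustration only: `read_faa` and the `HEADER` pattern are hypothetical names, and it assumes every header has the single-span form `>name[start..end]` shown in this example.

```python
import re

# Assumption: headers look like ">phiX174[100..627]" (one span per record).
HEADER = re.compile(r"^>(?P<locus>[^\[]+)\[(?P<start>\d+)\.\.(?P<end>\d+)\]$")


def read_faa(lines):
    """Yield (locus, start, end, protein) for each FASTA record."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            if header is not None:
                yield (*header, "".join(chunks))
            match = HEADER.match(line)
            if match is None:
                raise ValueError(f"unexpected header: {line!r}")
            header = (match["locus"], int(match["start"]), int(match["end"]))
            chunks = []
        elif line:
            # Sequence text may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield (*header, "".join(chunks))


example = [">phiX174[3939..4055]", "MSKGKKRSGARPGRPQPLRGTKGKRKGARLWYVGGQQF"]
for locus, start, end, protein in read_faa(example):
    # GenBank coordinates are 1-based and inclusive.
    codons = (end - start + 1) // 3
    print(locus, start, end, len(protein), codons)
```

For this record the span covers 39 codons while the protein has 38 residues, which is consistent with the span including the stop codon.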