
Merge multiple GenBank files using a spacer sequence


Merge multiple GenBank records using a defined spacer sequence

A small script to turn multiple GenBank records (either in multiple files or in a single multi-record file) into a single record.

Sequences are merged by concatenating them in order, with a spacer sequence placed between consecutive records. The spacer length is given in kbp. The spacer can consist either of all Ns or of all-frame stop codons.
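
The merging described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code merge-gbk-records itself uses: the function names are hypothetical, and the repeated CTAG motif is just one possible all-frame stop sequence (it puts a TAG in every frame on both strands); the tool may well use a different one. The offsets returned by merge_sequences show how far the features of each record would need to be shifted in the merged record.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def make_spacer(length_kbp=20, kind="n"):
    """Build a spacer of length_kbp * 1000 bases (hypothetical helper)."""
    length = int(length_kbp) * 1000
    if kind == "n":
        return "N" * length
    if kind == "stop":
        # Assumption: a CTAG repeat, chosen for this sketch only.
        motif = "CTAG"
        return (motif * (length // len(motif) + 1))[:length]
    raise ValueError(f"unknown spacer kind: {kind!r}")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def frames_with_stop(seq):
    """Return which of the six reading frames contain a stop codon."""
    found = set()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            codons = (s[i:i + 3] for i in range(frame, len(s) - 2, 3))
            if any(codon in STOP_CODONS for codon in codons):
                found.add(f"{strand}{frame}")
    return found


def merge_sequences(seqs, spacer):
    """Concatenate seqs in order with spacer between them.

    Also returns the start offset of each input sequence in the merged
    sequence, i.e. the shift to apply to that record's feature locations.
    """
    offsets = []
    position = 0
    for seq in seqs:
        offsets.append(position)
        position += len(seq) + len(spacer)
    return spacer.join(seqs), offsets


merged, offsets = merge_sequences(["ATGC", "GGCCTT"], make_spacer(1, "n"))
```

With a 1 kbp spacer, the second sequence starts at offset 1004 (4 bases of the first sequence plus 1000 bases of spacer).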

Installation

pip install merge-gbk-records

Alternatively, clone this repository from GitHub, then run (in a Python virtual environment):

pip install .

If this fails on older versions of Python, try updating your pip tool first:

pip install --upgrade pip

and then rerun the merge-gbk-records install.

merge-gbk-records is only developed and tested on Python releases still under active support by the Python project. At the moment, this means versions 2.7, 3.3, 3.4, 3.5 and 3.6. In particular, no testing is done on Python versions older than 2.7 or 3.3.

If your system is stuck on an older version of Python, consider using a tool like Homebrew or Linuxbrew to obtain a more up-to-date version.

Usage

By default, merge-gbk-records adds a 20 kbp spacer of all Ns and writes the merged record to stdout, so redirect the output to save it:

merge-gbk-records first.gbk second.gbk > merged.gbk

You can set a different spacer length (in kbp) using -l or --length. To use a 5 kbp spacer, use:

merge-gbk-records --length 5 first.gbk second.gbk > merged.gbk

You can select an all-frame stop codon spacer instead using -s stop or --spacer stop:

merge-gbk-records --spacer stop first.gbk second.gbk > merged.gbk

Instead of writing to stdout, you can also write to a file using -o or --outfile:

merge-gbk-records --outfile merged.gbk first.gbk second.gbk
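
When calling the tool from a Python pipeline, the options shown above can be assembled into an argument list. The following is a minimal sketch: build_merge_command and run_merge are hypothetical helper names, and only the --length, --spacer and --outfile flags documented above are used. Running the command requires merge-gbk-records to be installed and available on PATH.

```python
import subprocess


def build_merge_command(inputs, length_kbp=None, spacer=None, outfile=None):
    """Assemble a merge-gbk-records command line (hypothetical helper)."""
    if not inputs:
        raise ValueError("at least one GenBank file is required")
    cmd = ["merge-gbk-records"]
    if length_kbp is not None:
        cmd += ["--length", str(length_kbp)]
    if spacer is not None:
        cmd += ["--spacer", spacer]
    if outfile is not None:
        cmd += ["--outfile", str(outfile)]
    # Input files go last, in the order they should be merged.
    cmd += [str(path) for path in inputs]
    return cmd


def run_merge(inputs, **options):
    """Run the command; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_merge_command(inputs, **options), check=True)
```

For example, build_merge_command(["first.gbk", "second.gbk"], length_kbp=5, outfile="merged.gbk") produces the same invocation as combining the --length and --outfile examples above.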

To print help about the command, just run it with -h or --help:

merge-gbk-records --help

License

All code is available under the Apache License, version 2.0; see the LICENSE file for details.
