Utility package to parse multi fasta files resulting from de novo assembly

Contig Tools

Installation

pip3 install contig-tools

source code: https://gitlab.com/antunderwood/contig_tools

Usage

usage: contig-tools [-h] [-v] {filter,metrics,check_metrics,co_located} ...

    A package to manipulate and assess contigs arising from de novo assemblies


positional arguments:
  {filter,metrics,check_metrics,co_located}
                        The following commands are available. Type
                        contig_tools <COMMAND> -h for more help on a specific
                        commands
    filter              Filter contigs based on either length and/or coverage
    metrics             Print contig metrics
    check_metrics       check contig metrics
    co_located          check to see if two or more loci are found on the same
                        contig.

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         display the version number

Examples

filter contigs

contig-tools filter -l 500 -c 3 -f contigs.fasta
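
As a rough illustration of what a length-and-coverage filter involves, the Python sketch below keeps contigs that meet both thresholds. It is not the package's implementation. The helper names (`read_fasta`, `filter_contigs`) are hypothetical, and it assumes that `-l` and `-c` are a minimum length and a minimum coverage, and that coverage is encoded in SPAdes-style headers (`..._cov_<value>`).

```python
import re

def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

# Assumption: coverage is stored in the header as "_cov_<number>" (SPAdes style).
COV_PATTERN = re.compile(r"_cov_(\d+(?:\.\d+)?)")

def filter_contigs(records, min_length=500, min_coverage=3.0):
    """Keep contigs that are long enough and, when coverage is known, deep enough."""
    kept = []
    for header, seq in records:
        if len(seq) < min_length:
            continue
        match = COV_PATTERN.search(header)
        # Contigs without a parsable coverage value are kept on length alone.
        if match and float(match.group(1)) < min_coverage:
            continue
        kept.append((header, seq))
    return kept
```

Keeping contigs whose header has no coverage value is a design choice made for this sketch; a stricter filter could reject them instead.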

print contig metrics

contig-tools metrics -f contig_tools/tests/test_data/contigs_for_checks.fas
contig-tools metrics -f contig_tools/tests/test_data/contigs_for_checks.fas -o json

check whether contigs meet conditions encoded in a YAML file

example yaml file

N50 score:
  condition_type: gt
  condition_value: 10
Largest contig:
  condition_type: gt
  condition_value: 15
Total length:
  condition_type: lt_gt
  condition_value:
    - 100
    - 50

example command

contig-tools check_metrics -f contigs.fasta -y conditions.yml

metrics that can be checked are

  • Number of contigs
  • Number of contigs > 500bp
  • Total length
  • %GC
  • Largest contig
  • N50 score
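
For readers unfamiliar with these metrics, the following Python sketch shows one common way to compute them from a list of contig sequences. The function name `contig_metrics` is hypothetical, and the details (for example whether ambiguous bases count towards %GC, or how ties are handled in N50) are assumptions that may differ from what the package does.

```python
def contig_metrics(sequences, min_length=500):
    """Return summary metrics for a list of contig sequences (strings)."""
    lengths = sorted((len(s) for s in sequences), reverse=True)
    total = sum(lengths)
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in sequences)

    # N50: length of the contig at which the running total, taken from the
    # longest contig downwards, first reaches half of the total assembly length.
    n50, running = 0, 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    return {
        "Number of contigs": len(lengths),
        f"Number of contigs > {min_length}bp": sum(1 for n in lengths if n > min_length),
        "Total length": total,
        "%GC": round(100 * gc / total, 2) if total else 0.0,
        "Largest contig": lengths[0] if lengths else 0,
        "N50 score": n50,
    }
```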

conditions that can be used are

  • gt => greater than
  • lt => less than
  • lt_gt => less than and greater than
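
The sketch below shows how conditions of this shape could be evaluated against a set of metrics. It is an illustration only: `check_condition` and `check_metrics` are hypothetical names, the metric values are made up, and the reading of `lt_gt` as "less than the first value and greater than the second" is an assumption based on the example YAML file above.

```python
# Assumption: "lt_gt" takes [upper, lower] and passes when lower < value < upper.
def check_condition(value, condition_type, condition_value):
    if condition_type == "gt":
        return value > condition_value
    if condition_type == "lt":
        return value < condition_value
    if condition_type == "lt_gt":
        upper, lower = condition_value
        return lower < value < upper
    raise ValueError(f"unknown condition type: {condition_type}")

def check_metrics(metrics, conditions):
    """Return {metric name: True/False} for every metric named in conditions."""
    return {
        name: check_condition(metrics[name], rule["condition_type"], rule["condition_value"])
        for name, rule in conditions.items()
    }

# The example YAML file above, written as the dictionary a YAML parser would produce.
conditions = {
    "N50 score": {"condition_type": "gt", "condition_value": 10},
    "Largest contig": {"condition_type": "gt", "condition_value": 15},
    "Total length": {"condition_type": "lt_gt", "condition_value": [100, 50]},
}
metrics = {"N50 score": 12, "Largest contig": 14, "Total length": 75}  # made-up values
results = check_metrics(metrics, conditions)
```

With these made-up values, the N50 and total length conditions pass and the largest contig condition fails.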

check if two or more loci are co-located

Make a FASTA query file containing the two or more loci you want to test for co-location, e.g.

>gene1
GCAGCTAGCGACTGCGAC.....
>gene2
CTACGTAGGACACGACTA....

There are two options

  1. Search a single genome file for the co-location of loci

    contig-tools co_located -q queries.fas -f /path/to/single/genome/contigs.fas
    

or

  2. Search a list of genomes for the co-location of loci. Make a text file with paths to the genomes, e.g.

    /path/to/single/genome1.fas
    /path/to/single/genome2.fas
    ....
    

    and then run the command

    contig-tools co_located -q queries.fas -l /path/to/single/genome_list_file.txt
    

    If the computer you are running this on has multiple cores, you can run the search in parallel using -n <NUMBER PARALLEL PROCESSES>.

    If you only want to write out genomes where the queries are co-located, use the -y option.
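
To make the idea of co-location concrete, here is a minimal Python sketch that reports which contigs carry every query locus. It uses exact string matching on both strands, which is a deliberate simplification chosen for this example; the package itself may well use an alignment-based search. All function names are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(str.maketrans("ACGTNacgtn", "TGCANtgcan"))[::-1]

def contigs_containing(locus, contigs):
    """Names of contigs that contain the locus on either strand (exact match)."""
    forward = locus.upper()
    reverse = reverse_complement(forward)
    return {
        name for name, seq in contigs.items()
        if forward in seq.upper() or reverse in seq.upper()
    }

def co_located(loci, contigs):
    """Return the set of contigs that carry every locus; empty if none does."""
    if not loci:
        return set()
    hits = [contigs_containing(seq, contigs) for seq in loci.values()]
    return set.intersection(*hits)
```

Here `loci` and `contigs` are dictionaries mapping a name to a sequence, for instance as read from the query file and a genome file. A non-empty result means the loci are found together on at least one contig.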

code

The source code is available at https://gitlab.com/antunderwood/contig_tools
