
Generate cov3 files used in DEMIC


pycov3


A package for generating cov3 files from SAM files, which provide coverage information, and a FASTA file of binned contigs. Cov3 files are the input to the DEMIC R package, which calculates the peak-to-trough ratio (PTR), an estimate of bacterial growth rate.

Installation

PyPI

pip install pycov3
pycov3 -h

Bioconda

conda create -n pycov3 -c conda-forge -c bioconda pycov3
conda activate pycov3
pycov3 -h

DockerHub

docker pull chopmicrobiome/pycov3:latest
docker run --rm --name pycov3 chopmicrobiome/pycov3:latest pycov3 -h

GitHub

git clone https://github.com/Ulthran/pycov3.git
cd pycov3/
pip install .
pycov3 -h

Usage

Use -h to see options for running the CLI.

$ pycov3 -h

The FASTAs should all be in one directory, named {sample}.{bin_name}.fasta (the extensions .fa and .fna are also accepted), and the SAMs should all be in one directory, named {sample}_{bin_name}.sam. The output COV3 files are written to a single directory and named {sample}.{bin_name}.cov3.
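
The naming scheme can be checked before a run with a few lines of Python. The sketch below is not part of pycov3; `split_fasta_name`, `sam_name` and `cov3_name` are hypothetical helpers, and splitting on the first dot assumes that sample names contain no `.` character.

```python
from pathlib import PurePath

# Extensions listed in the naming convention above
FASTA_EXTS = {".fasta", ".fa", ".fna"}


def split_fasta_name(filename):
    """Split '{sample}.{bin_name}.<ext>' into (sample, bin_name).

    Assumption: the sample name itself contains no '.' character,
    so the first dot separates sample from bin name.
    """
    path = PurePath(filename)
    if path.suffix not in FASTA_EXTS:
        raise ValueError(f"unexpected FASTA extension: {filename}")
    sample, sep, bin_name = path.stem.partition(".")
    if not sep or not sample or not bin_name:
        raise ValueError(f"expected {{sample}}.{{bin_name}}: {filename}")
    return sample, bin_name


def sam_name(sample, bin_name):
    """Name of the SAM file that pairs with a given sample and bin."""
    return f"{sample}_{bin_name}.sam"


def cov3_name(sample, bin_name):
    """Name of the COV3 file written for a given sample and bin."""
    return f"{sample}.{bin_name}.cov3"


sample, bin_name = split_fasta_name("s1.bin_001.fna")
expected_sam = sam_name(sample, bin_name)
expected_cov3 = cov3_name(sample, bin_name)
```

Building the SAM name from the FASTA name, rather than parsing the SAM name, avoids guessing where the underscore separator falls when bin names themselves contain underscores.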

You can also use the library in your own code. Create a SAM directory and FASTA directory, set any non-default window or coverage parameters, then create a COV3 directory and use it to generate a COV3 file for each contig set in the FASTA directory.

    from pathlib import Path

    from pycov3.Directory import Cov3Dir, FastaDir, SamDir

    sam_d = SamDir(Path("/path/to/sams/"), False)

    # Leave a value as None to keep the default for that parameter
    window_params = {
        "window_size": None,
        "window_step": None,
        "edge_length": sam_d.calculate_edge_length(),
    }
    coverage_params = {
        "mapq_cutoff": None,
        "mapl_cutoff": None,
        "max_mismatch_ratio": None,
    }
    # Drop unset (None) entries so only explicit overrides are passed on
    window_params = {k: v for k, v in window_params.items() if v is not None}
    coverage_params = {k: v for k, v in coverage_params.items() if v is not None}

    fasta_d = FastaDir(Path("/path/to/fastas/"), False)

    cov3_d = Cov3Dir(
        Path("/path/to/output/"),
        False,
        fasta_d.get_filenames(),
        window_params,
        coverage_params,
    )

    # Generate a COV3 file for each contig set in the FASTA directory
    cov3_d.generate(sam_d, fasta_d)
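
The dictionary comprehensions above drop every parameter left as None, so only explicit overrides are forwarded. A minimal standalone sketch of that pattern follows; `describe_windows` and its default values are placeholders invented for illustration, not pycov3's API or its real defaults.

```python
def drop_unset(params):
    """Return a copy of params without the entries whose value is None."""
    return {key: value for key, value in params.items() if value is not None}


def describe_windows(window_size=1000, window_step=10, edge_length=0):
    """Hypothetical consumer with placeholder defaults (not pycov3's)."""
    return {
        "window_size": window_size,
        "window_step": window_step,
        "edge_length": edge_length,
    }


# Only window_step is set, so it is the only key that survives the filter
overrides = drop_unset(
    {"window_size": None, "window_step": 50, "edge_length": None}
)
result = describe_windows(**overrides)
```

Filtering first means the caller never has to know the default values; passing None through instead would replace the defaults with None.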

Alternatively, to use the bare application logic and do all the file handling yourself, you can use the Cov3Generator class which takes a list of generators as SAM inputs and a generator as a FASTA input.

    from pathlib import Path

    from pycov3.Cov3Generator import Cov3Generator
    from pycov3.File import Cov3File

    # sam_generators, fasta_generator, sample, bin_name, window_params and
    # coverage_params come from your own file-handling code
    cov3_generator = Cov3Generator(
        sam_generators,
        fasta_generator,
        sample,
        bin_name,
        window_params,
        **coverage_params,
    )

    cov3_dict = cov3_generator.generate_cov3()

    # Write output
    cov3_file = Cov3File(Path("/path/to/output/"), "001")
    cov3_file.write_generator(cov3_generator.generate_cov3())
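
For readers new to generator-based input, the sketch below shows the general idea of streaming a FASTA file lazily instead of loading it whole. It is an illustration only: `iter_fasta_records` is a hypothetical helper, and the (header, sequence) tuples it yields are an assumption, not the item format that Cov3Generator expects. Check the pycov3 source for the real input types.

```python
import io


def iter_fasta_records(lines):
    """Lazily yield (header, sequence) pairs from an iterable of FASTA lines.

    Only one record is held in memory at a time.
    """
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the final record once the input is exhausted
    if header is not None:
        yield header, "".join(chunks)


# An in-memory stand-in for an open FASTA file
demo = io.StringIO(">contig_1\nACGT\nAC\n>contig_2\nGGC\n")
records = list(iter_fasta_records(demo))
```

Because a generator can be consumed only once, create a fresh one for each pass over the data.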

Resource Requirements

Threads: pycov3 uses multiprocessing to parallelize processing of input fastas. Increasing --thread_num up to the number of input fastas should improve runtime, with no benefits beyond that number.
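
One way to pick a value for --thread_num is to cap the requested count at the number of input FASTAs. The helper below is a hypothetical sketch, not part of pycov3; capping at the CPU count as well is an added assumption that the text above does not state.

```python
import os


def effective_threads(requested, n_fastas, cpu_count=None):
    """Cap a requested thread count at the number of FASTA files.

    Assumption: also capping at the CPU count is sensible; pycov3 itself
    only documents the FASTA-count limit.
    """
    if requested < 1:
        raise ValueError("requested must be at least 1")
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # Never return fewer than one thread, even with no FASTA files
    return max(1, min(requested, n_fastas, cpu_count))


threads = effective_threads(requested=16, n_fastas=4, cpu_count=8)
```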

Memory: pycov3 uses generators as much as possible. The main memory users are the Contig objects, which each hold a contig's sequence and information for each Window over its length. There is also a coverages dictionary that could potentially grow to the size of the largest contig (but that is very unlikely). At a minimum, twice the size of the largest contig should be given per thread.
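
The rule of thumb above can be turned into a rough lower bound. The sketch below is not part of pycov3 and assumes one byte per base, which is a placeholder: Python object overhead means real usage will be higher, so treat the result as a floor when requesting resources.

```python
def min_memory_bytes(contig_lengths, threads, bytes_per_base=1):
    """Rough floor: twice the largest contig, per thread.

    Assumption: one byte per base; real Python objects need more.
    """
    if not contig_lengths:
        return 0
    return 2 * max(contig_lengths) * bytes_per_base * threads


# Illustrative contig lengths in bases (made-up values)
lengths = [1_200_000, 4_500_000, 300_000]
floor_bytes = min_memory_bytes(lengths, threads=4)
```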

Algorithmic Complexity: Assuming enough threads are provided to have each fasta file processed separately, the time complexity is roughly O(cwsr).

  • c: number of contigs in the FASTA
  • s: number of SAM files
  • w: maximum number of windows per contig
  • r: maximum number of records per SAM file
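
As a worked example of how the O(cwsr) bound scales, the sketch below multiplies the four factors for made-up input sizes. The result is a unitless scale factor for comparing runs, not a prediction of runtime.

```python
def work_estimate(c, w, s, r):
    """Product of the four factors in the O(cwsr) bound."""
    return c * w * s * r


# Made-up sizes: 200 contigs, 50 windows each, 4 SAMs, 1M records per SAM
base = work_estimate(c=200, w=50, s=4, r=1_000_000)

# Doubling the number of SAM files doubles the estimate
doubled_sams = work_estimate(c=200, w=50, s=8, r=1_000_000)
```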

Help

Please use the Issues on this repo for any problems, questions, or suggestions.


