Cockatoo - intelligent clustering of metagenomic samples for coassembly

Cockatoo

Installation options

Install from pip

Install latest release via pip.

pip install cockatoo-genome

Install from source

Create conda env from cockatoo.yml and install from source.

git clone https://github.com/AroneyS/cockatoo.git
cd cockatoo
conda env create -f cockatoo.yml
conda activate cockatoo
pip install -e .

Cockatoo coassemble

Snakemake pipeline that discovers coassembly sample clusters based on the co-occurrence of single-copy marker genes, excluding genes present in reference genomes (e.g. previously recovered genomes). It builds a graph with samples as nodes and edges weighted by the number of overlapping sequences reported by SingleM. The sequences considered can be filtered by taxonomy to target a specific taxon (e.g. the phylum Planctomycetota). The graph is clustered using the Girvan-Newman algorithm to produce sample groupings, and Aviary assemble/recover commands are generated for the proposed coassemblies. Optionally, reads can be mapped to the matched bins so that only unmapped reads are assembled.
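To illustrate the clustering idea, here is a minimal, standard-library Python sketch of Girvan-Newman on a sample graph. It is not Cockatoo's implementation: the sample names, sequence identifiers and function names are hypothetical, the input is toy data rather than SingleM output, and the sketch stops at the first split. Edge weights (shared-sequence counts) are stored but, as an assumption for simplicity, betweenness is computed on the unweighted graph.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy input: sample -> set of marker-gene sequence identifiers.
sample_sequences = {
    "sample_1": {"seqA", "seqB", "seqC"},
    "sample_2": {"seqA", "seqB", "seqD"},
    "sample_3": {"seqE", "seqF"},
    "sample_4": {"seqD", "seqE", "seqF"},
}


def build_overlap_graph(sample_sequences):
    """Adjacency dict: sample -> {neighbour: number of shared sequences}."""
    graph = {sample: {} for sample in sample_sequences}
    for a, b in combinations(sorted(sample_sequences), 2):
        shared = len(sample_sequences[a] & sample_sequences[b])
        if shared:
            graph[a][b] = shared
            graph[b][a] = shared
    return graph


def connected_components(graph):
    """Sorted list of components, each a sorted list of samples."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        seen |= component
        components.append(sorted(component))
    return components


def edge_betweenness(graph):
    """Unweighted edge betweenness via breadth-first shortest-path counting."""
    scores = {}
    for source in graph:
        order, dist = [], {source: 0}
        preds = {node: [] for node in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from source
        sigma[source] = 1
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        # Walk back from the farthest nodes, passing path credit to edges.
        for w in reversed(order):
            for v in preds[w]:
                credit = sigma[v] / sigma[w] * (1 + delta[w])
                edge = tuple(sorted((v, w)))
                scores[edge] = scores.get(edge, 0.0) + credit
                delta[v] += credit
    return scores


def girvan_newman_first_split(graph):
    """Remove highest-betweenness edges until the component count grows."""
    graph = {node: dict(nbrs) for node, nbrs in graph.items()}  # work on a copy
    start = len(connected_components(graph))
    while len(connected_components(graph)) == start:
        scores = edge_betweenness(graph)
        if not scores:  # no edges left to remove
            break
        a, b = max(sorted(scores), key=scores.get)
        del graph[a][b]
        del graph[b][a]
    return connected_components(graph)


graph = build_overlap_graph(sample_sequences)
clusters = girvan_newman_first_split(graph)
print(clusters)
```

In this toy graph the only link between the two pairs of samples is the single sequence shared by sample_2 and sample_4, so that edge carries the most shortest paths and is removed first, leaving two candidate groupings.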

# Example: cluster reads into proposed coassemblies based on unbinned sequences
cockatoo coassemble --forward reads_1.1.fq ... --reverse reads_1.2.fq ... --genomes genome_1.fna ...

# Example: cluster reads into proposed coassemblies based on unbinned sequences and coassemble only unbinned reads
cockatoo coassemble --forward reads_1.1.fq ... --reverse reads_1.2.fq ... --genomes genome_1.fna ... --assemble-unmapped

# Example: cluster reads into proposed coassemblies based on unbinned sequences from a specific taxon
cockatoo coassemble --forward reads_1.1.fq ... --reverse reads_1.2.fq ... --genomes genome_1.fna ... --taxa-of-interest "p__Planctomycetota"

# Example: find relevant samples for differential coverage binning (no coassembly)
cockatoo coassemble --forward reads_1.1.fq ... --reverse reads_1.2.fq ... --single-assembly

Cockatoo evaluate

Evaluates the recovery of target genes by the coassemblies suggested by Cockatoo coassemble, counting the target genes present in the newly recovered genomes. Compares recovery by phylum and by single-copy marker gene.

# Example: evaluate a completed coassembly
cockatoo evaluate --coassemble-output coassemble_dir --aviary-outputs coassembly_0_dir ...

Cockatoo unmap

Applies unmapping to a previous Cockatoo coassemble run, generating files of unmapped reads and the corresponding Aviary commands.

# Example: generate unmapped reads and commands for completed coassembly
cockatoo unmap --coassemble-output coassemble_dir --forward reads_1.1.fq ... --reverse reads_1.2.fq ... --genomes genome_1.fna ...
