License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

clinker

Note: the CAGECAT webserver, where both cblaster and clinker can be used without installation, is temporarily down.

Gene cluster comparison figure generator

What is it?

clinker is a pipeline for easily generating publication-quality gene cluster comparison figures.

Given a set of GenBank files, clinker will automatically extract protein translations, perform global alignments between sequences in each cluster, determine the optimal display order based on cluster similarity, and generate an interactive visualisation (using clustermap.js) that can be extensively tweaked before being exported as an SVG file.

A note on scope:

clinker was designed primarily as a simple way to visualise groups of homologous biosynthetic gene clusters, which are typically small genomic regions containing few genes (as in the example GIF). It performs pairwise alignments of all genes in all input files using the aligner built into BioPython, then generates an interactive SVG document in the browser. The alignment stage scales very poorly to multiple genomes with many genes, and the resulting visualisation will also be very slow because of the number of SVG elements it contains. If you are looking to align entire genomes, you will likely be better served by tools built for that purpose (e.g. Cactus).
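To get a feel for how quickly the alignment stage grows, the number of gene-gene alignments can be estimated from gene counts alone. The sketch below assumes that every gene in one cluster is aligned against every gene in the other, for each unique pair of input clusters. The function name count_alignments and the example gene counts are illustrative and not part of clinker.

```python
from itertools import combinations


def count_alignments(genes_per_cluster):
    """Estimate the number of pairwise gene alignments.

    Assumes every gene in one cluster is aligned against every gene in
    the other, for each unique pair of clusters (hypothetical helper,
    not part of clinker).
    """
    return sum(a * b for a, b in combinations(genes_per_cluster, 2))


# Five small gene clusters with 20 genes each (illustrative numbers).
print(count_alignments([20] * 5))

# Five whole genomes with 4000 genes each (illustrative numbers).
print(count_alignments([4000] * 5))
```

Under these assumptions the first case needs 4,000 alignments and the second 160,000,000, which is why whole-genome input is better handled by dedicated tools.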

clinker visualisation demo

What we do

This fork fixes encoding errors that occur when running clinker in Windows environments.

Installation

clinker can be installed directly through pip:

pip install clinker-windows

Or by cloning the source code from GitHub:

git clone https://github.com/xianyu-123/clinker-windows.git
cd clinker-windows
pip install .

Citation

If you found clinker useful, please cite:

clinker & clustermap.js: Automatic generation of gene cluster comparison figures.
Gilchrist, C.L.M., Chooi, Y.-H., 2020.
Bioinformatics. doi: https://doi.org/10.1093/bioinformatics/btab007

Usage

Running clinker can be as simple as:

clinker clusters/*.gbk

This will read in all GenBank files inside the folder, align them, and print the alignments to the terminal. To generate the visualisation, use the -p/--plot argument:

clinker clusters/*.gbk -p <optional: file name to save static HTML>
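Some Windows shells, cmd.exe for example, do not expand wildcards such as *.gbk before handing them to a program. The text here does not say whether clinker expands the pattern itself, so a cautious option is to expand it in Python and pass the file list explicitly. The following is a minimal sketch assuming clinker is on the PATH. The helper name build_clinker_command and the folder and file names are hypothetical. Only the -p/--plot argument is taken from the usage above.

```python
import glob
import os


def build_clinker_command(folder, plot_path=None, pattern="*.gbk"):
    """Return an argument list for running clinker on files in `folder`.

    Hypothetical helper: expands the wildcard in Python rather than
    relying on the shell, then appends -p and an optional HTML path.
    """
    files = sorted(glob.glob(os.path.join(folder, pattern)))
    if not files:
        raise FileNotFoundError(f"no files matching {pattern} in {folder}")
    command = ["clinker", *files, "-p"]
    if plot_path is not None:
        command.append(plot_path)
    return command
```

The returned list can be passed to subprocess.run(command, check=True), which avoids shell quoting and wildcard handling altogether.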

clinker can also parse GFF3 files:

clinker cluster1.gff3 cluster2.gff3 -p

Note: a corresponding FASTA file of the same name (extensions ".fa", ".fsa", ".fna", ".fasta" or ".faa") must be found in the same directory as the GFF3, i.e. cluster1.fa and cluster2.fa.
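Before running clinker on GFF3 input, it can help to check that each file has its companion FASTA. The sketch below looks for a file with the same name and one of the extensions listed in the note. The function name find_companion_fasta is hypothetical, and clinker's own lookup may differ, for example in the order in which extensions are tried.

```python
from pathlib import Path

# Extensions listed in the note above.
FASTA_EXTENSIONS = (".fa", ".fsa", ".fna", ".fasta", ".faa")


def find_companion_fasta(gff_path):
    """Return the FASTA file sharing the GFF3 file's name, or None.

    Hypothetical pre-flight check, not clinker's own implementation.
    """
    gff = Path(gff_path)
    for extension in FASTA_EXTENSIONS:
        candidate = gff.with_suffix(extension)
        if candidate.is_file():
            return candidate
    return None


# Report any GFF3 file that has no FASTA next to it.
for gff_file in ["cluster1.gff3", "cluster2.gff3"]:
    if find_companion_fasta(gff_file) is None:
        print(f"{gff_file}: no companion FASTA found")
```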

See -h/--help for more information:

usage: clinker [-h] [--version] [-r RANGES [RANGES ...]] [-gf GENE_FUNCTIONS] [-na] [-i IDENTITY] [-j JOBS] [-s SESSION] [-ji JSON_INDENT] [-f] [-o OUTPUT] [-p [PLOT]] [-dl DELIMITER] [-dc DECIMALS] [-hl] [-ha] [-mo MATRIX_OUT] [-ufo] [files ...]

clinker: Automatic creation of publication-ready gene cluster comparison figures.

clinker generates gene cluster comparison figures from GenBank files. It performs pairwise local or global alignments between every sequence in every unique pair of clusters and generates interactive, to-scale comparison figures using the clustermap.js library.

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

Input options:
  files                 Gene cluster GenBank files
  -r RANGES [RANGES ...], --ranges RANGES [RANGES ...]
                        Scaffold extraction ranges. If a range is specified, only features within the range will be extracted from the scaffold. Ranges should be formatted like: scaffold:start-end (e.g. scaffold_1:15000-40000)
  -gf GENE_FUNCTIONS, --gene_functions GENE_FUNCTIONS
                        2-column CSV file containing gene functions, used to build gene groups from same function instead of sequence similarity (e.g. GENE_001,PKS-NRPS).

Alignment options:
  -na, --no_align       Do not align clusters
  -i IDENTITY, --identity IDENTITY
                        Minimum alignment sequence identity [default: 0.3]
  -j JOBS, --jobs JOBS  Number of alignments to run in parallel (0 to use the number of CPUs) [default: 0]

Output options:
  -s SESSION, --session SESSION
                        Path to clinker session
  -ji JSON_INDENT, --json_indent JSON_INDENT
                        Number of spaces to indent JSON [default: none]
  -f, --force           Overwrite previous output file
  -o OUTPUT, --output OUTPUT
                        Save alignments to file
  -p [PLOT], --plot [PLOT]
                        Plot cluster alignments using clustermap.js. If a path is given, clinker will generate a portable HTML file at that path. Otherwise, the plot will be served dynamically using Python's HTTP server.
  -dl DELIMITER, --delimiter DELIMITER
                        Character to delimit output by [default: human readable]
  -dc DECIMALS, --decimals DECIMALS
                        Number of decimal places in output [default: 2]
  -hl, --hide_link_headers
                        Hide alignment column headers
  -ha, --hide_aln_headers
                        Hide alignment cluster name headers
  -mo MATRIX_OUT, --matrix_out MATRIX_OUT
                        Save cluster similarity matrix to file

Visualisation options:
  -ufo, --use_file_order
                        Display clusters in order of input files

Example usage
-------------
Align clusters, plot results and print scores to screen:
  $ clinker files/*.gbk

Only save gene-gene links when identity is over 50%:
  $ clinker files/*.gbk -i 0.5

Save an alignment session for later:
  $ clinker files/*.gbk -s session.json

Save alignments to file, in comma-delimited format, with 4 decimal places:
  $ clinker files/*.gbk -o alignments.csv -dl "," -dc 4

Generate visualisation:
  $ clinker files/*.gbk -p

Save visualisation as a static HTML document:
  $ clinker files/*.gbk -p plot.html

Cameron Gilchrist, 2020
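The scaffold:start-end format expected by -r/--ranges can be checked before it is passed to clinker. This is an illustrative parser only, under the assumption that the scaffold name is everything before the last colon. The function name parse_range is hypothetical and this is not clinker's own parsing code.

```python
def parse_range(text):
    """Split 'scaffold:start-end' into (scaffold, start, end).

    Illustrative parser only, not clinker's own implementation.
    """
    # Split on the last colon so scaffold names may themselves contain colons.
    scaffold, _, span = text.rpartition(":")
    start_text, _, end_text = span.partition("-")
    if not scaffold or not start_text or not end_text:
        raise ValueError(f"expected scaffold:start-end, got {text!r}")
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")
    return scaffold, start, end


print(parse_range("scaffold_1:15000-40000"))
# ('scaffold_1', 15000, 40000)
```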

Defining gene groups by function

By default, clinker automatically assigns a name and colour for each group of homologous genes. You can instead pre-assign names (i.e. functions) using the -gf/--gene_functions argument, which takes a 2-column comma-separated file like:

GENE_001,Cytochrome P450
GENE_002,Cytochrome P450
GENE_003,Methyltransferase
GENE_004,Methyltransferase

This will generate two groups: Cytochrome P450 (GENE_001 and GENE_002) and Methyltransferase (GENE_003 and GENE_004). If any other homologous genes are identified, they will automatically be added to these groups.
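The grouping implied by such a file can be previewed with a few lines of Python. This is a minimal sketch using only the standard library. The function name group_genes_by_function is hypothetical, and stripping surrounding whitespace from each field is an assumption about how the file should be read, not a statement about clinker's behaviour.

```python
import csv
import io
from collections import defaultdict


def group_genes_by_function(handle):
    """Map each function name to the list of genes assigned to it.

    Hypothetical helper for previewing a 2-column gene functions file.
    """
    groups = defaultdict(list)
    for row in csv.reader(handle):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        gene, function = (field.strip() for field in row)
        groups[function].append(gene)
    return dict(groups)


example = io.StringIO(
    "GENE_001,Cytochrome P450\n"
    "GENE_002,Cytochrome P450\n"
    "GENE_003,Methyltransferase\n"
    "GENE_004,Methyltransferase\n"
)
print(group_genes_by_function(example))
# {'Cytochrome P450': ['GENE_001', 'GENE_002'], 'Methyltransferase': ['GENE_003', 'GENE_004']}
```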

Project details

Release history

This version: 0.0.29 (May 2, 2023)

Download files

Source Distribution

clinker-windows-0.0.29.tar.gz (128.9 kB)

Uploaded May 2, 2023 Source

Built Distribution

clinker_windows-0.0.29-py3-none-any.whl (129.4 kB)

Uploaded May 2, 2023 Python 3

File details

Details for the file clinker-windows-0.0.29.tar.gz.

File metadata

Download URL: clinker-windows-0.0.29.tar.gz
Upload date: May 2, 2023
Size: 128.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.9

File hashes

Hashes for clinker-windows-0.0.29.tar.gz
Algorithm	Hash digest
SHA256	`ad440694404458da09c78ea42f96898169d054ad61cb9ede4baee4117c5f8538`
MD5	`906981d0a7904c23d97ab9f1ba619078`
BLAKE2b-256	`14f6402d81a071b51a306964cea11710bf70a5546ad26319c83c4110a638be0e`

File details

Details for the file clinker_windows-0.0.29-py3-none-any.whl.

File metadata

Download URL: clinker_windows-0.0.29-py3-none-any.whl
Upload date: May 2, 2023
Size: 129.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.9

File hashes

Hashes for clinker_windows-0.0.29-py3-none-any.whl
Algorithm	Hash digest
SHA256	`16bafefd4887066cc35c070c8475941b410343f29f8430dba401603a4631528c`
MD5	`34d23bf5e1a2221dde491cf3cde1d9b8`
BLAKE2b-256	`28352476bb1f028da699f0b1fd28317acb78bcf369419f894c0082d71a153dd3`
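A downloaded file can be checked against the published SHA256 digests with the standard hashlib module. The sketch below copies the two SHA256 values from the tables above. The function names sha256_of and matches_published_hash are hypothetical, and the file is assumed to have been downloaded to a local path of your choosing.

```python
import hashlib

# SHA256 digests copied from the hash tables above.
EXPECTED_SHA256 = {
    "clinker-windows-0.0.29.tar.gz":
        "ad440694404458da09c78ea42f96898169d054ad61cb9ede4baee4117c5f8538",
    "clinker_windows-0.0.29-py3-none-any.whl":
        "16bafefd4887066cc35c070c8475941b410343f29f8430dba401603a4631528c",
}


def sha256_of(path, chunk_size=1 << 16):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, file_name):
    """Compare a local file against the digest listed for `file_name`."""
    return sha256_of(path) == EXPECTED_SHA256[file_name]
```

For example, matches_published_hash("downloads/clinker-windows-0.0.29.tar.gz", "clinker-windows-0.0.29.tar.gz") should return True for an unmodified copy of the source distribution, where the downloads folder is a placeholder.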

