A Python package that takes the assembly result of a chloroplast genome and continues it with the scaffolding stage.
Khloraa: scaffolding stage
khloraascaf is a Python3 package that implements a dedicated scaffolding method for chloroplast genomes.
From the input data files, it computes combinations of Integer Linear Programs (ILPs) and writes the result of the best one to output files.
Please have a look at the documentation website for more details.
Quick installation
To install the khloraascaf package from the PyPI repository, run the pip command:
pip install khloraascaf
You can find more installation details in the docs/src/install.md file.
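After installing, it can be handy to confirm which version pip actually placed in the current environment. The sketch below relies only on the standard library's importlib.metadata; the helper name installed_version is hypothetical and is not part of khloraascaf.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it is not provided by khloraascaf.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report the khloraascaf version found in the current environment, if any
found = installed_version('khloraascaf')
print(found if found is not None else 'khloraascaf is not installed')
```

If the message says the package is not installed, check that pip and the Python interpreter you are running belong to the same environment.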
Quick usage example
from pathlib import Path
from khloraascaf import IR_REGION_ID, UN_REGION_ID, scaffolding
from khloraascaf.inputs import INSTANCE_NAME_DEF, SOLVER_CBC
from khloraascaf.outputs import (
    fmt_contigs_of_regions_filename,
    fmt_map_of_regions_filename,
)
from khloraascaf.run_metadata import (
    fmt_io_config_metadata_filename,
    fmt_solutions_metadata_filename,
)
#
# Prepare the scaffolding result directory
#
outdir = Path('scaffolding_result')
outdir.mkdir(exist_ok=True)
#
# Compute the scaffolding using the assembly data
#
outdir_gen = scaffolding(
    Path('tests/data/ir_un/contig_attrs.tsv'),
    Path('tests/data/ir_un/contig_links.tsv'),
    'C0',
    solver=SOLVER_CBC,
    outdir=outdir,
)
#
# khloraascaf creates a directory with a unique name
# to put all the files it has created
#
assert outdir_gen in outdir.glob('*')
print(outdir_gen)
#
# See which files the scaffolding has produced:
#
files = set(outdir_gen.glob('*'))
assert len(files) == 4
#
# * The list of oriented contigs for each region
#
assert outdir_gen / fmt_contigs_of_regions_filename(
    INSTANCE_NAME_DEF, [IR_REGION_ID, UN_REGION_ID],
) in files
#
# * The list of oriented regions
#
assert outdir_gen / fmt_map_of_regions_filename(
    INSTANCE_NAME_DEF, [IR_REGION_ID, UN_REGION_ID],
) in files
#
# * YAML file containing all the arguments and options you used
# to run khloraascaf
#
assert outdir_gen / fmt_io_config_metadata_filename() in files
#
# * YAML file that contains metadata on the solutions
#
assert outdir_gen / fmt_solutions_metadata_filename() in files
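To look at what a run left on disk without importing khloraascaf, a small standard-library sketch such as the following may help. It lists the regular files of each run directory found under scaffolding_result (the directory name used in the example above). The helper summarise_output_dir is hypothetical and only uses pathlib; it makes no assumption about the content of the files.

```python
from pathlib import Path
from typing import List, Tuple


def summarise_output_dir(directory: Path) -> List[Tuple[str, int]]:
    """Return (file name, size in bytes) for each regular file, sorted by name.

    Hypothetical helper for illustration; it is not provided by khloraascaf.
    """
    return sorted(
        (path.name, path.stat().st_size)
        for path in directory.iterdir()
        if path.is_file()
    )


# Walk every run directory created under the result directory, if it exists
result_root = Path('scaffolding_result')
if result_root.is_dir():
    for run_dir in sorted(p for p in result_root.iterdir() if p.is_dir()):
        print(run_dir)
        for name, size in summarise_output_dir(run_dir):
            print(f'  {name}\t{size} bytes')
```

Subdirectories are skipped on purpose, so only the files written directly in each run directory are reported.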
Changelog
You can refer to the docs/src/changelog.md file for details.
What next?
Find a list of ideas in the docs/src/todo.md file.
Contributing
- If you find any errors, missing documentation or tests, or want to discuss features you would like to have, please post an issue (with the corresponding predefined template) here.
- If you want to help me code, please post an issue or contact me. You can find the coding conventions in the docs/src/contributing.md file.
References
- A part of the scaffolding method is described in this preprint:
📰 Victor Epain, Dominique Lavenier, and Rumen Andonov, ‘Inverted Repeats Scaffolding for a Dedicated Chloroplast Genome Assembler’, 3 June 2022, https://doi.org/10.4230/LIPIcs.
Licence
This work is licensed under the GNU GPLv3 licence.