A tool for handling DNA methylation data in extended bedMethyl format obtained from the nanopore sequencing data by Dorado and Modkit.
Project description
MethylomeMiner
MethylomeMiner is a tool for handling DNA methylation data in extended bedMethyl format obtained from the nanopore
sequencing data by Dorado and
Modkit.
Content
Features
- Filter bedMethyl tables based on valid coverage and the methylation rate.
- Sort base modifications into coding and non-coding types using genome annotation.
- Split base modifications according to reference sequences.
- Provide count statistics for each base modification, coding/non-coding types and reference sequence.
- Link coding base modifications with results from pangenome analysis conducted by Roary.
Installation
- Create a Python 3.12 or newer virtual environment.
- Install
MethylomeMinerusing:
python -m pip install methylome-miner
Usage
MethylomeMiner functionality is divided into two parts. The first part named MethylomeMiner deals with manipulation
of bedMethyl files, the other, PanMethylomeMiner, integrates pangenome analysis with DNA methylation data processed
by MethylomeMiner to create panmethylome.
MethylomeMiner
The primary use of MethylomeMiner is to filter out base modifications with low coverage and low rate
of base modification and to sort methylations into coding and non-coding groups. So-called coding methylations are
base modifications that are located in some coding region of the genome. On the other hand non-coding methylations
are found between coding parts of the genome.
The basic command to run MethylomeMiner from command line is following:
MethylomeMiner --input_bed_file path\to\bed_file.bed --input_annot_file path\to\annot_file.gff --min_coverage 35
Required parameters
--input_bed_file
Path to a bedMethyl file with genome-wide single-base methylation data.
--input_annot_file
Path to a file with genome annotation in gff (v3) or gbk file format.
--min_coverage and --input_bed_dir
Parameters --min_coverage and --input_bed_dir are mutually exclusive as they are used to assess the minimum coverage
of base modification to be retained.
The user can either set the threshold of minimum coverage by using parameter --min_coverage and an integer value
of requested coverage, or provide a path to directory with bedMethyl files (--input_bed_dir parameter)
and the threshold would be set to the median value of median values of coverage from individual bedMethyl files within
the directory.
Optional parameters
--min_percent_modified
Sets minimum required percentage of reads supporting base modification. The value is float in the range <0, 100>.
Default: 90.
--work_dir
Specifies the output directory where MethylomeMiner will save all result files.
If not provided, a folder named MethylomeMiner_output will be automatically created in the current working directory.
--file_name
Specifies a custom name to use for the output files generated by MethylomeMiner.
By default, MethylomeMiner use the name of the input bedMethyl file and add a specific suffix for each output
(see Output files)
--write_filtered_bed
A flag. When set, the script writes the filtered bedMethyl data to a new output file in working directory. Use this option if you want to save the filtered results.
--filtered_bed_format
Specifies the format of the filtered bedMethyl output file.
Available options: json, csv, tsv, bed.
Default: csv.
Use this option to control the format in which the filtered data is saved.
--split_by_reference
A flag. If set, all output files (except for filtered bedMethyl file) will be split and saved separately for each reference sequence.
Output files
The output of MethylomeMiner is a set of new files distinguished by file suffixes:
-
_coding.csv- extended bedMethyl table with information about coding region, where methylations are located.reference_seq start_index end_index modified_base_code score strand ... percent_modified ... gene_reference_seq gene_id gene_start gene_end gene_strand gene_product chromosome_contig_1 1108 1109 6mA 36 + ... 94.44 ... chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_1 0 1335 + chromosomal replication initiator protein DnaA chromosome_contig_1 1753 1754 6mA 36 + ... 97.25 ... chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_2 1339 2440 + DNA polymerase III subunit beta chromosome_contig_1 1867 1868 5mC 37 + ... 100.0 ... chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_2 1339 2440 + DNA polymerase III subunit beta chromosome_contig_1 2699 2700 6mA 36 + ... 97.25 ... chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_3 2647 3721 + DNA replication/repair protein RecF chromosome_contig_1 2728 2729 5mC 36 + ... 100.0 ... chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_3 2647 3721 + DNA replication/repair protein RecF ... -
_non_coding.csv- extended bedMethyl table with information about coding regions, that are located in front of and after methylation. Modifications in front of the first coding feature have empty annotation marked asprev(previous). Modifications after the last coding feature have empty annotation marked asnext.reference_seq start_index end_index modified_base_code score strand ... percent_modified ... prev_gene_reference_seq prev_gene_id prev_gene_start prev_gene_end prev_gene_strand prev_gene_product next_gene_reference_seq next_gene_id next_gene_start next_gene_end next_gene_strand next_gene_product chromosome_contig_1 1482 1483 6mA 36 - ... 100.0 ... chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_14 15874 16207 - YceK/YidQ family lipoprotein chromosome_contig_1 1601 1602 6mA 36 - ... 100.0 ... chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_14 15874 16207 - YceK/YidQ family lipoprotein ... chromosome_contig_1 3553006 3553007 6mA 41 + ... 97.56 ... chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_3407 3544141 3545086 + CDF family zinc transporter ZitB chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_3435 3568768 3570052 + citrate synthase ... plasmid_contig_3 153394 153395 5mC 45 + ... 100.0 ... plasmid_contig_3 0792ceb9f153a28951804de2a656f49c_5106 151555 152305 + hypothetical protein plasmid_contig_3 153396 153397 5mC 36 - ... 100.0 ... plasmid_contig_3 0792ceb9f153a28951804de2a656f49c_5099 146025 146250 - IS1-like element transposase -
_all_annot_with_coding_methylations.csv- extended genome annotation table, where each coding region has a list of found methylations' positions distinguished by methylation type.record_id gene_id start end strand product note 4mC 5mC 6mA chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_1 0 1335 + chromosomal replication initiator protein DnaA WP_015703901.1 chromosomal replication initiator protein DnaA (Klebsiella) [pid:94.8%, q_cov:100.0%, s_cov:95.1%, Eval:5.9e-240] [] [] [1108] chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_2 1339 2440 + DNA polymerase III subunit beta WP_015369032.1 DNA polymerase III subunit beta (Klebsiella aerogenes) [pid:98.1%, q_cov:100.0%, s_cov:100.0%, Eval:5.2e-204] [] [1867] [1753] chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_3 2647 3721 + DNA replication/repair protein RecF WP_015703900.1 DNA replication/repair protein RecF (Klebsiella aerogenes) [pid:97.5%, q_cov:100.0%, s_cov:100.0%, Eval:1.5e-200] [] [2728, 3115] [2699, 2827, 3179, 3640] chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_4 3749 6164 + DNA topoisomerase (ATP-hydrolyzing) subunit B WP_015703899.1 DNA topoisomerase (ATP-hydrolyzing) subunit B (Klebsiella aerogenes) [pid:95.1%, q_cov:100.0%, s_cov:100.0%, Eval:0.0e+00] [] [3962, 4265, 4307, 4535, 4979, 5012, 5069, 5606, 5825, 5915] [4454, 4944, 5309, 5406, 5417, 5697, 5991, 6015, 6069] chromosome_contig_1 0792ceb9f153a28951804de2a656f49c_5 6365 7178 + sugar-phosphatase WP_015369035.1 sugar-phosphatase (Klebsiella aerogenes) [pid:94.8%, q_cov:100.0%, s_cov:100.0%, Eval:1.3e-142] [] [6857] [6696, 6801, 6809, 7023] ... -
_methylations_statistics.csv- table of filtered methylation counts showing the number of methylation sites detected in coding and non-coding regions, along with their totals and the overall number of methylations.genome 4mC_coding_count 4mC_non_coding_count 4mC_total_count 5mC_coding_count 5mC_non_coding_count 5mC_total_count 6mA_coding_count 6mA_non_coding_count 6mA_total_count total_count KP825 38 36 74 9425 10809 20234 15797 16386 32183 52491 -
_methylations_statistics_per_ref.csv- as above, but the counts are divided according to reference sequences in the annotation.genome reference_seq 4mC_coding_count 4mC_non_coding_count 4mC_total_count 5mC_coding_count 5mC_non_coding_count 5mC_total_count 6mA_coding_count 6mA_non_coding_count 6mA_total_count total_count KP825 chromosome_contig_1 33 33 66 8874 10033 18907 14967 15399 30366 49339 KP825 plasmid_contig_2 2 1 3 252 333 585 418 459 877 1465 KP825 plasmid_contig_3 3 2 5 299 443 742 412 528 940 1687 -
_filtered.*- optional, filtered bedMethyl table in CSV, TSV, JSON or BED format.
If --split_by_reference is used flag:
_coding.csvfile is replaced by<reference sequence 1>_coding.csv,<reference sequence 2>_coding.csvetc. for each reference sequence present in the annotation file,_non_coding.csvfile is replaced by<reference sequence 1>_non_coding.csv,<reference sequence 2>_non_coding.csvetc. for each reference sequence present in the annotation file.
Examples
-
To run
MethylomeMinerwith minimum coverage not specified, but instead path to directory with bedMethyl files is provided and minimum coverage is calculated as a median coverage value from all bedMethyl files in the directory:MethylomeMiner --input_bed_file input_files\KP825_4mC_5mC_6mA_calls.bed --input_annot_file input_files\KP825_genome.gff --input_bed_dir input_files
-
To set minimum coverage and minimum percentage of reads supporting base modification:
MethylomeMiner --input_bed_file input_files\KP825_4mC_5mC_6mA_calls.bed --input_annot_file input_files\KP825_genome.gff --min_coverage 36 --min_percent_modified 95
-
To set up working directory and custom name for output files:
MethylomeMiner --input_bed_file input_files\KP825_4mC_5mC_6mA_calls.bed --input_annot_file input_files\KP825_genome.gff --min_coverage 36 --work_dir path\to\output_dir --file_name output_file_name_prefix
-
To write filtered bedMethyl table to default CSV file:
MethylomeMiner --input_bed_file input_files\KP825_4mC_5mC_6mA_calls.bed --input_annot_file input_files\KP825_genome.gff --write_filtered_bed
-
To write filtered bedMethyl file and select its format:
MethylomeMiner --input_bed_file input_files\KP825_4mC_5mC_6mA_calls.bed --input_annot_file input_files\KP825_genome.gff --write_filtered_bed --filtered_bed_format bed
-
To split and save all output files (except for filtered bedMethyl file) by reference sequences:
MethylomeMiner --input_bed_file input_files\KP825_4mC_5mC_6mA_calls.bed --input_annot_file input_files\KP825_genome.gff --split_by_reference
PanMethylomeMiner
PanMethylomeMiner is designed to integrate the results from MethylomeMiner with pangenome analysis performed
by Roary. Its main purpose is to generate a comprehensive panmethylome table, in which each gene from the pangenome
is annotated with the detected methylations across multiple genomes.
The basic command to run PanMethylomeMiner from command line is following:
PanMethylomeMiner --input_bed_dir path\to\bed_dir --input_annot_dir path\to\annot_dir --roary_file path\to\gene_presence_absence.csv
PanMethylomeMiner uses MethylomeMiner to process all input files (bedMethyl and annotation files).
Therefore, some parameters in PanMethylomeMiner are similar to those in MethylomeMiner, allowing control over its
internal settings.
In the basic use case, MethylomeMiner filters base modifications using a default threshold of 90 % for the minimum
percentage of modified reads. The threshold for minimum coverage is automatically set to the median value calculated
from all bedMethyl files provided via the --input_bed_dir parameter.
Required parameters
--input_bed_dir
Path to a directory with bedMethyl files, which should be integrated into pangenome analysis.
--input_annot_dir
Path to a directory with annotation files, that were used as input into pangenome analysis. Genome annotations could be
in gff (v3) or gbk file format, but the locus tags for individual CDS have been renamed to correspond with
the gene names used in the Roary output.
Important note:
PanMethylomeMinerpairs bedMethyl and annotation files according to the file names, more precisely by a prefix. In this case, a prefix is every character in front of the first underscore_character. This prefix is later used to distinguish the individual genomes in the pangenome analysis.Make sure the files are named accordingly, for example:
prefix bedMethyl file annotation file KP825KP825_4mC_5mC_6mA_calls.bedKP825_genome.gffKP1651KP1651_4mC_5mC_6mA_calls.bedKP1651_genome.gffKP1824KP1824_4mC_5mC_6mA_calls.bedKP1824_genome.gff
--roary_file
Path to output file from Roary tool named gene_presence_absence.csv.
Optional parameters
--min_coverage
Sets minimum (as integer value) required read coverage for a base modification to be retained.
This option is only used if MethylomeMiner was not previously run for the input bedMethyl files
and no precomputed results are present in the work_dir folder.
If not provided, the median coverage will be automatically calculated from files in the input_bed_dir folder.
--min_percent_modified
Sets minimum required percentage of reads supporting base modification. The value is float in the range <0, 100>.
Default: 90.
This option is only used if MethylomeMiner was not previously run for the input bedMethyl files
and no precomputed results are present in the work_dir folder.
--matrix_values
Defines the format of values used in the output panmethylome matrix.
Options:
presence:0for no detected base modification,1for detected base modification, no value for missing gene in the corresponding genome.positions: A list of genomic positions where base modifications were identified within each pangenome gene or no value for missing gene in the corresponding genome.
Default: presence.
If a particular gene from pangenome was not found in a genome, value in the matrix is left empty.
--work_dir
Specifies the output directory where PanMethylomeMiner will look for results from MethylomeMiner and save all result
files.
If not provided, a folder named MethylomeMiner_output will be automatically created in the current working directory.
--write_all_results
A flag. If set, writes all result tables produced by MethylomeMiner to output files.
This includes:
- All detected methylation sites split into coding and non-coding regions.
- The extended genome annotation file with detected methylations located in coding regions.
- Filtered bedMethyl files after quality thresholds are applied.
Use this option if you want to save full intermediate results.
--heatmap_type
Specifies which type of heatmap visualization should be generated from the panmethylome presence matrix.
Heatmaps are only created if the matrix values (via --matrix_values) are set to presence.
Options:
compact: Generates a lower-resolution heatmap without gene names on the y-axis. Suitable for a quick overview across genomes.full: Generates a full-resolution heatmap with gene names. This version is more detailed but may result in a large image depending on the number of genes.both: Creates both compact and full heatmaps.
If no value is provided, no heatmap will be generated.
--heatmap_file_format
Sets the output format for generated heatmap(s). This applies to all generated heatmaps.
Aviable options are png, svg and pdf.
Default: pdf.
--heatmap_min_percent_presence
Sets the minimum percentage of genomes in which a gene must be present in order to be included in the heatmap(s).
This helps filter out genes that are absent from most genomes. For example --heatmap_min_percent_presence 90 will
include only genes present in at least 90 % of genomes. Value 100 % will create heatmap for core genes.
The value is float in the range <0, 100> and default is 95.
Output files
PanMethylomeMiner results are provided as a set of CSV files, each named according to the type of base modification
(e.g., 5mC, 6mA, 4mC) identified in the dataset.
For each detected modification, a panmethylome matrix is generated in the format specified by the user:
presence
The presence/absence matrix indicates whether a given pangenome gene contains any detected base modifications of the specified type.
File name format: <modification>_panmethylome_presence.csv
Example: 5mC_panmethylome_presence.csv
Matrix values:
1– base modification detected in the gene0– no modification detected- no value – the gene is absent in the corresponding genome
| Gene | KP1248 | KP1267 | KP1344 | KP1622 | KP1651 | KP1658 | KP1665 | KP1666 | KP1785 | KP1814 | KP1821 | KP1822 | KP1824 | KP1829 | KP1832 | KP1833 | KP1834 | KP1842 | KP1906 | KP825 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| recD | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | |
| paaF | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | |
| tynA | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | |
| group_1004 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | |
| ompC | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | |
| ... |
positions
Matrix stores precise genomic positions of modified bases located within each pangenome gene.
File name format: <modification>_panmethylome_positions.csv
Example: 5mC_panmethylome_positions.csv
Matrix values:
- a list of integers - genomic positions of detected base modifications in the gene
- empty list - no modifications were detected
- no value - the gene is absent in the corresponding genome
| Gene | KP1248 | KP1267 | KP1344 | KP1622 | KP1651 | KP1658 | KP1665 | KP1666 | KP1785 | KP1814 | KP1821 | KP1822 | KP1824 | KP1829 | KP1832 | KP1833 | KP1834 | KP1842 | KP1906 | KP825 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| recD | [923982, 924145, 924181] | [994950, 995152, 995220, 995315] | [] | [901881, 902044, 902535] | [902535, 902698] | [901537, 901881, 902044, 902440, 902535, 902698, 902734] | [] | [901539, 901883, 902046, 902442, 902537, 902700, 902736] | [888326, 888468, 888670, 888833] | [900277, 900419, 900621, 900784, 901180, 901275, 901438, 901474] | [909850, 910143, 910539, 910634, 910797] | [909638, 909852, 910145, 910541, 910636] | [909636, 909850, 910143, 910539, 910634, 910797] | [909639, 909853, 910146, 910637, 910800] | [909849, 910142, 910633, 910796] | [909791, 909988, 910084, 910480, 910575] | [909794, 910087, 910483, 910578, 910741] | [] | [] | |
| paaF | [] | [2966497, 2966854] | [2931769] | [2868686, 2869043] | [2859549, 2859906] | [2859393, 2859750] | [] | [2859429] | [] | [2898167, 2898524] | [2922405, 2922762] | [2922416, 2922773] | [] | [2922408, 2922765] | [2922406] | [2922341, 2922698] | [2922351, 2922708] | [] | [] | |
| tynA | [] | [2978635, 2978860, 2978918, 2979184, 2979446, 2979719, 2979927, 2979977] | [] | [2880824, 2881049, 2881107, 2881635, 2881908, 2881929, 2882116, 2882166, 2882472, 2882835] | [2871687, 2871912, 2871970, 2872236, 2872498, 2872771, 2872792, 2872979, 2873029, 2873335, 2873698] | [2871531, 2872615, 2872823, 2873179, 2873542] | [] | [2872651, 2872859, 2872909, 2873578] | [] | [2911116, 2911389, 2911597, 2911647, 2911953, 2912071, 2912304, 2912316] | [2934543, 2934768, 2934826, 2935354, 2935627, 2935648, 2935835, 2935885, 2936191] | [] | [2934768, 2934826, 2935092, 2935627, 2936191, 2936228] | [] | [2934827, 2935628, 2935649, 2935836, 2936192, 2936229] | [2934479, 2934762, 2935563, 2935584, 2935771, 2936127] | [2934489, 2934714, 2934772, 2935300, 2935573, 2935594, 2935781, 2936137] | [] | [] | |
| group_1004 | [] | [2986640, 2986681, 2986802, 2986969, 2988160] | [] | [2888828, 2888869, 2888990, 2889157, 2890348] | [2879682, 2879723, 2879844, 2880011, 2881202] | [2879526, 2879567, 2879688, 2879855, 2881046] | [] | [2879562, 2879603, 2879724, 2879891, 2881082] | [2800569] | [2918309, 2918350, 2918471, 2918638, 2919829, 2919919] | [2942547, 2942588, 2942709, 2944067] | [2942599] | [2942547, 2942588, 2942709] | [2942550, 2942591, 2942712, 2944070] | [] | [2942483, 2942524] | [2942493, 2942534, 2943482, 2944013] | [] | [] | |
| ompC | [] | [] | [] | [] | [] | [] | [] | [2889656, 2889662] | [2809144] | [2928048, 2928088, 2928433, 2928439] | [2952274, 2952314, 2952641, 2952647] | [] | [] | [] | [] | [2952210, 2952250, 2952577, 2952583] | [2952587, 2952593] | [] | [] | |
| ... |
Statistics
Additionally, methylation statistics from MethylomeMiner are merged to provide an overview across all genomes
in pangenome dataset. Therefore, two CSV files are created:
-
all_methylations_statistics.csv- table of filtered methylation counts showing the number of methylation sites detected in coding and non-coding regions, along with their totals and the overall number of methylationsgenome 4mC_coding_count 4mC_non_coding_count 4mC_total_count 5mC_coding_count 5mC_non_coding_count 5mC_total_count 6mA_coding_count 6mA_non_coding_count 6mA_total_count total_count KP1248 20 18 38 6860 7973 14833 11027 11756 22783 37654 KP1267 113 141 254 17397 20724 38121 28528 30944 59472 97847 KP1344 6 15 21 3651 4483 8134 5799 6805 12604 20759 -
all_methylations_statistics_per_ref.csv- as above, but the counts are divided according to reference sequences in the annotationgenome reference_seq 4mC_coding_count 4mC_non_coding_count 4mC_total_count 5mC_coding_count 5mC_non_coding_count 5mC_total_count 6mA_coding_count 6mA_non_coding_count 6mA_total_count total_count KP825 chromosome_contig_1 33 33 66 8874 10033 18907 14967 15399 30366 49339 KP825 plasmid_contig_2 2 1 3 252 333 585 418 459 877 1465 KP825 plasmid_contig_3 3 2 5 299 443 742 412 528 940 1687 KP1248 chromosome_contig_1 19 17 36 6482 7474 13956 10514 11148 21662 35654 KP1248 plasmid_contig_2 0 1 1 119 141 260 152 197 349 610 KP1248 plasmid_contig_3 1 0 1 259 358 617 361 411 772 1390 KP1267 chromosome_contig_1 112 138 250 16958 20098 37056 27883 30075 57958 95264 KP1267 plasmid_contig_2 1 3 4 439 626 1065 645 869 1514 2583 KP1344 chromosome_contig_3 6 13 19 3332 4046 7378 5329 6255 11584 18981 KP1344 plasmid_contig_1 0 2 2 319 437 756 470 550 1020 1778
Examples
-
To run set custom filtering thresholds for
MethylomeMinerPanMethylomeMiner --input_bed_dir path\to\bed_dir --input_annot_dir path\to\annot_dir --roary_file path\to\gene_presence_absence.csv --min_coverage 36 --min_percent_modified 95
-
To change type of values in panmethylome matrix to
positionsPanMethylomeMiner --input_bed_dir path\to\bed_dir --input_annot_dir path\to\annot_dir --roary_file path\to\gene_presence_absence.csv --matrix_values positions
-
To set up working directory
PanMethylomeMiner --input_bed_dir path\to\bed_dir --input_annot_dir path\to\annot_dir --roary_file path\to\gene_presence_absence.csv --work_dir path\to\output_dir
-
To request all results from
MethylomeMinerto be saved to filesPanMethylomeMiner --input_bed_dir path\to\bed_dir --input_annot_dir path\to\annot_dir --roary_file path\to\gene_presence_absence.csv --write_all_results
-
To generate
compactheatmap in png file format for genes with 95 % presencePanMethylomeMiner --input_bed_dir path\to\bed_dir --input_annot_dir path\to\annot_dir --roary_file path\to\gene_presence_absence.csv --heatmap_type compact
-
To generate
fullheatmap in svg file format for genes with 95 % presencePanMethylomeMiner --input_bed_dir path\to\bed_dir --input_annot_dir path\to\annot_dir --roary_file path\to\gene_presence_absence.csv --heatmap_type full --heatmap_file_format svg
-
To generate
bothheatmaps (compact and full) in svg file format for genes with 100 % presence in analyzed genomes = core genesPanMethylomeMiner --input_bed_dir path\to\bed_dir --input_annot_dir path\to\annot_dir --roary_file path\to\gene_presence_absence.csv --heatmap_type both --heatmap_file_format svg --heatmap_min_percent_presence 100
Citation
If you use MethylomeMiner in your research, please cite the following article:
Jakubickova, M., Sabatova, K., Zbudilova, M., Bezdicek, M., Lengerova, M., Vitkova, H. MethylomeMiner: A novel tool for high-resolution analysis of bacterial methylation patterns from nanopore sequencing. Computational and Structural Biotechnology Journal, 27 (2025), pp. 4753–4759. https://doi.org/10.1016/j.csbj.2025.10.047
Formal citation metadata are available in the CITATION.cff file and via the Cite this repository button.
Contact
MethylomeMiner's maintainers:
- Katerina Sabatova - @KSabatova
- Marketa Jakubickova - @MakretaJakubickova
License
See the LICENSE file for license rights and limitations (GNU General Public License v3.0).
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file methylome_miner-1.1.8.tar.gz.
File metadata
- Download URL: methylome_miner-1.1.8.tar.gz
- Upload date:
- Size: 42.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.1
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ba67c48d9c7c51a105d0d9163ee524167b03655b42458cfdf68553b1a6d7f026
|
|
| MD5 |
f3ea497847b469564d9964e206df2518
|
|
| BLAKE2b-256 |
2bd384dad752cd9ca00cd33a74ef9dffb88e43a2dd8b4f2ae69b513fc39c3fa7
|
File details
Details for the file methylome_miner-1.1.8-py3-none-any.whl.
File metadata
- Download URL: methylome_miner-1.1.8-py3-none-any.whl
- Upload date:
- Size: 36.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.1
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
36923b2fec79acba537e5de092030d16679c21aaf5070a492cc8fe3affec0208
|
|
| MD5 |
4e795d217ed9ea7f21952f22fb467a06
|
|
| BLAKE2b-256 |
4f182244e75f189034a55be7c37034e67aee2c96f29be46ca52c260966dc12e2
|