MetaPont: A tool to bridge the gap between metagenomic tool output and its analysis.

MetaPont

MetaPont - A tool to bridge the gap between the output of metagenomic tools and the analysis of the data

MetaPont is designed to work specifically with the output files generated by the HuwsLab Metagenome Workflow (github.com/TheHuwsLab/Metagenomic_Workflow).

The directory structure must follow the format generated by the workflow:

.
└── samples
    ├── E1.0
    │   ├── E1.0_eggnog_mapper
    │   │   ├── E1.0_eggnogmapper.success
    │   │   ├── E1.0_pyrodigal_eggnog_mapped.emapper.annotations
    │   │   ├── E1.0_pyrodigal_eggnog_mapped.emapper.annotations.xlsx
    │   │   ├── E1.0_pyrodigal_eggnog_mapped.emapper.decorated.gff
    │   │   ├── E1.0_pyrodigal_eggnog_mapped.emapper.hits
    │   │   └── E1.0_pyrodigal_eggnog_mapped.emapper.seed_orthologs
    │   ├── E1.0_kraken2
    │   │   ├── E1.0_kraken2_report_mpa.txt
    │   │   ├── E1.0_kraken2_report.txt
    │   │   └── E1.0_kraken2.txt
    │   └── E1.0_readmapped
    │       ├── E1.0_bowtie2_db.3.bt2
    │       ├── E1.0_readmapped_cds_summary.txt
    │       ├── E1.0_readmapped_cds_summary.txt_new
    │       └── E1.0_readmapped_contig_summary.txt
    ├── E2.0
    │   ├── E2.0_eggnog_mapper
    │   │   ├── E2.0_eggnogmapper.success
    │   │   ├── E2.0_pyrodigal_eggnog_mapped.emapper.annotations
    │   │   ├── E2.0_pyrodigal_eggnog_mapped.emapper.annotations.xlsx
    │   │   ├── E2.0_pyrodigal_eggnog_mapped.emapper.decorated.gff
    │   │   ├── E2.0_pyrodigal_eggnog_mapped.emapper.hits
    │   │   └── E2.0_pyrodigal_eggnog_mapped.emapper.seed_orthologs
    │   ├── E2.0_kraken2
    │   │   ├── E2.0_kraken2_report_mpa.txt
    │   │   ├── E2.0_kraken2_report.txt
    │   │   └── E2.0_kraken2.txt
    │   └── E2.0_readmapped
    │       ├── E2.0_bowtie2_db.3.bt2
    │       ├── E2.0_readmapped_cds_summary.txt
    │       └── E2.0_readmapped_contig_summary.txt
    ├── E3.0
    │   ├── E3.0_eggnog_mapper
    │   │   ├── E3.0_eggnogmapper.success
    │   │   ├── E3.0_pyrodigal_eggnog_mapped.emapper.annotations
    │   │   ├── E3.0_pyrodigal_eggnog_mapped.emapper.annotations.xlsx
    │   │   ├── E3.0_pyrodigal_eggnog_mapped.emapper.decorated.gff
    │   │   ├── E3.0_pyrodigal_eggnog_mapped.emapper.hits
    │   │   └── E3.0_pyrodigal_eggnog_mapped.emapper.seed_orthologs
    │   ├── E3.0_kraken2
    │   │   ├── E3.0_kraken2_report_mpa.txt
    │   │   ├── E3.0_kraken2_report.txt
    │   │   └── E3.0_kraken2.txt
    │   └── E3.0_readmapped
    │       ├── E3.0_readmapped_cds_summary.txt
    │       └── E3.0_readmapped_contig_summary.txt
    └── Per_Sample_Contig_Outputs
        ├── E1.0_Contigs.tsv
        ├── E2.0_Contigs.tsv
        └── E3.0_Contigs.tsv
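
Before running the tools, it can help to confirm that each sample directory matches this layout. The following is a minimal sketch, not part of MetaPont: the function name `missing_subdirs` is hypothetical, and the three sub-directory suffixes are simply read off the tree above.

```python
from pathlib import Path

# Sub-directory suffixes taken from the example tree above.
EXPECTED_SUBDIRS = ("eggnog_mapper", "kraken2", "readmapped")


def missing_subdirs(samples_dir, prefix):
    """Map each sample directory name to the expected sub-directories it lacks.

    Hypothetical helper for illustration only; MetaPont does not ship it.
    """
    problems = {}
    for sample in sorted(Path(samples_dir).iterdir()):
        # Only inspect directories whose name starts with the sample prefix.
        if not sample.is_dir() or not sample.name.startswith(prefix):
            continue
        missing = [
            f"{sample.name}_{suffix}"
            for suffix in EXPECTED_SUBDIRS
            if not (sample / f"{sample.name}_{suffix}").is_dir()
        ]
        if missing:
            problems[sample.name] = missing
    return problems
```

An empty dictionary means every sample directory with the given prefix contains all three expected sub-directories.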

Features (the current aims of this project, still under development)

Targeted Functional Analysis: Search for specific functional IDs (e.g., GO terms) within the _Final_Contig.tsv files provided by the HuwsLab Metagenome Workflow (https://github.com/TheHuwsLab/Metagenome_Workflow).
Taxonomic Breakdown: Extract genus-level taxonomy information and calculate their proportions in the dataset.
Batch Processing: Analyse all _Contig.tsv files in a specified directory.
Customisable Output: Save results in a format suitable for downstream analysis.

Installation

Prerequisites

Ensure you have the following installed:

Python 3.10 or later

Installation via pip

MetaPont is provided as a pip distribution.

pip install MetaPont

Usage

MetaPont-Combine (or metapont-combine): Aggregate results from eggNOG-mapper (emapper), Kraken2 and read mapping for each sample

MetaPont-Combine -h

usage: MetaPont_Combine.py [-h] -d PARENT_DIRECTORY_PATH [-p PREFIX]

MetaPont: Combine emapper-kraken-reads

options:
  -h, --help            show this help message and exit
  -d PARENT_DIRECTORY_PATH, --parent_directory_path PARENT_DIRECTORY_PATH
                        Directory containing sample directories to analyse.
  -p PREFIX, --prefix PREFIX
                        Default - 'PN': Default directory name prefix to
                        identify sample directories to analyse..

The output will be saved in a new directory called Per_Sample_Contig_Outputs within the specified parent directory.
See Per_Sample_Contig_Outputs for example output files.

Contig-Coverage-Summary: Generate contig coverage summary from read mapping outputs

Contig-Coverage-Summary -h

usage: Contig_Coverage_Summary.py [-h] --root_dir ROOT_DIR --prefix PREFIX
                                  [--read-length READ_LENGTH]
                                  [--output OUTPUT]

MetaPont v0.0.9- Contig-Coverage-Summary: Aggregate readmapping contig
summaries and compute overview stats per sample.

options:
  -h, --help            show this help message and exit
  --root_dir ROOT_DIR, -d ROOT_DIR
                        Root directory containing sample folders (use
                        `root_dir` path).
  --prefix PREFIX, -p PREFIX
                        Comma-separated directory tags to search for (e.g.
                        E,L,P).
  --read-length READ_LENGTH, -r READ_LENGTH
                        Optional average read length to compute estimated
                        coverage.
  --output OUTPUT, -o OUTPUT
                        Output CSV path (default:
                        `root_dir/readmap_overview.csv`).
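
The help text mentions an optional average read length used to compute estimated coverage. A common way to estimate mean depth is mapped reads multiplied by read length, divided by contig length. Whether Contig-Coverage-Summary uses exactly this formula is an assumption; the function and parameter names below are hypothetical.

```python
def estimated_coverage(mapped_reads, read_length, contig_length):
    """Approximate mean depth: total mapped bases divided by contig length.

    Illustrative only; not taken from the MetaPont source.
    """
    if contig_length <= 0:
        raise ValueError("contig_length must be positive")
    return mapped_reads * read_length / contig_length


# 200 reads of 150 bp mapped to a 3,000 bp contig.
print(estimated_coverage(200, 150, 3000))
```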

Report-Contig-Lineage: Generate contig lineage report from kraken2 outputs

Report-Contig-Lineage -h

usage: Report_Contig_Lineage.py [-h] -d DIR_PATH [--output OUTPUT]
                                [-s SEPARATE_TAXA] [-r REMOVE_TAXA]

MetaPont v0.0.9- Reporter-Contig-Lineage: Report contig lineage read counts
across samples, grouping by specified taxa substrings.

Required Arguments:
  -d DIR_PATH           Define the directory path containing the files

Optional Arguments:
  -s SEPARATE_TAXA, --separate-taxa SEPARATE_TAXA
                        Comma-separated list of taxa to separate (e.g.
                        d__Bacteria,d__Archaea). If omitted, defaults are
                        used.
  -r REMOVE_TAXA, --remove-taxa REMOVE_TAXA
                        Comma-separated list of taxa to remove. If omitted,
                        defaults are used.
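
The idea of grouping read counts by taxa substrings, with some taxa separated and others removed, can be sketched as follows. This is an illustration under assumptions, not the tool's implementation: the function name, the `Other` bucket, and the `|`-separated lineage strings are all hypothetical.

```python
from collections import Counter


def group_lineage_counts(records, separate_taxa, remove_taxa=()):
    """Sum read counts per taxon substring found in each lineage string.

    `records` is an iterable of (lineage, read_count) pairs.
    """
    totals = Counter()
    for lineage, reads in records:
        # Skip lineages containing any taxon marked for removal.
        if any(taxon in lineage for taxon in remove_taxa):
            continue
        # Credit the first matching taxon, otherwise a catch-all bucket.
        label = next((t for t in separate_taxa if t in lineage), "Other")
        totals[label] += reads
    return dict(totals)


records = [
    ("d__Bacteria|p__Firmicutes", 10),
    ("d__Archaea|p__Euryarchaeota", 5),
    ("d__Eukaryota|k__Metazoa", 7),
    ("d__Viruses", 2),
]
print(group_lineage_counts(records, ["d__Bacteria", "d__Archaea"], ["k__Metazoa"]))
```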

Extract-By-Function Command-line Arguments

Extract-By-Function -h

usage: Extract_By_Function.py [-h] -d DIRECTORY -f FUNCTION_ID -o OUTPUT
                              [-m MIN_PROPORTION] [-top TOP_TAXA]

MetaPont v0.0.9: Extract-By-Function - Identify taxa contributing to a
specific function.

options:
  -h, --help            show this help message and exit
  -d DIRECTORY, --directory DIRECTORY
                        Directory containing TSV files to analyse.
  -f FUNCTION_ID, --function_id FUNCTION_ID
                        Specific function ID to search for (e.g.,
                        'GO:0016597').
  -o OUTPUT, --output OUTPUT
                        Output file to save results.
  -m MIN_PROPORTION, --min_proportion MIN_PROPORTION
                        Minimum proportion threshold for taxa to be included
                        in the output.
  -top TOP_TAXA, --top_taxa TOP_TAXA
                        Top n taxa to be included in the output.

The Extract-By-Function tool provides several command-line options:
Note: Either -m or -top is required.

Option	Description	Required	Default
`-d`, `--directory`	Directory containing `_Final_Contig.tsv` files to analyse.	Yes	None
`-f`, `--function_id`	Functional ID to search for (e.g., `GO:0016597`).	Yes	None
`-m`, `--min_proportion`	Minimum proportion needed for reporting.	Either `-m` or `-top`	None
`-top`, `--top_taxa`	Number of taxa to report.	Either `-m` or `-top`	None
`-o`, `--output`	Output file name to save results.	Yes	None

Example

To search for the functional ID GO:0016597 in all _Final_Contig.tsv files within the test_data/ directory:

Extract-By-Function -d .../test_data/Final_Contig/ -f GO:0016597 -top 3 -o .../test_data/Final_Contig/Extract_By_Function_Out/results.tsv

Output

The tool generates a tab-delimited output file with the following columns:

Sample: Name of the processed sample.
Taxa: Genus-level taxonomic assignment extracted from the Lineage column.
Reads Assigned (Function): Number of reads assigned to contigs with the given functional ID.
Proportion (Function): Reads assigned to contigs of the stated taxon with the given functional ID, as a proportion of all reads assigned to that functional ID in the sample.
Proportion (Total Reads): Reads assigned to contigs of the stated taxon with the given functional ID, as a proportion of the total reads in the sample.

Example output:

Function ID: GO:0016597
Sample	Taxa	Reads Assigned (Function)	Proportion (Function)	Proportion (Total Reads)
PN0536_0001_S1_Final_Contig.tsv	Lactobacillus	111963	0.602	0.004
PN0536_0003_S83_Final_Contig.tsv	Lactobacillus	20072	0.457	0.001
PN0536_0002_S2_Final_Contig.tsv	Acutalibacter	145222	0.795	0.005
PN0536_0004_S3_Final_Contig.tsv	Lactobacillus	40076	0.404	0.002

Extract-Function-By-Taxa:

usage: Extract_Function_By_Taxa.py [-h] -d DIRECTORY -t TAXON -f FUNCTION -o
                                   OUTPUT

MetaPont: Extract Reads Proportions for a Specific Taxon and Function

options:
  -h, --help            show this help message and exit
  -d DIRECTORY, --directory DIRECTORY
                        Directory containing TSV files to analyse.
  -t TAXON, --taxon TAXON
                        Target taxon to search for (e.g., 'g__Escherichia').
  -f FUNCTION, --function FUNCTION
                        Target function to extract (e.g., 'EC:2.7.11.1').
  -o OUTPUT, --output OUTPUT
                        Output file to save results.

Workflow - unfinished

The script reads _Final_Contig.tsv files from the specified directory.
For each file, it searches for occurrences of the given functional ID within specific columns.
Matches are associated with genus-level taxonomic information extracted from the Lineage column.
Taxa proportions are calculated and saved to the output file.
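
The four steps above can be sketched roughly as follows. This is an illustration, not the MetaPont implementation: only the `Lineage` column is named in this README, so the `GOs` and `Reads` column names, the comma-separated function lists, and the `|`-separated lineage format are assumptions.

```python
from collections import Counter


def genus_from_lineage(lineage):
    """Return the genus name from a lineage string, or 'Unclassified'."""
    for rank in lineage.split("|"):
        rank = rank.strip()
        if rank.startswith("g__"):
            return rank[3:]
    return "Unclassified"


def taxa_proportions(rows, function_id, function_columns, reads_column="Reads"):
    """Share of function-assigned reads contributed by each genus."""
    hits = Counter()
    for row in rows:
        # A row matches if any chosen column lists the exact function ID.
        matched = any(
            function_id in (row.get(col) or "").split(",")
            for col in function_columns
        )
        if matched:
            genus = genus_from_lineage(row.get("Lineage", ""))
            hits[genus] += int(row[reads_column])
    total = sum(hits.values())
    return {genus: reads / total for genus, reads in hits.items()} if total else {}


rows = [
    {"Lineage": "d__Bacteria|g__Lactobacillus", "GOs": "GO:0016597,GO:0008150", "Reads": "30"},
    {"Lineage": "d__Bacteria|g__Bacillus", "GOs": "GO:0016597", "Reads": "10"},
    {"Lineage": "d__Bacteria|g__Bacillus", "GOs": "GO:0008150", "Reads": "50"},
]
print(taxa_proportions(rows, "GO:0016597", ["GOs"]))
```

Splitting on commas rather than using a plain substring test avoids matching an ID that merely appears inside a longer identifier.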

Extract-By-Taxa Command-line Arguments

Extract-By-Taxa -h

usage: Extract_By_Taxa.py [-h] -d DIRECTORY -t TAXON -o OUTPUT -func
                          FUNCTIONAL_CLASSES [-top TOP_FUNCTIONS]

MetaPont: Extract Top Functions by Taxon

options:
  -h, --help            show this help message and exit
  -d DIRECTORY, --directory DIRECTORY
                        Directory containing TSV files to analyse.
  -t TAXON, --taxon TAXON
                        Target taxon to search for (e.g., 'g__Bacillus').
  -o OUTPUT, --output OUTPUT
                        Output file to save results.
  -func FUNCTIONAL_CLASSES, --functional_classes FUNCTIONAL_CLASSES
                        Which functional classes to report (e.g. GO,EC,KEGG
                        etc).
  -top TOP_FUNCTIONS, --top_functions TOP_FUNCTIONS
                        Top n functions to include in the output for each
                        sample (default: 3).

The Extract-By-Taxa tool provides several command-line options:

Option	Description	Required	Default
`-d`, `--directory`	Directory containing `_Final_Contig.tsv` files to analyse.	Yes	None
`-t`, `--taxon`	Taxa to search for (e.g., `g__Bacillus`).	Yes	None
`-func`, `--functional_classes`	Functional classes to report (e.g. GO,EC,KEGG etc).	Yes	None
`-top`, `--top_functions`	Number of functions to report for each sample.	No	3
`-o`, `--output`	Output file name to save results.	Yes	None

Example

To search for the top reported functions for taxon g__Bacillus in all _Final_Contig.tsv files within the test_data/ directory:

Extract-By-Taxa -d .../test_data/Final_Contig -t g__Bacillus -o .../test_data/Final_Contig/Extract_By_Taxa/results.tsv -func GO

Output

The tool generates a tab-delimited output file with the following columns:

Sample: Name of the processed sample.
Function: Reported 'top' function.
Num of Assignments: Number of times the function has been assigned across all contigs reported as the chosen taxon.

Example output:

Selected Taxon: g__Bacillus
Sample	Function	Num of Assignments
PN0536_0001_S1	GO:0008150	296
PN0536_0001_S1	GO:0003674	285
PN0536_0001_S1	GO:0005575	254
PN0536_0003_S83	GO:0005575	45
PN0536_0003_S83	GO:0008150	44
PN0536_0003_S83	GO:0003674	43
PN0536_0002_S2	GO:0005575	5
PN0536_0002_S2	GO:0008150	5
PN0536_0002_S2	GO:0005623	4
PN0536_0004_S3	GO:0008150	4
PN0536_0004_S3	GO:0003674	3
PN0536_0004_S3	GO:0005488	3
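
Counting assignments per function and keeping the top n can be sketched with `collections.Counter`. This is an illustration under assumptions rather than the tool's code: the `GOs` column name, comma-separated values, and `-` as the placeholder for "no annotation" are assumptions, and `top_functions` is a hypothetical helper.

```python
from collections import Counter


def top_functions(rows, taxon, function_column, top_n=3):
    """Return the top_n (function, count) pairs for contigs of one taxon."""
    counts = Counter()
    for row in rows:
        # Keep only contigs whose lineage mentions the target taxon.
        if taxon not in row.get("Lineage", ""):
            continue
        for func in (row.get(function_column) or "").split(","):
            func = func.strip()
            # Skip empty entries and the assumed "no annotation" placeholder.
            if func and func != "-":
                counts[func] += 1
    return counts.most_common(top_n)


rows = [
    {"Lineage": "d__Bacteria|g__Bacillus", "GOs": "GO:0008150,GO:0003674"},
    {"Lineage": "d__Bacteria|g__Bacillus", "GOs": "GO:0008150"},
    {"Lineage": "d__Bacteria|g__Bacillus", "GOs": "-"},
    {"Lineage": "d__Bacteria|g__Lactobacillus", "GOs": "GO:0005575"},
]
print(top_functions(rows, "g__Bacillus", "GOs", top_n=2))
```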

Large File Handling (Might be a failure point)

The script uses csv.field_size_limit to raise Python's CSV field size limit, so that exceptionally large fields in .tsv files can be read.
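
A common pattern for raising the limit is shown below. It is a sketch of the general technique, not necessarily what MetaPont does: on some platforms `csv.field_size_limit(sys.maxsize)` raises `OverflowError`, so the value is reduced until it is accepted. The function name is hypothetical.

```python
import csv
import sys


def raise_csv_field_limit():
    """Raise the csv field size limit as far as the platform allows."""
    limit = sys.maxsize
    while True:
        try:
            csv.field_size_limit(limit)
            return limit
        except OverflowError:
            # The value does not fit in a C long on this platform; try smaller.
            limit //= 2


print(raise_csv_field_limit())
```

Calling this once before opening any file with `csv.reader` avoids the "field larger than field limit" error on very long fields.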

Release history

0.0.10 (this version)

Dec 27, 2025

0.0.9

Dec 27, 2025

0.0.8

Mar 12, 2025

0.0.7

Mar 7, 2025

0.0.6

Feb 28, 2025

0.0.5

Feb 27, 2025

0.0.3

Nov 28, 2024

0.0.2

Nov 24, 2024

0.0.1

Nov 23, 2024

Download files

Source Distribution

metapont-0.0.10.tar.gz (60.0 kB)

Uploaded Dec 27, 2025 Source

Built Distribution

metapont-0.0.10-py3-none-any.whl (51.6 kB)

Uploaded Dec 27, 2025 Python 3

File details

Details for the file metapont-0.0.10.tar.gz.

File metadata

Download URL: metapont-0.0.10.tar.gz
Upload date: Dec 27, 2025
Size: 60.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for metapont-0.0.10.tar.gz
Algorithm	Hash digest
SHA256	`1e277c59758b49ae7646ca50a83ecc05c5397cd9f4f87466501b4338a5fc2088`
MD5	`bf6d6a608a7a27927d73abbd9941c641`
BLAKE2b-256	`97e8ace5a9b5847554ac12d9f89638613c88f236ad795685bf209e07ce7b0a07`

File details

Details for the file metapont-0.0.10-py3-none-any.whl.

File metadata

Download URL: metapont-0.0.10-py3-none-any.whl
Upload date: Dec 27, 2025
Size: 51.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for metapont-0.0.10-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5824e66f00886db956bc4ef813c1ae75f024143ceff7d92ecc95b768321e3dea`
MD5	`ecb8485e11bc7e245b00d4cd775470a6`
BLAKE2b-256	`c60ab8fdd4ab8fe44aebe2758d8d0b8c03f40c51184cfe6d541b6340318d4d93`
