MetaPont - A tool to bridge the gap between the output of metagenomic tools and the analysis of the data

MetaPont

Features (current aims of the project; still under development)

  • Targeted Functional Analysis: Search for specific functional IDs (e.g., GO terms) within the .tsv files provided by the HuwsLab Metagenome Workflow (https://github.com/TheHuwsLab/Metagenome_Workflow).
  • Taxonomic Breakdown: Extract genus-level taxonomy information and calculate each genus's proportion in the dataset.
  • Batch Processing: Analyse all .tsv files in a specified directory.
  • Customisable Output: Save results in a format suitable for downstream analysis.

Installation

Prerequisites

Ensure you have the following installed:

  • Python 3.6 or later
  • Python libraries: argparse, csv, and collections. All three are part of the standard library, so nothing extra needs to be installed.

Installation via pip

MetaPont is provided as a pip distribution.

pip install MetaPont 

Usage

Command-line Arguments

Extract-By-Function -h

usage: Extract-By-Function [-h] -d DIRECTORY -f FUNCTION_ID [-o OUTPUT] [-m MIN_PROPORTION]

MetaPont v0.0.2: Extract-By-Function - Identify taxa contributing to a specific function.

options:
  -h, --help            show this help message and exit
  -d DIRECTORY, --directory DIRECTORY
                        Directory containing TSV files to analyse.
  -f FUNCTION_ID, --function_id FUNCTION_ID
                        Specific function ID to search for (e.g., 'GO:0002').
  -o OUTPUT, --output OUTPUT
                        Output file to save results (default: output_taxa_details.tsv).
  -m MIN_PROPORTION, --min_proportion MIN_PROPORTION
                        Minimum proportion threshold for taxa to be included in the output (default: 0.05).

The Extract-By-Function tool provides several command-line options:

Option                  Description                                   Required   Default
-d, --directory         Directory containing .tsv files to analyse.   Yes        None
-f, --function_id       Functional ID to search for (e.g., GO:0002).  Yes        None
-m, --min_proportion    Minimum proportion needed for reporting.      No         0.05 (5%)
-o, --output            Output file name to save results.             No         output_taxa_details.tsv
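
For readers who want to see how these options fit together, the following is a minimal sketch of an argparse parser that reproduces the usage line and defaults shown in the help text. It is not taken from the MetaPont source; the function name build_parser is hypothetical, and the float type for --min_proportion is an assumption.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented Extract-By-Function options."""
    parser = argparse.ArgumentParser(
        prog="Extract-By-Function",
        description="Identify taxa contributing to a specific function.",
    )
    # -d and -f are the two required options in the usage line.
    parser.add_argument("-d", "--directory", required=True,
                        help="Directory containing TSV files to analyse.")
    parser.add_argument("-f", "--function_id", required=True,
                        help="Specific function ID to search for (e.g., 'GO:0002').")
    # -o and -m are optional; defaults follow the help text.
    parser.add_argument("-o", "--output", default="output_taxa_details.tsv",
                        help="Output file to save results.")
    # Assumption: the threshold is parsed as a float.
    parser.add_argument("-m", "--min_proportion", type=float, default=0.05,
                        help="Minimum proportion threshold for taxa to be included.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-d", "data/", "-f", "GO:0002", "-m", "0.10"])
print(args.directory, args.function_id, args.min_proportion, args.output)
```

Omitting -m or -o falls back to the defaults, which matches the square brackets around those options in the usage line.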

Example

To search for the functional ID GO:0002 in all .tsv files within a directory, reporting only taxa with a proportion of at least 0.10, and write the results to a chosen output file:

Extract-By-Function -d .../test_data/Final_contig/ -f GO:0002 -m 0.10 -o .../test_data/Final_Contig/Extract_By_Function_Out/results.tsv

Output

The tool generates a tab-delimited output file with the following columns:

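  1. Sample: Name of the processed .tsv file.
  2. Taxa: Genus-level taxonomic assignment extracted from the Lineage column.
  3. Reads Assigned (Function): Number of reads assigned to the taxon for the given functional ID.
  4. Proportion (Function): The taxon's share of all matches to the given functional ID within the sample.
  5. Proportion (Total Reads): The taxon's matching reads as a share of all reads in the sample.

The first line of the file records the functional ID that was searched for.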

Example output:

Function ID: GO:0002
Sample	Taxa	Reads Assigned (Function)	Proportion (Function)	Proportion (Total Reads)
PN0536_0003_S83_Final_Contig.tsv	Gordonibacter	60788	0.075	0.002
PN0536_0003_S83_Final_Contig.tsv	Streptomyces	115671	0.142	0.004
PN0536_0003_S83_Final_Contig.tsv	unknown	80890	0.099	0.003
PN0536_0003_S83_Final_Contig.tsv	Clostridium	51018	0.063	0.002
PN0536_0003_S83_Final_Contig.tsv	Lactobacillus	149909	0.184	0.005
PN0536_0003_S83_Final_Contig.tsv	Limosilactobacillus	79694	0.098	0.003
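
For downstream analysis, the output can be loaded with the standard csv module. The sketch below assumes the layout shown in the example (one "Function ID:" line, then a tab-delimited header and rows); the helper name read_results and the dictionary keys are hypothetical.

```python
import csv
import io

# Two rows copied from the example output above.
EXAMPLE = (
    "Function ID: GO:0002\n"
    "Sample\tTaxa\tReads Assigned (Function)\tProportion (Function)\tProportion (Total Reads)\n"
    "PN0536_0003_S83_Final_Contig.tsv\tStreptomyces\t115671\t0.142\t0.004\n"
    "PN0536_0003_S83_Final_Contig.tsv\tLactobacillus\t149909\t0.184\t0.005\n"
)


def read_results(handle):
    """Return (function_id, rows) from an Extract-By-Function style output."""
    # The first line holds the function ID, e.g. 'Function ID: GO:0002'.
    function_id = handle.readline().strip().split(": ", 1)[1]
    rows = []
    # The remaining lines are a tab-delimited table with a header row.
    for row in csv.DictReader(handle, delimiter="\t"):
        rows.append({
            "sample": row["Sample"],
            "taxa": row["Taxa"],
            "reads": int(row["Reads Assigned (Function)"]),
            "prop_function": float(row["Proportion (Function)"]),
            "prop_total": float(row["Proportion (Total Reads)"]),
        })
    return function_id, rows


function_id, rows = read_results(io.StringIO(EXAMPLE))
top = max(rows, key=lambda r: r["prop_function"])
print(function_id, top["taxa"], top["prop_function"])  # GO:0002 Lactobacillus 0.184
```

To read a real results file, pass an open file handle instead of the io.StringIO object.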

Implementation Details

Workflow

  1. The script reads .tsv files from the specified directory.
  2. For each file, it searches for occurrences of the given functional ID within specific columns.
  3. Matches are associated with genus-level taxonomic information extracted from the Lineage column.
  4. Taxa proportions are calculated and saved to the output file.
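
The steps above can be sketched for a single file as follows. This is an illustration under stated assumptions, not the MetaPont implementation: the functional column name "GO", the comma-separated ID list, and the "g__" genus prefix in the Lineage column are all hypothetical, and each matching row is counted once rather than weighted by reads.

```python
import csv
import io
from collections import Counter


def genus_from_lineage(lineage):
    """Extract the genus from a lineage string (assumed 'rank__name;...' format)."""
    for rank in lineage.split(";"):
        rank = rank.strip()
        if rank.startswith("g__") and len(rank) > 3:
            return rank[3:]
    return "unknown"  # no genus-level assignment found


def taxa_proportions(handle, function_id, function_column="GO", min_proportion=0.05):
    """Return {genus: proportion of matching rows} at or above min_proportion."""
    counts = Counter()
    for row in csv.DictReader(handle, delimiter="\t"):
        # Assumption: IDs in the functional column are comma-separated.
        ids = [i.strip() for i in (row.get(function_column) or "").split(",")]
        if function_id in ids:
            counts[genus_from_lineage(row.get("Lineage") or "")] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {g: n / total for g, n in counts.items() if n / total >= min_proportion}


# Tiny made-up input for illustration only.
SAMPLE = (
    "Contig\tLineage\tGO\n"
    "c1\td__Bacteria;g__Lactobacillus\tGO:0002,GO:0003\n"
    "c2\td__Bacteria;g__Lactobacillus\tGO:0002\n"
    "c3\td__Bacteria;g__Clostridium\tGO:0002\n"
    "c4\td__Bacteria\tGO:0002\n"
    "c5\td__Bacteria;g__Streptomyces\tGO:0009\n"
)

result = taxa_proportions(io.StringIO(SAMPLE), "GO:0002", min_proportion=0.3)
print(result)  # {'Lactobacillus': 0.5}
```

Matching whole IDs rather than substrings avoids counting, for example, GO:00021 as a hit for GO:0002.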

Large File Handling (potential failure point)

The script calls csv.field_size_limit to raise the csv module's maximum field size, so that .tsv files with exceptionally large fields can be parsed.
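
A common way to use csv.field_size_limit is sketched below. The value MetaPont actually sets is not stated here, so the choice of sys.maxsize with a fallback is an assumption, and the helper name raise_field_size_limit is hypothetical.

```python
import csv
import io
import sys


def raise_field_size_limit():
    """Set the csv field size limit as high as the platform allows."""
    limit = sys.maxsize
    while True:
        try:
            csv.field_size_limit(limit)
            return limit
        except OverflowError:
            # On platforms where a C long is 32-bit, sys.maxsize is too large;
            # shrink the value until it is accepted.
            limit //= 10


applied = raise_field_size_limit()

# A 200,000-character field exceeds the default limit of 131072 and would
# raise csv.Error if the limit had not been raised first.
big = "id\tseq\n1\t" + "A" * 200_000 + "\n"
rows = list(csv.reader(io.StringIO(big), delimiter="\t"))
print(len(rows[1][1]))  # 200000
```

The limit is global to the csv module, so it only needs to be set once before any files are read.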


Future Plans

  • Add support for additional file formats (e.g., .csv, .txt).
  • Expand functionality for more complex taxonomic and functional analyses.
