MetaPont - A tool to bridge the gap between the output of metagenomic tools and the analysis of the data

MetaPont

Features (current aims of the project, which is still under development)

  • Targeted Functional Analysis: Search for specific functional IDs (e.g., GO terms) within the .tsv files produced by the HuwsLab Metagenome Workflow (https://github.com/TheHuwsLab/Metagenome_Workflow).
  • Taxonomic Breakdown: Extract genus-level taxonomic assignments and calculate the proportion of each genus in the dataset.
  • Batch Processing: Analyse all .tsv files in a specified directory.
  • Customisable Output: Save results in a format suitable for downstream analysis.

Installation

Prerequisites

Ensure you have the following installed:

  • Python 3.6 or later
  • No third-party packages: the tool relies only on standard-library modules (argparse, csv, and collections).

Installation via pip

MetaPont is distributed on PyPI and can be installed with pip:

pip install MetaPont 

Usage

Command-line Arguments

The Extract-By-Function tool provides several command-line options:

Option                 Description                                    Required  Default
-d, --directory        Directory containing .tsv files to analyse.    Yes       None
-f, --function_id      Functional ID to search for (e.g., GO:0002).   Yes       None
-m, --min_proportion   Minimum proportion needed for reporting.       Yes       0.05 (5%)
-o, --output           Output file name to save results.              No        output_taxa_proportions.tsv
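
As a rough illustration of how these options fit together, the sketch below declares them with argparse. It is not MetaPont's own source: the function name build_parser and the help strings are hypothetical. The table lists -m as required while also giving a default, so this sketch assumes the default of 0.05 applies when the flag is omitted.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options (not MetaPont's code)."""
    parser = argparse.ArgumentParser(prog="ExtractByFunction")
    parser.add_argument("-d", "--directory", required=True,
                        help="Directory containing .tsv files to analyse.")
    parser.add_argument("-f", "--function_id", required=True,
                        help="Functional ID to search for (e.g., GO:0002).")
    # Assumption: the documented default is used when -m is not supplied.
    parser.add_argument("-m", "--min_proportion", type=float, default=0.05,
                        help="Minimum proportion needed for reporting.")
    parser.add_argument("-o", "--output", default="output_taxa_proportions.tsv",
                        help="Output file name to save results.")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-d", "data", "-f", "GO:0002", "-m", "0.10"])
print(args.directory, args.function_id, args.min_proportion, args.output)
```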

Example

To search for the functional ID GO:0002 in all .tsv files within a directory, reporting only taxa with a proportion of at least 0.10:

ExtractByFunction -d .../test_data/Final_contig/ -f GO:0002 -m 0.10 -o .../test_data/Final_Contig/Extract_By_Function_Out/results.tsv

Output

The tool generates a tab-delimited output file with the following columns:

  1. Sample: Name of the processed .tsv file.
  2. Taxa: Genus-level taxonomic assignment extracted from the Lineage column.
  3. Proportion: Fraction of the sample's matches to the given functional ID that are assigned to this taxon.

Example output:

Function ID: GO:0002
Sample	Taxa	Proportion
sample1.tsv	Escherichia	0.542857
sample1.tsv	Salmonella	0.457143
sample2.tsv	Bacillus	0.650000
sample2.tsv	Clostridium	0.350000
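
For downstream analysis, a file in this layout can be read back with the standard csv module. The sketch below assumes the file looks exactly like the example above (one "Function ID:" line followed by a tab-delimited table with a header row). The function name read_results is hypothetical and not part of MetaPont.

```python
import csv
import io
from collections import defaultdict


def read_results(handle):
    """Return (function_id, {sample: {taxon: proportion}}) from an open results file."""
    first = handle.readline().strip()
    if not first.startswith("Function ID:"):
        raise ValueError("expected a 'Function ID:' line at the top of the file")
    function_id = first.split(":", 1)[1].strip()
    results = defaultdict(dict)
    # The remaining lines are a tab-delimited table with a header row.
    for row in csv.DictReader(handle, delimiter="\t"):
        results[row["Sample"]][row["Taxa"]] = float(row["Proportion"])
    return function_id, dict(results)


example = io.StringIO(
    "Function ID: GO:0002\n"
    "Sample\tTaxa\tProportion\n"
    "sample1.tsv\tEscherichia\t0.542857\n"
    "sample1.tsv\tSalmonella\t0.457143\n"
)
function_id, table = read_results(example)
print(function_id, table)
```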

Implementation Details

Workflow

  1. The script reads .tsv files from the specified directory.
  2. For each file, it searches for occurrences of the given functional ID within specific columns.
  3. Matches are associated with genus-level taxonomic information extracted from the Lineage column.
  4. Taxa proportions are calculated and saved to the output file.
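
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual implementation: it assumes the genus appears in the Lineage column as a semicolon-separated entry prefixed with "g__", it searches every column other than Lineage (the real tool searches specific columns), and it uses simple substring matching. All function names are hypothetical.

```python
import csv
import glob
import os
from collections import Counter


def genus_from_lineage(lineage):
    """Assumption: genus is encoded as 'g__Name' in a ';'-separated lineage string."""
    for rank in lineage.split(";"):
        rank = rank.strip()
        if rank.startswith("g__") and len(rank) > 3:
            return rank[3:]
    return "Unclassified"


def taxa_proportions(tsv_path, function_id, min_proportion=0.05):
    """Count genera among rows mentioning function_id and return their proportions."""
    counts = Counter()
    with open(tsv_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            # Substring match across all non-Lineage columns (a simplification).
            if any(function_id in (value or "")
                   for key, value in row.items() if key != "Lineage"):
                counts[genus_from_lineage(row.get("Lineage") or "")] += 1
    total = sum(counts.values())
    # Keep only genera at or above the reporting threshold.
    return {genus: n / total for genus, n in counts.items()
            if n / total >= min_proportion}


def process_directory(directory, function_id, min_proportion=0.05):
    """Apply taxa_proportions to every .tsv file in the directory."""
    results = {}
    for path in sorted(glob.glob(os.path.join(directory, "*.tsv"))):
        results[os.path.basename(path)] = taxa_proportions(
            path, function_id, min_proportion)
    return results
```

Note that substring matching would also match longer IDs that begin with the query (for example, a search for GO:0002 would match GO:00021), so a stricter comparison may be preferable in practice.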

Large File Handling (a potential failure point)

The script uses csv.field_size_limit to raise the csv module's maximum field size so that .tsv files with exceptionally large fields can be read.
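
One common way to raise the limit safely is sketched below. This is a general pattern and not necessarily what MetaPont does: on some platforms, passing sys.maxsize to csv.field_size_limit raises OverflowError, so the sketch reduces the value until it is accepted. The function name raise_field_size_limit is hypothetical.

```python
import csv
import sys


def raise_field_size_limit():
    """Set the csv field size limit as high as the platform allows and return it."""
    limit = sys.maxsize
    while True:
        try:
            csv.field_size_limit(limit)
            return limit
        except OverflowError:
            # The value does not fit in a C long on this platform; try a smaller one.
            limit //= 10


applied = raise_field_size_limit()
print(applied)
```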


Future Plans

  • Add support for additional file formats (e.g., .csv, .txt).
  • Expand functionality for more complex taxonomic and functional analyses.
