
A portable and modular meta-predictor for identifying Long Non-coding RNAs (lncRNAs).


metaLncRNA v2.0.0 🧬🤖


DOI University: UMC Laboratory: LaBiOmicS Bioinformatics

PyPI Version Open Source Open Science Open Data License: MIT JOSS Status CI Status



metaLncRNA is a modular, high-performance Python framework designed to identify Long Non-coding RNAs (lncRNAs) by orchestrating an ensemble of seven diverse computational tools. It resolves the "reproducibility gap" by automating environment management and providing a robust consensus prediction through weighted soft-voting.




📂 Repository Structure

.
├── conda/                   # Bioconda recipe and metadata
├── deploy/                  # Containerization (Dockerfile, Singularity.def)
├── docs/                    # Technical documentation and user guides
├── examples/                # Quick-start samples (FASTA, config templates)
├── galaxy/                  # Galaxy Tool wrapper and test data
├── INPI_Registration/       # Legal software registration assets
├── paper/                   # JOSS publication manuscript and bibliography
├── scripts/                 # Bash scripts for HPC/Batch processing
├── src/
│   └── metalncrna/          # Main Python Package
│       ├── cli.py           # Command-line interface entry point
│       ├── adapters/        # Wrappers for 7 lncRNA predictors
│       ├── engine/          # Core logic (Consensus, Dispatcher, Trainer)
│       ├── utils/           # AI Agent, Env management, Reports, FASTA handling
│       ├── data/            # Built-in weights and default configurations
│       └── third_party/     # Bundled legacy tools (CNCI, CPPred, LGC)
├── tests/                   # Comprehensive Unit and Integration tests
├── pyproject.toml           # Build system and dependency definitions
└── pixi.toml                # Environment management configuration

🧩 Core Components Detail

  • src/metalncrna/adapters/: Orchestrates external tools like RNAsamba, CPAT, CPC2, etc., providing a unified interface for prediction.
  • src/metalncrna/engine/:
    • consensus.py: Implements the weighted soft-voting algorithm.
    • dispatcher.py: Manages parallel execution of the ensemble.
  • src/metalncrna/utils/agent.py: Integrates with local LLMs (Ollama) for automated biological interpretation of results.
  • galaxy/: Allows metaLncRNA to be integrated into Galaxy instances, supporting reproducible web-based workflows.
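
As an illustration of the weighted soft-voting idea behind consensus.py, here is a minimal sketch. It is not the package's actual code: the function name, the 0.5 threshold, the label strings, and the example scores and weights are all assumptions made for this example.

```python
from typing import Dict, Tuple


def weighted_soft_vote(
    probs: Dict[str, float],
    weights: Dict[str, float],
    threshold: float = 0.5,  # ASSUMPTION: cut-off chosen for illustration
) -> Tuple[float, str]:
    """Combine per-tool non-coding probabilities into one weighted score."""
    # Use only tools that reported a probability and carry a positive weight.
    used = [tool for tool in probs if weights.get(tool, 0.0) > 0.0]
    if not used:
        raise ValueError("no tool with a positive weight produced a score")

    # Weighted mean of the probabilities, normalised by the weights in use.
    total_weight = sum(weights[tool] for tool in used)
    score = sum(weights[tool] * probs[tool] for tool in used) / total_weight

    # ASSUMPTION: label strings are hypothetical, not the package's own.
    label = "lncRNA" if score >= threshold else "coding"
    return score, label


# Hypothetical probabilities and weights, for illustration only.
probs = {"RNAsamba": 0.9, "CPAT": 0.8, "CPC2": 0.4}
weights = {"RNAsamba": 2.0, "CPAT": 1.0, "CPC2": 1.0}
score, label = weighted_soft_vote(probs, weights)
print(f"{score:.2f} {label}")
```

Normalising by the weights of the tools that actually returned a score keeps the result in the 0 to 1 range even when one tool fails, which is one plausible way to handle partial ensembles.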

🔧 Recent Fixes (v2.0.0)

  • CNCI Stability: Fixed a critical hang in the CNCI legacy tool caused by non-canonical nucleotides (e.g., K, V, M) and a multiprocessing deadlock in the original Python 2.7 implementation.
  • Improved Filtering: Implemented rigorous FASTA validation in the CNCI adapter to exclude sequences with ambiguous characters, preventing KeyError crashes.
  • Resource Optimization: The dispatcher now limits CNCI to 4 threads, which reduces I/O overhead and improves performance on small-to-medium files.
  • Debug Resiliency: The dispatcher now preserves intermediate files automatically if a tool failure occurs, facilitating troubleshooting.
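
The FASTA validation described above can be sketched roughly as follows. This is a simplified stand-in, not the CNCI adapter's real code: the helper names are hypothetical, and the allowed alphabet (A, C, G, T, U, case-insensitive) is an assumption.

```python
from typing import Iterable, Iterator, List, Tuple

# ASSUMPTION: only these nucleotides count as canonical in this sketch.
ALLOWED = set("ACGTU")


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_alphabet(
    records: Iterable[Tuple[str, str]],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Separate clean records from those with ambiguous or empty sequences."""
    kept, rejected = [], []
    for header, seq in records:
        if seq and set(seq.upper()) <= ALLOWED:
            kept.append((header, seq))
        else:
            rejected.append(header)
    return kept, rejected


example = """>tx1
ACGTACGT
>tx2
ACGKVMAC
>tx3
acgu
""".splitlines()

kept, rejected = split_by_alphabet(parse_fasta(example))
print([h for h, _ in kept], rejected)
```

Returning the rejected headers alongside the kept records makes it possible to report which sequences were skipped instead of dropping them silently.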

🔧 Previous Fixes (v1.2.1)

  • Consensus Logic: Updated consensus_support to reflect the number of tools that agree with the final consensus label, providing better interpretability.
  • CPC2 Integration: Fixed a critical parsing error where coding probability and label columns were mismatched (v1.1.8).
  • Cleanup: Removed unimplemented/experimental adapters to ensure stability.
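
To make the consensus_support definition concrete, the following sketch counts the tools whose individual call matches the final consensus label. The function signature, the label strings, and the per-tool calls are hypothetical and serve only as an example.

```python
from typing import Dict


def consensus_support(tool_labels: Dict[str, str], final_label: str) -> int:
    """Count the tools whose own label equals the final consensus label."""
    return sum(1 for label in tool_labels.values() if label == final_label)


# Hypothetical per-tool calls for a single transcript.
tool_labels = {
    "RNAsamba": "lncRNA",
    "CPAT": "lncRNA",
    "CPC2": "coding",
    "PLEK": "lncRNA",
    "CNCI": "lncRNA",
    "CPPred": "coding",
    "LGC": "lncRNA",
}
support = consensus_support(tool_labels, "lncRNA")
print(f"{support}/{len(tool_labels)} tools agree with the consensus")
```

Under this reading, a value of 5 out of 7 means five tools individually made the same call as the ensemble, regardless of how strongly each one was weighted.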

⚙️ Configuration

metaLncRNA looks for configuration in the following order:

  1. Internal Defaults: Built-in weights and paths in src/metalncrna/data/default_config.yaml.
  2. Local Config: metaLncRNA_config.yaml in your current working directory.
  3. User Home: ~/.metalncrna/config.yaml.
  4. Explicit Path: Provided via the -c or --config flag.
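
The list above gives the lookup order but does not say which source wins when the same key appears twice. The sketch below assumes that later entries override earlier ones, so an explicit -c path would take precedence over the built-in defaults. The function names and the example keys are hypothetical, and parsing the YAML itself is left out to keep the example dependency-free.

```python
from pathlib import Path
from typing import Any, Dict, List, Optional


def candidate_config_paths(
    package_dir: Path,
    cwd: Path,
    home: Path,
    explicit: Optional[Path] = None,
) -> List[Path]:
    """Return the configuration locations in the documented order."""
    paths = [
        package_dir / "data" / "default_config.yaml",  # 1. internal defaults
        cwd / "metaLncRNA_config.yaml",                # 2. local config
        home / ".metalncrna" / "config.yaml",          # 3. user home
    ]
    if explicit is not None:
        paths.append(explicit)                         # 4. -c / --config
    return paths


def merge_configs(layers: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Shallow-merge parsed configs in order."""
    merged: Dict[str, Any] = {}
    for layer in layers:
        # ASSUMPTION: a later layer overrides keys set by an earlier one.
        merged.update(layer)
    return merged


paths = candidate_config_paths(
    Path("src/metalncrna"), Path.cwd(), Path.home(), Path("my_run.yaml")
)

# Hypothetical parsed contents of two layers (keys are illustrative only).
defaults = {"threads": 1, "weights": {"CPAT": 1.0}}
user = {"threads": 8}
merged = merge_configs([defaults, user])
print(merged["threads"])
```

If the real loader instead stops at the first file it finds, the same path list still applies and only the merge step would change.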

🚀 Key Features

  • Ensemble Prediction: Combines 7 tools (RNAsamba, CPAT, CPC2, PLEK, CNCI, CPPred, LGC).
  • Interactive AI Agent: Integrated local LLM assistant (Llama-3.2 or OpenBioLLM) to interpret results and explain classification decisions.
  • Reproducibility First: Built-in environment isolation via Mamba and Pixi.
  • Standardized Reports: Comprehensive TSV reports with tool congruence metrics.
  • Publication Ready: Formatted according to JOSS standards for scientific software.

