A portable and modular meta-predictor for identifying Long Non-coding RNAs (lncRNAs).
metaLncRNA v1.1.8
metaLncRNA is a modular, high-performance Python framework designed to identify Long Non-coding RNAs (lncRNAs) by orchestrating an ensemble of seven diverse computational tools. It resolves the "reproducibility gap" by automating environment management and providing a robust consensus prediction through weighted soft-voting.
Repository Structure
.
├── conda/                 # Bioconda recipe and metadata
├── deploy/                # Containerization (Dockerfile, Singularity.def)
├── docs/                  # Technical documentation and user guides
├── examples/              # Quick-start samples (FASTA, config templates)
├── galaxy/                # Galaxy Tool wrapper and test data
├── INPI_Registration/     # Legal software registration assets
├── paper/                 # JOSS publication manuscript and bibliography
├── scripts/               # Bash scripts for HPC/Batch processing
├── src/
│   └── metalncrna/        # Main Python Package
│       ├── cli.py         # Command-line interface entry point
│       ├── adapters/      # Wrappers for 7 lncRNA predictors
│       ├── engine/        # Core logic (Consensus, Dispatcher, Trainer)
│       ├── utils/         # AI Agent, Env management, Reports, FASTA handling
│       ├── data/          # Built-in weights and default configurations
│       └── third_party/   # Bundled legacy tools (CNCI, CPPred, LGC)
├── tests/                 # Comprehensive Unit and Integration tests
├── pyproject.toml         # Build system and dependency definitions
└── pixi.toml              # Environment management configuration
Core Components Detail
- src/metalncrna/adapters/: Orchestrates external tools such as RNAsamba, CPAT, and CPC2, providing a unified interface for prediction.
- src/metalncrna/engine/:
  - consensus.py: Implements the weighted soft-voting algorithm.
  - dispatcher.py: Manages parallel execution of the ensemble.
- src/metalncrna/utils/agent.py: Integrates with local LLMs (Ollama) for automated biological interpretation of results.
- galaxy/: Allows metaLncRNA to be integrated into Galaxy instances, supporting reproducible web-based workflows.
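To make the weighted soft-voting idea concrete, here is a minimal sketch of the general technique. It is not the code in consensus.py: the function name, the weights, the probabilities, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from typing import Dict, Tuple


def weighted_soft_vote(
    probabilities: Dict[str, float],
    weights: Dict[str, float],
    threshold: float = 0.5,
) -> Tuple[float, str]:
    """Combine per-tool lncRNA probabilities into one weighted score.

    Hypothetical helper: the real package may use different inputs,
    weights, and decision rules.
    """
    # Only tools that returned a probability and have a weight take part.
    tools = [tool for tool in probabilities if tool in weights]
    total_weight = sum(weights[tool] for tool in tools)
    if total_weight <= 0:
        raise ValueError("no usable tool predictions")
    # Weighted average of the per-tool probabilities.
    score = sum(weights[tool] * probabilities[tool] for tool in tools) / total_weight
    label = "lncRNA" if score >= threshold else "coding"
    return score, label


# Made-up probabilities and weights, for illustration only.
probs = {"RNAsamba": 0.9, "CPAT": 0.6, "CPC2": 0.3}
weights = {"RNAsamba": 2.0, "CPAT": 1.0, "CPC2": 1.0}
score, label = weighted_soft_vote(probs, weights)
print(f"{score:.3f} {label}")
```

Dividing by the sum of the weights of the tools that actually reported keeps the score in the 0 to 1 range even when one tool fails to produce a result.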
Recent Fixes (v1.1.8)
- CPC2 Integration: Fixed a critical parsing error where coding probability and label columns were mismatched in the final report.
- Consensus Accuracy: Improved consensus support calculation by ensuring all functional tools are correctly accounted for.
- Cleanup: Removed unimplemented/experimental adapters to ensure stability.
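A common way to avoid the kind of column mismatch described in the CPC2 fix is to look columns up by header name instead of by position. The sketch below illustrates that approach only; it is not the project's actual patch. The header layout is an assumption modeled on typical CPC2 tabular output, and the data row is made up.

```python
import csv
import io
from typing import Dict, Tuple


def parse_cpc2(text: str) -> Dict[str, Tuple[float, str]]:
    """Map transcript ID -> (coding_probability, label), resolved by header name.

    Hypothetical helper; the column names are assumptions about the
    CPC2 output format.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    # The header line is assumed to start with '#', e.g. '#ID'.
    header = [name.lstrip("#") for name in next(reader)]
    id_col = header.index("ID")
    prob_col = header.index("coding_probability")
    label_col = header.index("label")
    results: Dict[str, Tuple[float, str]] = {}
    for row in reader:
        if not row:  # skip blank lines
            continue
        results[row[id_col]] = (float(row[prob_col]), row[label_col])
    return results


# Made-up example row, for illustration only.
sample = (
    "#ID\ttranscript_length\tpeptide_length\tFickett_score\tpI\t"
    "ORF_integrity\tcoding_probability\tlabel\n"
    "tx1\t1200\t45\t0.31\t9.8\t1\t0.02\tnoncoding\n"
)
parsed = parse_cpc2(sample)
print(parsed)
```

Because `header.index` raises `ValueError` when a column is missing, a changed output format fails loudly instead of silently swapping the probability and label fields.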
Configuration
metaLncRNA loads its configuration from the following sources, in this order:
- Internal Defaults: Built-in weights and paths in src/metalncrna/data/default_config.yaml.
- Local Config: metaLncRNA_config.yaml in your current working directory.
- User Home: ~/.metalncrna/config.yaml.
- Explicit Path: Provided via the -c or --config flag.
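A layered loading order like this is often implemented as a chain of dictionary merges. The sketch below assumes that sources later in the list override earlier ones, which the text above does not state explicitly. The function names and the `weights` and `threads` keys are hypothetical, and YAML parsing is left out so the example stays dependency-free.

```python
from pathlib import Path
from typing import Any, Dict, List, Optional


def candidate_config_paths(explicit: Optional[str] = None) -> List[Path]:
    """User-facing config locations in the documented order (hypothetical helper)."""
    paths = [
        Path.cwd() / "metaLncRNA_config.yaml",        # local config
        Path.home() / ".metalncrna" / "config.yaml",  # user home
    ]
    if explicit:
        paths.append(Path(explicit))                  # -c / --config
    return paths


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of base with override applied, recursing into nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(layers: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Fold already-parsed config layers together; later layers win (assumption)."""
    config: Dict[str, Any] = {}
    for layer in layers:
        config = deep_merge(config, layer)
    return config


# Made-up layers standing in for parsed YAML files.
defaults = {"weights": {"CPAT": 1.0, "CPC2": 1.0}, "threads": 1}
local = {"weights": {"CPC2": 0.5}}
explicit = {"threads": 8}
config = resolve_config([defaults, local, explicit])
print(config)
```

A deep merge lets a user file override a single weight without having to repeat every other default.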
Key Features
- Ensemble Prediction: Combines 7 tools (RNAsamba, CPAT, CPC2, PLEK, CNCI, CPPred, LGC).
- Interactive AI Agent: Integrated local LLM assistant (Llama-3.2 or OpenBioLLM) to interpret results and explain classification decisions.
- Reproducibility First: Built-in environment isolation via Mamba and Pixi.
- Standardized Reports: Comprehensive TSV reports with tool congruence metrics.
- Publication Ready: Formatted according to JOSS standards for scientific software.
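One simple way to express tool congruence is the fraction of tools that agree with the majority call, counting only tools that actually produced a label. The sketch below shows that idea under stated assumptions: the function name and label strings are hypothetical, the per-tool calls are made up, and the metric reported by metaLncRNA may be defined differently. It uses an unweighted majority, which is a separate calculation from the weighted soft-voting consensus.

```python
from collections import Counter
from typing import Dict, Optional, Tuple


def tool_congruence(labels: Dict[str, Optional[str]]) -> Tuple[str, float]:
    """Return the majority label and the share of reporting tools that agree.

    Hypothetical helper. Tools mapped to None (failed or skipped) are
    excluded from both the numerator and the denominator.
    """
    reported = [label for label in labels.values() if label is not None]
    if not reported:
        raise ValueError("no tool produced a label")
    # On a tie, Counter.most_common keeps the label that was seen first.
    majority, count = Counter(reported).most_common(1)[0]
    return majority, count / len(reported)


# Made-up per-tool calls for a single transcript.
calls = {
    "RNAsamba": "lncRNA",
    "CPAT": "lncRNA",
    "CPC2": "coding",
    "PLEK": "lncRNA",
    "CNCI": None,  # this tool did not return a result
    "CPPred": "lncRNA",
    "LGC": "coding",
}
majority, support = tool_congruence(calls)
print(majority, round(support, 3))
```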
Documentation
For detailed instructions, please refer to our Documentation Hub:
- User Guide: Installation, common commands, and AI Chat usage.
- Technical Architecture: Ensemble methodology and AI-driven interpretation layer.
- Troubleshooting: Common issues and hardware requirements.
Quick Start
1. Installation
Option A: via pip (Fastest)
We recommend using a virtual environment:
python3 -m venv venv
source venv/bin/activate
pip install "metalncrna[agent]"
metalncrna setup
Option B: via Conda / Mamba
Perfect for bioinformaticians using Bioconda:
# Create environment from the provided file
mamba env create -f environment.yml
conda activate metalncrna
# Finalize setup
metalncrna setup
2. Run Integrated Pipeline
metalncrna predict -i transcripts.fasta -o ./results -p MyAnalysis
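If you want to launch the same command from a Python script, for example when looping over several FASTA files, a thin wrapper around `subprocess` is usually enough. The sketch below only reuses the flags shown above (`-i`, `-o`, `-p`). The function names are hypothetical, and reading `-p` as a project name is an assumption based on the results path used in the next step.

```python
import subprocess
from typing import List


def build_predict_command(fasta: str, out_dir: str, project: str) -> List[str]:
    """Build the argument list for `metalncrna predict` (hypothetical helper)."""
    return ["metalncrna", "predict", "-i", fasta, "-o", out_dir, "-p", project]


def run_predict(fasta: str, out_dir: str, project: str) -> int:
    """Run the pipeline and return its exit code.

    Requires the metalncrna CLI to be installed and on PATH.
    """
    completed = subprocess.run(
        build_predict_command(fasta, out_dir, project), check=False
    )
    return completed.returncode


# Only builds and prints the command; call run_predict(...) to execute it.
command = build_predict_command("transcripts.fasta", "./results", "MyAnalysis")
print(" ".join(command))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.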
3. Ask the AI Agent
# Get a summary of your findings
metalncrna ask "Summarize the analysis results" -r ./results/MyAnalysis/metalncrna_results.tsv
Deployment
Pre-configured definitions are available for Docker and Singularity/Apptainer in the deploy/ directory.
Contributing
Contributions are welcome! Please see our CONTRIBUTING.md for details.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Developed by LaBiOmicS - Laboratory of Bioinformatics and Omics Sciences. Institution: Universidade de Mogi das Cruzes (UMC)