biomapper
A unified Python toolkit for biological data harmonization and ontology mapping. biomapper provides a single interface for standardizing identifiers and mapping between various biological ontologies, making multi-omic data integration more accessible and reproducible.
Features
Core Functionality
- ID Standardization: Unified interface for standardizing biological identifiers
- Ontology Mapping: Comprehensive ontology mapping using major biological databases
- Data Validation: Robust validation of input data and mappings
- Extensible Architecture: Easy integration of new data sources and mapping services
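One way to picture the extensible architecture is a small handler registry. The sketch below is an illustration only: `IdentifierHandler`, `HANDLERS`, `register`, and `UppercaseHandler` are hypothetical names invented for this example and are not taken from biomapper's API.

```python
from abc import ABC, abstractmethod


class IdentifierHandler(ABC):
    """Hypothetical base class; not part of biomapper's documented API."""

    @abstractmethod
    def standardize(self, identifiers: list[str]) -> dict[str, str]:
        """Map each raw identifier to its standardized form."""


# Hypothetical registry mapping a source name to its handler class.
HANDLERS: dict[str, type[IdentifierHandler]] = {}


def register(name: str):
    """Class decorator that records a handler under the given source name."""

    def decorator(cls: type[IdentifierHandler]) -> type[IdentifierHandler]:
        HANDLERS[name] = cls
        return cls

    return decorator


@register("uppercase-demo")
class UppercaseHandler(IdentifierHandler):
    """Toy handler: trims whitespace and upper-cases each identifier."""

    def standardize(self, identifiers: list[str]) -> dict[str, str]:
        return {raw: raw.strip().upper() for raw in identifiers}


# Look the handler up by name, instantiate it, and run it on sample input.
handler = HANDLERS["uppercase-demo"]()
result = handler.standardize([" p12345", "q67890 "])
print(result)
```

With this pattern, adding a new data source means writing one class and decorating it; callers keep looking handlers up by name.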
Supported Systems
ID Standardization Tools
- BridgeDb
- RefMet
- RaMP-DB
Ontology Mapping Services
- UMLS Metathesaurus
- Ontology Lookup Service (OLS)
- BioPortal
Installation
Development Setup
- Install Python 3.11 with pyenv (if not already installed):
# Install pyenv dependencies
sudo apt-get update
sudo apt-get install -y make build-essential libssl-dev zlib1g-dev \
libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev \
libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev python3-openssl
# Install pyenv
curl https://pyenv.run | bash
# Add to your shell configuration
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'command -v pyenv >/dev/null || export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init -)"' >> ~/.bashrc
# Reload shell configuration
source ~/.bashrc
# Install Python 3.11
pyenv install 3.11.7
pyenv local 3.11.7
- Install Poetry (if not already installed):
curl -sSL https://install.python-poetry.org | python3 -
# Add Poetry to your PATH
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
- Clone and set up the project:
git clone https://github.com/yourusername/biomapper.git
cd biomapper
# Install dependencies with Poetry
poetry install
Quick Start
# In a shell: activate Poetry's virtual environment
poetry shell
# Then, in a Python session or script:
from biomapper import AnalyteMetadata
from biomapper.standardization import BridgeDBHandler
# Initialize metadata handler
metadata = AnalyteMetadata()
# Create standardization handler
bridge_handler = BridgeDBHandler()
# Process identifiers
results = bridge_handler.standardize(["P12345", "Q67890"])
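The identifiers in the quick start look like UniProt-style accessions. As a sketch of the kind of input check that could run before standardization, the snippet below filters a list with the accession pattern published by UniProt. It is independent of biomapper: `split_valid` is a hypothetical helper, and nothing here reflects how biomapper's own validators work.

```python
import re

# Accession pattern published by UniProt; used here only as a pre-filter.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def split_valid(identifiers: list[str]) -> tuple[list[str], list[str]]:
    """Separate identifiers that match the accession pattern from the rest."""
    valid: list[str] = []
    invalid: list[str] = []
    for identifier in identifiers:
        candidate = identifier.strip()
        if UNIPROT_ACCESSION.fullmatch(candidate):
            valid.append(candidate)
        else:
            # Keep the original string so the caller can report it unchanged.
            invalid.append(identifier)
    return valid, invalid


valid, invalid = split_valid(["P12345", "Q67890", "not-an-id"])
print(valid, invalid)
```

Only the `valid` list would then be passed on to a handler, and the `invalid` list can be logged or returned to the user.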
Development
Using Poetry
# Activate virtual environment
poetry shell
# Run a command in the virtual environment
poetry run python script.py
# Add a new dependency
poetry add package-name
# Add a development dependency
poetry add --group dev package-name
# Update dependencies
poetry update
# Show currently installed packages
poetry show
# Build the package
poetry build
Running Tests
# Run tests
poetry run pytest
# Run tests with coverage
poetry run pytest --cov=biomapper
Code Quality
# Format code with black
poetry run black .
# Run linting
poetry run flake8 .
# Type checking
poetry run mypy .
Project Structure
biomapper/
├── biomapper/ # Main package directory
│ ├── core/ # Core functionality
│ │ ├── metadata.py # Metadata handling
│ │ └── validators.py # Data validation
│ ├── standardization/ # ID standardization components
│ ├── mapping/ # Ontology mapping components
│ ├── utils/ # Utility functions
│ └── schemas/ # Data schemas and models
├── tests/ # Test files
├── docs/ # Documentation
├── scripts/ # Utility scripts
├── pyproject.toml # Poetry configuration and dependencies
└── poetry.lock # Lock file for dependencies
License
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Support
For support, please open an issue in the GitHub issue tracker.
Roadmap
- Initial release with core functionality
- Add support for additional ontology services
- Implement caching layer
- Add batch processing capabilities
- Develop REST API interface
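The caching and batch-processing items can be illustrated with the standard library alone. This is a rough sketch under stated assumptions, not a planned design: `chunked`, `lookup`, and `standardize_in_batches` are hypothetical names, and `lookup` stands in for a remote call by simply upper-casing its input.

```python
from functools import lru_cache
from itertools import islice
from typing import Iterable, Iterator


def chunked(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


@lru_cache(maxsize=1024)
def lookup(identifier: str) -> str:
    """Placeholder for an expensive remote lookup; here it only upper-cases."""
    return identifier.upper()


def standardize_in_batches(identifiers: Iterable[str], size: int = 2) -> dict[str, str]:
    """Process identifiers batch by batch, reusing cached lookups."""
    results: dict[str, str] = {}
    for batch in chunked(identifiers, size):
        for identifier in batch:
            results[identifier] = lookup(identifier)
    return results


results = standardize_in_batches(["p12345", "q67890", "p12345"], size=2)
print(results)
print(lookup.cache_info())
```

The repeated identifier is answered from the in-memory cache on its second appearance. A real caching layer would likely need persistence and expiry, which `lru_cache` does not provide.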
Acknowledgments