Metadata assistant agent for Lobster AI - sample metadata and harmonization operations
lobster-metadata
Sample metadata management and harmonization for multi-omics datasets.
Installation
pip install lobster-metadata
Agents
| Agent | Description |
|---|---|
| metadata_assistant | Metadata operations specialist. Sample ID mapping, schema standardization, dataset validation, and disease annotation enrichment. |
Services
| Service | Purpose |
|---|---|
| SampleMappingService | Map sample IDs between datasets using multiple strategies |
| MetadataStandardizationService | Standardize metadata fields to Pydantic schemas |
| MetadataFilteringService | Filter datasets by metadata criteria |
| DiseaseStandardizationService | Standardize disease names to controlled vocabularies |
| DiseaseOntologyService | Map diseases to ontology terms (MONDO, DO) |
| ClinicalMetadataService | Extract and validate clinical metadata fields |
| SampleGroupingService | Group samples by metadata attributes |
| MicrobiomeFilteringService | Microbiome-specific metadata filtering |
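The filtering and grouping services are listed here without their signatures, so the following is only a sketch of the underlying idea in plain Python. `filter_samples` and `group_samples` are hypothetical helper names, not functions exported by lobster-metadata, and the metadata layout (sample ID mapped to a dict of fields) is an assumption.

```python
from collections import defaultdict

def filter_samples(metadata, **criteria):
    """Keep samples whose fields equal every given criterion (hypothetical helper)."""
    return {
        sample_id: row
        for sample_id, row in metadata.items()
        if all(row.get(field) == wanted for field, wanted in criteria.items())
    }

def group_samples(metadata, *keys):
    """Group sample IDs by the tuple of values found under `keys` (hypothetical helper)."""
    groups = defaultdict(list)
    for sample_id, row in metadata.items():
        groups[tuple(row.get(key) for key in keys)].append(sample_id)
    return dict(groups)

# Illustrative metadata, invented for this example.
metadata = {
    "S1": {"tissue": "colon", "disease": "IBD"},
    "S2": {"tissue": "colon", "disease": "healthy"},
    "S3": {"tissue": "liver", "disease": "healthy"},
}
colon_only = filter_samples(metadata, tissue="colon")
by_disease = group_samples(metadata, "disease")
```

Here `colon_only` keeps S1 and S2, and `by_disease` maps `("IBD",)` to `["S1"]` and `("healthy",)` to `["S2", "S3"]`.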
Features
Sample ID Mapping
- Exact match between matrix and metadata identifiers
- Fuzzy matching with configurable similarity thresholds
- Pattern-based matching using regular expressions
- Metadata-based correlation for complex mapping scenarios
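As an illustration of how the exact, pattern-based, and fuzzy strategies can be layered, here is a minimal sketch built on the standard library only. `map_sample_ids`, its parameters, and the sample IDs are all hypothetical; the actual `SampleMappingService` interface is not shown in this README and may differ.

```python
import re
from difflib import SequenceMatcher

def map_sample_ids(matrix_ids, metadata_ids, threshold=0.8, pattern=None):
    """Map each matrix ID to a metadata ID, or None (hypothetical helper)."""
    def key(sample_id):
        # Pattern step: if the regex matches, compare on its first capture group.
        if pattern:
            match = re.search(pattern, sample_id)
            if match:
                return match.group(1)
        return sample_id

    by_key = {key(meta_id): meta_id for meta_id in metadata_ids}
    mapping = {}
    for sample_id in matrix_ids:
        k = key(sample_id)
        if k in by_key:
            # Exact match, possibly after pattern normalisation.
            mapping[sample_id] = by_key[k]
            continue
        # Fuzzy fallback: best case-insensitive similarity ratio above the threshold.
        best, best_score = None, 0.0
        for candidate_key, candidate in by_key.items():
            score = SequenceMatcher(None, k.lower(), candidate_key.lower()).ratio()
            if score > best_score:
                best, best_score = candidate, score
        mapping[sample_id] = best if best_score >= threshold else None
    return mapping

# Invented IDs for demonstration only.
result = map_sample_ids(
    ["GSM1001_rep1", "sample-B", "ctrl_03"],
    ["GSM1001", "Sample_B", "unrelated"],
    threshold=0.8,
    pattern=r"^(GSM\d+)",
)
```

In this sketch `GSM1001_rep1` resolves through the regex, `sample-B` through fuzzy matching, and `ctrl_03` stays unmapped because nothing reaches the threshold. Returning `None` rather than the closest weak candidate keeps silent mismatches out of downstream analysis.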
Metadata Standardization
- Transcriptomics schema (cell type, tissue, organism, disease)
- Proteomics schema (platform, quantification method, normalization)
- Microbiome schema (16S vs shotgun, taxonomic level, diversity)
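To make the idea of schema standardization concrete, the sketch below maps loosely named columns onto a fixed transcriptomics record. The package itself targets Pydantic schemas; this example uses a standard-library dataclass as a dependency-free stand-in. The class, the alias table, and the rule that `organism` is required are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical alias table: lower-cased raw column name -> canonical field name.
FIELD_ALIASES = {
    "cell_type": "cell_type", "celltype": "cell_type", "cell type": "cell_type",
    "tissue": "tissue", "source_name": "tissue",
    "organism": "organism", "species": "organism",
    "disease": "disease", "disease_state": "disease",
}

@dataclass
class TranscriptomicsSample:
    sample_id: str
    organism: str
    tissue: Optional[str] = None
    cell_type: Optional[str] = None
    disease: Optional[str] = None

def standardize(sample_id, raw):
    """Build a TranscriptomicsSample from a raw metadata row (hypothetical helper)."""
    fields = {}
    for column, value in raw.items():
        canonical = FIELD_ALIASES.get(column.strip().lower())
        # Skip unknown columns and common missing-value markers.
        if canonical and value not in (None, "", "NA"):
            fields.setdefault(canonical, str(value).strip())
    if "organism" not in fields:
        raise ValueError(f"{sample_id}: organism is required")
    return TranscriptomicsSample(sample_id=sample_id, **fields)

record = standardize(
    "S1",
    {"Species": "Homo sapiens", "Cell Type": " T cell ", "batch": "2", "disease_state": "NA"},
)
```

The resulting `record` has `organism="Homo sapiens"` and `cell_type="T cell"`, while the unmapped `batch` column is ignored and the `NA` disease value is left as `None`.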
Dataset Validation
- Sample count consistency between matrix and metadata
- Condition coverage verification for experimental design
- Control sample identification and validation
- Biological and technical replicate detection
- Platform consistency checks across samples
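A few of these checks can be sketched in plain Python as follows. `validate_dataset`, the `condition` field name, and the control labels are hypothetical choices for this example, not the package's API, and the replicate rule (fewer than two samples per condition is flagged) is a simplifying assumption.

```python
from collections import Counter

def validate_dataset(matrix_samples, metadata, condition_key="condition",
                     control_labels=("control", "ctrl", "healthy")):
    """Return a list of human-readable issues (hypothetical helper).

    `metadata` maps sample ID -> dict of fields.
    """
    issues = []
    # Sample consistency between the matrix and the metadata table.
    missing = sorted(set(matrix_samples) - set(metadata))
    extra = sorted(set(metadata) - set(matrix_samples))
    if missing:
        issues.append(f"no metadata for: {missing}")
    if extra:
        issues.append(f"no matrix column for: {extra}")

    # Count samples per condition, compared case-insensitively.
    conditions = Counter(
        str(row.get(condition_key, "")).lower() for row in metadata.values()
    )
    if not any(label in conditions for label in control_labels):
        issues.append("no control samples found")
    # A condition with a single sample has no replicates.
    for condition, count in sorted(conditions.items()):
        if count < 2:
            issues.append(f"condition '{condition}' has {count} sample(s)")
    return issues

# Invented example data.
issues = validate_dataset(
    ["s1", "s2", "s3", "s4"],
    {
        "s1": {"condition": "control"},
        "s2": {"condition": "control"},
        "s3": {"condition": "treated"},
    },
)
```

With this input, `issues` reports the sample that lacks metadata (`s4`) and the `treated` condition that has only one sample.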
Disease Enrichment
Four-phase fallback hierarchy for filling in missing disease annotations:
- Column re-scan for disease-related field names
- LLM-based abstract extraction from publication context
- LLM-based methods section parsing
- Manual mapping fallback for known datasets
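The four phases can be read as a chain of fallbacks in which the first phase that yields a value wins. The sketch below assumes that ordering and models the LLM steps as an injected callable, so no real model is invoked; every function name, the context keys, and the column hints are hypothetical.

```python
DISEASE_COLUMN_HINTS = ("disease", "diagnosis", "condition", "phenotype")

def rescan_columns(ctx):
    # Phase 1: look for a non-empty value under a disease-like column name.
    for column, value in ctx.get("columns", {}).items():
        if value and any(hint in column.lower() for hint in DISEASE_COLUMN_HINTS):
            return value
    return None

def _extract_from(ctx, text_key):
    # Phases 2 and 3 share this logic; `llm_extract` is a caller-supplied callable.
    extract, text = ctx.get("llm_extract"), ctx.get(text_key)
    return extract(text) if extract and text else None

def from_abstract(ctx):
    return _extract_from(ctx, "abstract")

def from_methods(ctx):
    return _extract_from(ctx, "methods")

def manual_mapping(ctx):
    # Phase 4: curated lookup keyed by dataset ID.
    return ctx.get("manual_map", {}).get(ctx.get("dataset_id"))

PHASES = [
    ("column_rescan", rescan_columns),
    ("abstract", from_abstract),
    ("methods", from_methods),
    ("manual", manual_mapping),
]

def enrich_disease(ctx):
    """Return (disease, phase_name) from the first phase that succeeds."""
    for name, phase in PHASES:
        value = phase(ctx)
        if value:
            return value, name
    return None, None

def fake_llm(text):
    # Stand-in for an LLM call, used only for this demonstration.
    return "colorectal cancer" if "colorectal" in text.lower() else None

disease, source = enrich_disease({
    "columns": {"sample_title": "tumor 1"},
    "abstract": "We profiled colorectal tumours.",
    "llm_extract": fake_llm,
    "dataset_id": "DS1",
    "manual_map": {"DS1": "colon cancer"},
})
```

Here the column re-scan finds nothing, so the abstract phase supplies the annotation and the manual map is never consulted. Returning the phase name alongside the value makes it possible to audit where each annotation came from.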
Multi-Omics Integration
- Cross-modality sample alignment
- Shared sample identification across data types
- Metadata merging with conflict resolution
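One simple way to combine these three steps is to intersect sample IDs across modalities and then merge fields under a priority order, recording disagreements instead of silently dropping them. This is a sketch under that assumption; `merge_metadata`, the priority rule, and the data are hypothetical, and the package may resolve conflicts differently.

```python
def merge_metadata(modalities, priority):
    """Merge per-sample metadata across modalities (hypothetical helper).

    modalities: {modality name: {sample_id: {field: value}}}
    priority:   modality names, most trusted first
    """
    # Shared samples: IDs present in every modality.
    shared = set.intersection(*(set(rows) for rows in modalities.values()))
    merged, conflicts = {}, []
    for sample_id in sorted(shared):
        record = {}
        for modality in priority:
            for field, value in modalities[modality][sample_id].items():
                if field not in record:
                    record[field] = value
                elif record[field] != value:
                    # Keep the higher-priority value and log the disagreement.
                    conflicts.append((sample_id, field, record[field], value))
        merged[sample_id] = record
    return merged, conflicts

# Invented example data.
rna = {"P1": {"tissue": "colon", "sex": "F"}, "P2": {"tissue": "liver"}}
protein = {"P1": {"tissue": "Colon", "platform": "TMT"}, "P3": {"tissue": "lung"}}
merged, conflicts = merge_metadata({"rna": rna, "protein": protein}, priority=["rna", "protein"])
```

Only `P1` appears in both modalities, so it is the only merged record. Its `tissue` keeps the RNA value, and the differing protein value is reported in `conflicts` for review.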
Requirements
- Python 3.12+
- lobster-ai >= 1.0.0
Testing
To run the test suite for this package:
# Install in editable mode with dev dependencies
pip install -e ".[dev]"
# Run all tests
pytest tests/ -v
# Run specific test categories
pytest tests/services/metadata/ -v # Service tests only
pytest tests/agents/ -v # Agent tests only
Documentation
Full documentation: docs.omics-os.com/docs/agents/metadata
License
AGPL-3.0-or-later