Biofilter v3 – Legacy Edition, modernized with Poetry and APSW

🧬 Biofilter-LOKI 3.0.0

Biofilter-LOKI 3.0.0 is a lightweight, command-line-driven knowledge base builder designed to support BioBin and other legacy Biofilter workflows.
This version preserves the traditional LOKI architecture while modernizing the codebase and deployment for current HPC environments.


🎯 Purpose & Design Goals

Biofilter-LOKI 3.0.0 was built to:

  • Maintain full compatibility with BioBin
  • Preserve the classic LOKI data model
  • Provide a simple CLI-based workflow
  • Support HPC module deployments
  • Enable rapid database builds for analysis pipelines

🧠 What Is LOKI?

LOKI (Library Of Knowledge Integration) is the knowledge ingestion engine behind Biofilter.
It builds a SQLite knowledge database by integrating multiple biological data sources, such as:

  • SNP ↔ Gene
  • Gene ↔ Pathway
  • Gene ↔ Ontology
  • Identifier mappings across databases
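
To make the idea of integrated relationships concrete, here is a minimal sketch of how SNP ↔ Gene and Gene ↔ Pathway links could be joined in SQLite. The table names, column names, and identifiers (`snp_gene`, `gene_pathway`, `GENE_A`, and so on) are hypothetical and chosen only for illustration; they are not the actual LOKI schema.

```python
import sqlite3

# Hypothetical mini-schema for illustration only; NOT the real LOKI schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE snp_gene (rsid TEXT, gene TEXT);
    CREATE TABLE gene_pathway (gene TEXT, pathway TEXT);
    """
)
conn.executemany(
    "INSERT INTO snp_gene VALUES (?, ?)",
    [("rs0000001", "GENE_A"), ("rs0000002", "GENE_B")],
)
conn.executemany(
    "INSERT INTO gene_pathway VALUES (?, ?)",
    [("GENE_A", "PATHWAY_X"), ("GENE_A", "PATHWAY_Y"), ("GENE_B", "PATHWAY_X")],
)


def pathways_for_snp(conn, rsid):
    """Follow SNP -> Gene -> Pathway links and return sorted pathway names."""
    rows = conn.execute(
        """
        SELECT DISTINCT gp.pathway
        FROM snp_gene AS sg
        JOIN gene_pathway AS gp ON gp.gene = sg.gene
        WHERE sg.rsid = ?
        ORDER BY gp.pathway
        """,
        (rsid,),
    ).fetchall()
    return [pathway for (pathway,) in rows]


print(pathways_for_snp(conn, "rs0000001"))  # ['PATHWAY_X', 'PATHWAY_Y']
```

A real knowledge database will have its own table layout; inspect it before writing queries against it.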

๐Ÿ—๏ธ Architecture Overview


External Sources
        │
        ▼
┌──────────────┐
│  loki-build  │  ← CLI entry point
└──────────────┘
        │
        ▼
┌─────────────────────┐
│ SQLite Knowledge DB │
│    (LOKI schema)    │
└─────────────────────┘
        │
        ▼
┌──────────────┐
│   Biofilter  │  ← CLI entry point
└──────────────┘

Key characteristics:

  • SQLite backend
  • Immutable batch loads
  • No entity-level curation
  • Optimized for downstream queries
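
Because the backend is a plain SQLite file that is loaded in batches and then only queried, it can be inspected safely by opening it read-only. The sketch below uses Python's standard `sqlite3` module to list the tables in a database file. The demo file and its table names (`example_a`, `example_b`) are placeholders created on the fly, not LOKI tables; point `list_tables` at your own `loki.db` to see the real ones.

```python
import sqlite3
import tempfile
from pathlib import Path


def list_tables(db_path):
    """Open a SQLite file read-only and return its table names, sorted."""
    # mode=ro makes SQLite refuse writes, so the knowledge DB cannot be altered.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()


# Demo against a throwaway database with placeholder tables.
with tempfile.TemporaryDirectory() as tmp:
    demo_db = Path(tmp) / "demo.db"
    writer = sqlite3.connect(demo_db)
    writer.execute("CREATE TABLE example_a (id INTEGER)")
    writer.execute("CREATE TABLE example_b (id INTEGER)")
    writer.commit()
    writer.close()

    tables = list_tables(demo_db)
    print(tables)  # ['example_a', 'example_b']
```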

📦 Included Data Sources

Depending on build options, Biofilter-LOKI can ingest:

  • dbSNP
  • Entrez Gene
  • Gene Ontology (GO)
  • Pathways (KEGG / Reactome, if enabled)
  • Chain files (genome build liftover)
  • Identifier mappings

The available sources depend on how the package was built and deployed.


🚀 Installation

Option 1: HPC Module (Recommended)

module load biofilter/3.0.0

Verify installation:

loki-build --version
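
In scripted environments (for example, a job submission wrapper), it can help to confirm that the command-line tools are on `PATH` before launching a build. The following is a small sketch using only the standard library; the helper name `find_cli` is made up for this example, and the command names are the ones that appear in this README.

```python
import shutil


def find_cli(names=("loki-build", "biobin")):
    """Map each command name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_cli().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

If a tool is reported as missing after `module load`, check that the module was loaded in the same shell session that runs the script.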

Option 2: Python Environment

pip install biofilter-loki==3.0.0

or using Conda:

conda install -c conda-forge biofilter

๐Ÿ› ๏ธ Building a Knowledge Database

Basic example:

loki-build \
  --knowledge loki.db \
  --load dbsnp entrez go

Update existing database:

loki-build \
  --knowledge loki.db \
  --update

Build from an archive:

loki-build \
  --from-archive loki_sources.tar.gz \
  --knowledge loki.db
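
When builds are driven from a pipeline rather than typed by hand, the same commands can be assembled programmatically. The sketch below only constructs the argument list from the flags shown in this README; the function name `loki_build_argv` and its parameters are hypothetical, and it does not run anything by itself.

```python
import shlex


def loki_build_argv(knowledge, load=(), update=False, from_archive=None, verbose=False):
    """Assemble a loki-build argument list using only flags shown in this README."""
    argv = ["loki-build"]
    if from_archive:
        argv += ["--from-archive", str(from_archive)]
    argv += ["--knowledge", str(knowledge)]
    if load:
        # Sources are passed as separate words, as in the basic example.
        argv += ["--load", *load]
    if update:
        argv.append("--update")
    if verbose:
        argv.append("--verbose")
    return argv


argv = loki_build_argv("loki.db", load=["dbsnp", "entrez", "go"])
print(shlex.join(argv))  # loki-build --knowledge loki.db --load dbsnp entrez go

# To execute the build for real (requires loki-build on PATH):
#   import subprocess
#   subprocess.run(argv, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems with paths that contain spaces.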

๐Ÿ” Common CLI Options

Option           Description
--knowledge      Output SQLite database
--load           Load specific sources
--update         Update existing DB
--from-archive   Load from source archive
--to-archive     Save source archive
--no-optimize    Skip DB optimization
--verbose        Verbose logging

Run loki-build --help for full details.


🧪 Integration with BioBin

Biofilter-LOKI 3.0.0 is the reference backend for:

  • BioBin 2.x
  • Existing LOKI-based analysis pipelines
  • Legacy workflows used in ADSP, ECHO, and related projects

Example:

biobin \
  --settings biobin.conf \
  --knowledge loki.db
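
Before handing a knowledge file to a long-running analysis, a quick sanity check can catch a missing, truncated, or mistyped path early. Every SQLite 3 database file begins with the 16-byte header string `SQLite format 3` followed by a NUL byte, so a pre-flight check can be sketched as below. The helper name `looks_like_sqlite` and the demo file names are assumptions for this example; the check says nothing about whether the file contains a valid LOKI schema.

```python
import sqlite3
import tempfile
from pathlib import Path

# Every SQLite 3 database file starts with this 16-byte header.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(path):
    """Return True if the file exists and begins with the SQLite 3 header."""
    try:
        with open(path, "rb") as fh:
            return fh.read(16) == SQLITE_MAGIC
    except OSError:
        return False


# Demo with throwaway files: one real SQLite DB, one plain-text file.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "demo.db"
    conn = sqlite3.connect(good)
    conn.execute("CREATE TABLE example (id INTEGER)")  # forces the file to be written
    conn.commit()
    conn.close()

    bad = Path(tmp) / "notes.txt"
    bad.write_text("not a database")

    good_ok = looks_like_sqlite(good)
    bad_ok = looks_like_sqlite(bad)
    missing_ok = looks_like_sqlite(Path(tmp) / "missing.db")

print(good_ok, bad_ok, missing_ok)  # True False False
```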

โŒ Known Limitations

  • No entity-level conflict resolution
  • No incremental curation
  • No Parquet or OLAP support
  • No variant-level functional annotations (e.g. VEP)
  • Schema is not extensible without breaking compatibility

These limitations are intentional, to preserve stability.


🔮 Future Direction

All future innovation is happening in:

👉 Biofilter3R

Key differences:

Biofilter-LOKI      Biofilter3R
SQLite              PostgreSQL / hybrid
Immutable batches   Master entities
CLI only            CLI + Python
BioBin-focused      Multi-domain
Legacy schema       Modern relational

📚 Documentation

  • BioBin documentation
  • Internal Ritchie Lab Confluence pages
  • Historical Biofilter publications

๐Ÿง‘โ€๐Ÿ”ฌ Maintainers

Developed and maintained by the Ritchie Lab, University of Pennsylvania.


📜 License

Distributed under the original Biofilter license. See the LICENSE file for details.


Development documentation is available at:

https://ritchielab.github.io/biofilter/
