
Biofilter: cloud-ready biological knowledge system


Biofilter 4

Biofilter 4 is a persistent, entity-centric biological knowledge platform designed to support gene-centric annotation, filtering, and modeling workflows through a unified and extensible data architecture.

This branch (biofilter3r) contains the active development of Biofilter 4, representing a major evolution of the Biofilter framework with a redesigned schema, modern ETL architecture, and multiple interaction layers.

📚 Documentation:
👉 https://biofilter.readthedocs.io/en/latest/


What is Biofilter 4?

Biofilter 4 provides a persistent, versioned biological knowledge base that replaces traditional file-based annotation workflows with a reusable, query-driven platform.

Instead of repeatedly generating transient annotation files, Biofilter 4 enables users to:

  • ingest curated biological knowledge once,
  • store it in a normalized, entity-based schema,
  • reuse and query that knowledge across analyses, projects, and environments.

Biofilter 4 is designed to support both exploratory research and production-scale workflows.
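The ingest-once, query-many idea above can be sketched with a toy example using only the Python standard library. The table and column names here are illustrative, not Biofilter's actual schema:

```python
import sqlite3

# One-time ingestion: load curated knowledge into a persistent store.
# An in-memory database keeps this sketch self-contained; a real
# deployment would point at a database file or a PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (id INTEGER PRIMARY KEY, symbol TEXT, chrom TEXT)")
conn.executemany(
    "INSERT INTO gene (symbol, chrom) VALUES (?, ?)",
    [("BRCA1", "17"), ("BRCA2", "13"), ("TP53", "17")],
)

# Later analyses reuse the same store with different queries -- no
# regeneration of transient annotation files.
chrom17 = [row[0] for row in conn.execute(
    "SELECT symbol FROM gene WHERE chrom = '17' ORDER BY symbol")]
print(chrom17)  # ['BRCA1', 'TP53']
```

The same database can then serve any number of downstream analyses, each expressed as a query rather than as a freshly generated annotation file.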


Key Features

  • Entity-centric data model

    • Canonical entities (Gene, Variant, Disease, Protein, Pathway, etc.)
    • Rich alias and cross-reference support
  • Persistent knowledge layer

    • Versioned ETL packages
    • Full provenance tracking by data source and load
  • Modular ETL architecture

    • Data Transformation Packages (DTPs)
    • Explicit separation of master data and relationships
  • High-performance ingestion

    • Managed indexing strategy
    • Optimized for large-scale sources (e.g. dbSNP, UniProt)
  • Multiple interaction layers

    • Python API
    • ORM-based Query layer
    • Reusable Reports
    • Command-line interface (CLI)
  • Multi-database support

    • SQLite (local development)
    • PostgreSQL (production and large-scale deployments)
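The "managed indexing strategy" noted above commonly means deferring secondary indexes during bulk load and building them once afterwards, so inserts do not pay per-row index maintenance. A minimal sketch of that pattern, again with standard-library SQLite and illustrative names rather than Biofilter's API:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variant (rsid TEXT, chrom TEXT, pos INTEGER)")

# Bulk-load phase: no secondary index exists yet, so each insert
# writes only the table, not an index B-tree.
rows = [(f"rs{i}", "1", i) for i in range(10_000)]
conn.executemany("INSERT INTO variant VALUES (?, ?, ?)", rows)

# Post-load phase: build the index once over the full table.
conn.execute("CREATE INDEX idx_variant_rsid ON variant (rsid)")

hit = conn.execute(
    "SELECT chrom, pos FROM variant WHERE rsid = 'rs42'").fetchone()
print(hit)  # ('1', 42)
```

For sources the size of dbSNP, building indexes after the load rather than during it is usually the difference between hours and minutes of ingestion time.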

Architecture Overview

At a high level, Biofilter 4 consists of:

  • ETL Layer

    • Ingests external biological sources into a normalized schema
    • Tracks execution via ETL Packages
  • Core Schema

    • Entity, Alias, Relationship, and Domain Master tables
    • Designed for extensibility and long-term evolution
  • Query Layer

    • ORM-backed, Python-first access to the knowledge base
    • Foundation for reports and advanced analysis
  • Report Layer

    • Curated, reusable biological queries
    • Standardized outputs as pandas DataFrames
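How these layers fit together can be sketched with an alias-resolution lookup: an alias table points back to canonical entities, and a "report" is a curated query with a standard output shape. Biofilter's real reports go through the ORM and return pandas DataFrames; this stdlib sketch only mirrors the pattern, and all names are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity (id INTEGER PRIMARY KEY, type TEXT, name TEXT);
    CREATE TABLE alias  (alias TEXT, entity_id INTEGER REFERENCES entity(id));
""")
conn.execute("INSERT INTO entity VALUES (1, 'Gene', 'TP53')")
conn.executemany("INSERT INTO alias VALUES (?, 1)",
                 [("p53",), ("LFS1",), ("TP53",)])

def gene_by_alias_report(conn, alias):
    """Curated, reusable query: resolve any alias to its canonical gene."""
    return conn.execute(
        "SELECT e.id, e.name FROM alias a "
        "JOIN entity e ON e.id = a.entity_id WHERE a.alias = ?",
        (alias,),
    ).fetchall()

print(gene_by_alias_report(conn, "p53"))   # [(1, 'TP53')]
print(gene_by_alias_report(conn, "LFS1"))  # [(1, 'TP53')]
```

Because every alias resolves to one canonical entity, downstream reports can accept whatever identifiers users supply while still joining on stable entity IDs.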

Repository Structure (simplified)

biofilter/
├── alembic/          # Database migrations
├── cli/              # CLI commands and entrypoints
├── core/             # Core orchestration logic
├── db/               # Database models and schema
├── etl/              # ETL framework and DTPs
├── query/            # Query layer
├── report/           # Report framework
├── tools/            # Developer and admin utilities
├── utils/            # Shared helpers
├── biofilter.py      # Main Biofilter entry point
└── cli.py            # CLI bootstrap

docs/
├── source/           # Sphinx documentation source
└── requirements.txt  # Documentation build requirements

Documentation

The full User Guide and Developer Guide are hosted on Read the Docs:

📖 https://biofilter.readthedocs.io/en/latest/

The documentation covers:

  • Installation and setup
  • Data sources and ETL design
  • Writing DTPs
  • Managed indexes
  • Entity and alias registration
  • Query layer internals
  • Writing and extending reports
  • Developer tooling and project structure

Status

  • Current version: Biofilter 4 (active development)
  • Schema: Entity-centric, versioned
  • ETL: Modular DTP-based ingestion
  • Stability: Actively developed; APIs and schema may still change prior to a formal 4.0 release

Contributing

Contributions, feedback, and design discussions are welcome.

When contributing:

  • Follow existing architectural patterns (Entities, DTPs, Reports).
  • Keep provenance and reproducibility as first-class concerns.
  • Prefer ORM-based logic over raw SQL when possible.
  • Document new features in the appropriate section of the docs.

License

MIT License. See LICENSE.


Acknowledgements

Biofilter builds on years of development and scientific usage across multiple generations of the framework. Biofilter 4 represents a continuation of this work, redesigned to support modern data volumes, richer biological relationships, and long-term sustainability.
