Source code to PanGBank API

PanGBank API

This repository contains the API used to manage the PanGBank database, which stores collections of pangenomes built with PPanGGOLiN.

The API is built with FastAPI and uses SQLModel as its ORM. It provides a RESTful interface for querying and exploring pangenome collections. A command-line tool, pangbank_db, is included for managing the database.

🚀 Installation

PanGBank-api is organized into two main components:

  • Core package: Database models, CRUD operations, and CLI tools (pangbank_db)
  • API server: FastAPI-based REST API (optional)

Option 1: Install Core Package Only

For database management and CLI tools without the API server:

pip install pangbank-api

This installs:

  • Database models (pangbank_api.models)
  • Database utilities (pangbank_api.database, pangbank_api.config)
  • CRUD operations (pangbank_api.crud)
  • CLI tool pangbank_db for database management

Option 2: Install with FastAPI (Full API Server)

For running the REST API server:

pip install "pangbank-api[fastapi]"

This additionally installs:

  • FastAPI framework
  • API routers (pangbank_api.routers)
  • API server (pangbank_api.main)

Local Development Setup

  1. Clone the repository:

    git clone https://github.com/labgem/PanGBank-api.git
    cd PanGBank-api
    
  2. Create a virtual environment and install with FastAPI:

    python -m venv venv
    source venv/bin/activate
    pip install ".[fastapi]"
    
  3. Run the API in development mode:

    export PANGBANK_DB_PATH="<path/to/database.sqlite>"
    export PANGBANK_DATA_DIR="<path/to/pangenome_directory>"
    fastapi dev pangbank_api/main.py
    

PANGBANK_DB_PATH is the path to your SQLite database file. PANGBANK_DATA_DIR is the root directory containing your pangenome data and mash files.
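
Once the development server is running, a FastAPI application normally publishes an OpenAPI description of its routes. The sketch below lists those routes without assuming any particular PanGBank endpoint. It assumes the default `fastapi dev` address (http://127.0.0.1:8000) and the default /openapi.json route, neither of which is confirmed by this README; the helper names fetch_openapi and list_routes are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: default address used by `fastapi dev`; adjust if yours differs.
BASE_URL = "http://127.0.0.1:8000"


def fetch_openapi(base_url: str = BASE_URL) -> dict:
    """Download the OpenAPI document that FastAPI serves by default."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)


def list_routes(openapi: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs found in an OpenAPI document."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    routes = []
    for path, operations in openapi.get("paths", {}).items():
        for method in operations:
            # Skip non-operation keys such as "parameters" or "summary".
            if method in http_methods:
                routes.append((method.upper(), path))
    return sorted(routes)
```

With the server running, list_routes(fetch_openapi()) returns the routes the API actually exposes, which is safer than guessing endpoint names.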

🛠️ Managing the Database with pangbank_db

All CLI commands require the PANGBANK_DB_PATH environment variable to be set.

export PANGBANK_DB_PATH="<path/to/database.sqlite>"
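
A wrapper script can check the environment before invoking the CLI. The following is a minimal sketch, not part of pangbank_db: the mapping of commands to variables is taken from this README, while the names REQUIRED_BY_COMMAND and missing_variables are hypothetical.

```python
import os

# Variables each command needs, as described in this README.
REQUIRED_BY_COMMAND = {
    "add-collection-release": ("PANGBANK_DB_PATH", "PANGBANK_DATA_DIR"),
    "list-collection": ("PANGBANK_DB_PATH",),
    "delete-collection": ("PANGBANK_DB_PATH",),
}


def missing_variables(command: str, env=None) -> list[str]:
    """Return the required variables that are unset or empty for a command."""
    env = os.environ if env is None else env
    return [
        name
        for name in REQUIRED_BY_COMMAND[command]
        if not env.get(name, "").strip()
    ]
```

An empty list means the command can be launched; otherwise the returned names can be reported to the user before the CLI is called.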

Add a Collection Release

To add a new collection of pangenomes to the database, use:

pangbank_db add-collection-release <collection_release.json>

Note

This command requires two environment variables:

export PANGBANK_DB_PATH="<path/to/database.sqlite>"
export PANGBANK_DATA_DIR="<root/path/serving/pangenomes>"

JSON Schema Example

{
  "collection": {
    "name": "GTDB_all_sampled",
    "description": "GTDB all is a collection of pangenomes made of GTDB species that have at least 15 genomes."
  },
  "release": {
    "version": "1.0.0",
    "ppanggolin_version": "2.2.4",
    "pangbank_wf_version": "0.0.2",
    "pangenomes_directory": "GTDB_refseq/release_v1.0.0/data/pangenomes/",
    "release_note": "",
    "date": "2025-07-10",
    "mash_sketch": "GTDB_refseq/release_v1.0.0/data/mash_sketch/families_persistent_all.msh",
    "mash_version": "2.3"
  },
  "taxonomy": {
    "name": "GTDB",
    "version": "10-RS226",
    "ranks": "Domain; Phylum; Class; Order; Family; Genus; Species",
    "file": "/absolute/path/to/taxonomy.tsv"
  },
  "genome_sources": [
    {
      "name": "RefSeq",
      "file": "/absolute/path/to/genomes.tsv",
      "version": "",
      "description": "",
      "source": "",
      "url": ""
    }
  ],
  "genome_metadata_sources": [
    {
      "name": "GTDB 10-RS226 metadata",
      "description": "Metadata collected from GTDB. Some columns have been filtered out.",
      "url": "https://data.ace.uq.edu.au/public/gtdb/data/releases/release226/226.0/",
      "strain_attribute": "ncbi_strain_identifiers",
      "organism_name_attribute": "ncbi_organism_name",
      "file": "/absolute/path/to/metadata.tsv"
    }
  ]
}

Note

  • Paths for pangenomes_directory and mash_sketch must be relative to PANGBANK_DATA_DIR.
  • Paths for taxonomy.file, genome_sources[*].file, and genome_metadata_sources[*].file must be absolute file paths.
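
The two path rules above can be checked before running add-collection-release. This is a minimal sketch, assuming POSIX-style paths and the key names shown in the example file; check_release_paths is a hypothetical helper, not a function provided by pangbank_api.

```python
from pathlib import PurePosixPath


def check_release_paths(config: dict) -> list[str]:
    """Return a list of path problems; an empty list means the rules hold."""
    problems = []

    # These two entries must be relative to PANGBANK_DATA_DIR.
    release = config.get("release", {})
    for key in ("pangenomes_directory", "mash_sketch"):
        value = release.get(key, "")
        if not value or PurePosixPath(value).is_absolute():
            problems.append(f"release.{key} must be relative to PANGBANK_DATA_DIR")

    # Every "file" entry must be an absolute path.
    entries = [("taxonomy", config.get("taxonomy", {}))]
    for section in ("genome_sources", "genome_metadata_sources"):
        for index, item in enumerate(config.get(section, [])):
            entries.append((f"{section}[{index}]", item))
    for label, entry in entries:
        if not PurePosixPath(entry.get("file", "")).is_absolute():
            problems.append(f"{label}.file must be an absolute path")

    return problems
```

Load the file with json.load and pass the resulting dictionary to check_release_paths; the check only inspects the path strings and does not verify that the files exist.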

List Existing Collections

pangbank_db list-collection

Delete a Collection Release

pangbank_db delete-collection <collection_name> --release-version <version>

🗃️ Database Migrations with Alembic

We use Alembic to manage schema changes in the PanGBank database.

Create a new migration

Generate a migration after updating your SQLModel models (e.g., adding or changing columns):

alembic revision --autogenerate -m "Describe your change here"

Apply migrations to the database

This applies all pending migrations:

alembic upgrade head

Roll back the last migration (use with caution)

If something went wrong, you can revert the last migration:

alembic downgrade -1

Or go back to the base (empty schema):

alembic downgrade base

Note

  • The SQLite database path is defined in config.py via the pangbank_db_path setting (PANGBANK_DB_PATH env var).
  • Alembic is configured to read this setting dynamically, so you do not need to edit alembic.ini.
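
To illustrate what reading the path dynamically can look like, the sketch below builds a SQLAlchemy-style SQLite URL from the environment variable. It is an assumption about the general approach, not the project's actual config.py or Alembic env.py; sqlite_url_from_env is a hypothetical name.

```python
import os


def sqlite_url_from_env(env=None, variable: str = "PANGBANK_DB_PATH") -> str:
    """Build a SQLite database URL from an environment variable."""
    env = os.environ if env is None else env
    path = env.get(variable, "").strip()
    if not path:
        raise RuntimeError(f"{variable} is not set")
    # SQLAlchemy's SQLite URL is "sqlite:///" followed by the file path,
    # so an absolute POSIX path produces four slashes in total.
    return f"sqlite:///{path}"
```

A migration environment could pass such a URL to its engine configuration instead of hard-coding a location in alembic.ini.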

Contributing

  1. Fork the repository.
  2. Create a feature branch (git checkout -b feature-name).
  3. Commit your changes (git commit -m 'Add new feature').
  4. Push to the branch (git push origin feature-name).
  5. Open a pull request.

Contact

For any inquiries or issues, open an issue on the GitHub repository.


Download files

Source Distribution

pangbank_api-0.2.0.tar.gz (31.9 kB view details)

Uploaded Source

Built Distribution

pangbank_api-0.2.0-py3-none-any.whl (39.9 kB view details)

Uploaded Python 3

File details

Details for the file pangbank_api-0.2.0.tar.gz.

File metadata

  • Download URL: pangbank_api-0.2.0.tar.gz
  • Size: 31.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pangbank_api-0.2.0.tar.gz
Algorithm Hash digest
SHA256 de709980099af90e4984b05afce280d03cf231c22da44ee0d71facd9f2f0f525
MD5 78d05c35c902e12c13aff8085a4de986
BLAKE2b-256 4eb5e3cc76eb2ee7f0b8d0efa8dc1f922fe15d6f5cccb02c79b33c127086d3eb

Provenance

The following attestation bundles were made for pangbank_api-0.2.0.tar.gz:

Publisher: python-publish.yml on labgem/PanGBank-api

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pangbank_api-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: pangbank_api-0.2.0-py3-none-any.whl
  • Size: 39.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pangbank_api-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 a231b8fbf7b515f3a1cf021e23c3220a1d424d3cc00db93e051866f30495c22e
MD5 16ea7744944a20ecd2f0f1d87aeaa646
BLAKE2b-256 0e856b556c03d1d7beda4985d04c23fc78c883a07440647eea042e007322acc4

Provenance

The following attestation bundles were made for pangbank_api-0.2.0-py3-none-any.whl:

Publisher: python-publish.yml on labgem/PanGBank-api

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
