Source code to PanGBank API
PanGBank API
This repository contains the API used to manage the PanGBank database, which stores collections of pangenomes built with PPanGGOLiN.
The API is built with FastAPI and uses SQLModel as its ORM.
It provides a RESTful interface for querying and exploring pangenome collections. Alongside the API, a command-line tool pangbank_db is included to manage the database.
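This README does not list the API routes, so the sketch below discovers them instead of assuming endpoint names. It relies on FastAPI's default behaviour of serving an OpenAPI description at /openapi.json, which holds only if the project has not disabled it. The base URL http://127.0.0.1:8000 is the usual `fastapi dev` default and is an assumption about your setup. `fetch_openapi` and `list_routes` are illustrative helper names, not part of PanGBank.

```python
import json
from urllib.request import urlopen


def fetch_openapi(base_url="http://127.0.0.1:8000"):
    """Download the OpenAPI document that FastAPI serves by default.

    The base URL is an assumption; adjust it to wherever the API runs.
    """
    with urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)


def list_routes(openapi):
    """Return (METHOD, path) pairs from a parsed OpenAPI document,
    sorted by path and then by method."""
    http_methods = {"get", "post", "put", "patch", "delete", "head", "options"}
    routes = []
    for path, operations in openapi.get("paths", {}).items():
        for method in operations:
            # Path items may also hold non-method keys such as "parameters".
            if method in http_methods:
                routes.append((method.upper(), path))
    return sorted(routes, key=lambda route: (route[1], route[0]))


# Example usage, with the development server running:
#     for method, path in list_routes(fetch_openapi()):
#         print(method, path)
```

The interactive documentation that FastAPI serves at /docs shows the same information in a browser.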
🚀 Installation
Local API Setup
- Clone the repository:
git clone https://github.com/labgem/PanGBank-api.git
cd PanGBank-api
- Create a virtual environment and install dependencies:
python -m venv venv
source venv/bin/activate
pip install .
- Run the API in development mode:
export PANGBANK_DB_PATH="<path/to/database.sqlite>"
export PANGBANK_DATA_DIR="<path/to/pangenome_directory>"
fastapi dev pangbank_api/main.py
PANGBANK_DB_PATH is the path to your SQLite database file. PANGBANK_DATA_DIR is the root directory containing your pangenome data and mash files.
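A quick pre-flight check can catch a missing or mistyped variable before the server starts. The following is a minimal sketch, not part of PanGBank: `check_pangbank_env` is a hypothetical helper, and the check that the database file already exists is an assumption that may not apply to a first-time setup.

```python
import os
from pathlib import Path


def check_pangbank_env(env=None):
    """Return a list of human-readable problems; an empty list means OK.

    `env` defaults to os.environ; pass a plain dict to check other values.
    """
    env = os.environ if env is None else env
    problems = []

    db_path = env.get("PANGBANK_DB_PATH")
    if not db_path:
        problems.append("PANGBANK_DB_PATH is not set")
    elif not Path(db_path).is_file():
        # Assumption: the SQLite file is expected to exist already.
        problems.append(f"PANGBANK_DB_PATH does not point to a file: {db_path}")

    data_dir = env.get("PANGBANK_DATA_DIR")
    if not data_dir:
        problems.append("PANGBANK_DATA_DIR is not set")
    elif not Path(data_dir).is_dir():
        problems.append(f"PANGBANK_DATA_DIR is not a directory: {data_dir}")

    return problems


# Example usage:
#     for problem in check_pangbank_env():
#         print("configuration problem:", problem)
```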
🛠️ Managing the Database with pangbank_db
All CLI commands require the PANGBANK_DB_PATH environment variable to be set.
export PANGBANK_DB_PATH="<path/to/database.sqlite>"
Add a Collection Release
To add a new collection of pangenomes to the database, use:
pangbank_db add-collection-release <collection_release.json>
Note
This command requires two environment variables:
export PANGBANK_DB_PATH="<path/to/database.sqlite>"
export PANGBANK_DATA_DIR="<root/path/serving/pangenomes>"
JSON Schema Example
{
  "collection": {
    "name": "GTDB_all_sampled",
    "description": "GTDB all is a collection of pangenomes made of GTDB species that have at least 15 genomes."
  },
  "release": {
    "version": "1.0.0",
    "ppanggolin_version": "2.2.4",
    "pangbank_wf_version": "0.0.2",
    "pangenomes_directory": "GTDB_refseq/release_v1.0.0/data/pangenomes/",
    "release_note": "",
    "date": "2025-07-10",
    "mash_sketch": "GTDB_refseq/release_v1.0.0/data/mash_sketch/families_persistent_all.msh",
    "mash_version": "2.3"
  },
  "taxonomy": {
    "name": "GTDB",
    "version": "10-RS226",
    "ranks": "Domain; Phylum; Class; Order; Family; Genus; Species",
    "file": "/absolute/path/to/taxonomy.tsv"
  },
  "genome_sources": [
    {
      "name": "RefSeq",
      "file": "/absolute/path/to/genomes.tsv",
      "version": "",
      "description": "",
      "source": "",
      "url": ""
    }
  ],
  "genome_metadata_sources": [
    {
      "name": "GTDB 10-RS226 metadata",
      "description": "Metadata collected from GTDB. Some columns have been filtered out.",
      "url": "https://data.ace.uq.edu.au/public/gtdb/data/releases/release226/226.0/",
      "strain_attribute": "ncbi_strain_identifiers",
      "organism_name_attribute": "ncbi_organism_name",
      "file": "/absolute/path/to/metadata.tsv"
    }
  ]
}
Note
- Paths for pangenomes_directory and mash_sketch must be relative to PANGBANK_DATA_DIR.
- Paths for taxonomy.file, genome_sources[*].file, and genome_metadata_sources[*].file must be absolute file paths.
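The relative/absolute path rules are easy to get wrong, so it can help to check a release file before passing it to pangbank_db. The sketch below only checks those two rules and assumes POSIX-style paths. `check_release_paths` and `load_and_check` are hypothetical helpers; this is not how pangbank_db itself validates its input.

```python
import json
from pathlib import PurePosixPath


def check_release_paths(config):
    """Return messages for paths that break the relative/absolute rules."""
    errors = []

    # These two entries must be relative to PANGBANK_DATA_DIR.
    release = config.get("release", {})
    for key in ("pangenomes_directory", "mash_sketch"):
        value = release.get(key, "")
        if not value or PurePosixPath(value).is_absolute():
            errors.append(f"release.{key} must be relative to PANGBANK_DATA_DIR")

    # Collect every "file" entry together with a label for error messages.
    files = [("taxonomy.file", config.get("taxonomy", {}).get("file", ""))]
    for section in ("genome_sources", "genome_metadata_sources"):
        for index, source in enumerate(config.get(section, [])):
            files.append((f"{section}[{index}].file", source.get("file", "")))

    # These entries must be absolute file paths.
    for label, value in files:
        if not PurePosixPath(value).is_absolute():
            errors.append(f"{label} must be an absolute path")

    return errors


def load_and_check(json_path):
    """Parse a collection release JSON file and check its paths."""
    with open(json_path, encoding="utf-8") as handle:
        return check_release_paths(json.load(handle))
```

An empty list from `load_and_check("collection_release.json")` means both rules are satisfied. It does not confirm that the files exist.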
List Existing Collections
pangbank_db list-collection
Delete a Collection Release
pangbank_db delete-collection <collection_name> --release-version <version>
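When these commands are driven from a script, the environment variables can be set per call rather than exported globally. The sketch below wraps the commands shown above. `build_invocation` and `run_pangbank_db` are hypothetical helpers, and the only subcommands and flags assumed are the ones documented in this README.

```python
import os
import subprocess


def build_invocation(db_path, *args, data_dir=None, base_env=None):
    """Return (argv, env) for a pangbank_db call.

    PANGBANK_DB_PATH is always set; PANGBANK_DATA_DIR is set only when
    given (add-collection-release needs it).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PANGBANK_DB_PATH"] = db_path
    if data_dir is not None:
        env["PANGBANK_DATA_DIR"] = data_dir
    return ["pangbank_db", *args], env


def run_pangbank_db(db_path, *args, data_dir=None):
    """Run pangbank_db and raise CalledProcessError on a non-zero exit."""
    argv, env = build_invocation(db_path, *args, data_dir=data_dir)
    return subprocess.run(argv, env=env, check=True)


# Example usage:
#     run_pangbank_db("/path/to/database.sqlite", "list-collection")
#     run_pangbank_db("/path/to/database.sqlite", "delete-collection",
#                     "GTDB_all_sampled", "--release-version", "1.0.0")
```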
🗃️ Database Migrations with Alembic
We use Alembic to manage schema changes in the PanGBank database.
Create a new migration
Generate a migration after updating your SQLModel models (e.g., adding or changing columns):
alembic revision --autogenerate -m "Describe your change here"
Apply migrations to the database
This applies all pending migrations:
alembic upgrade head
Roll back the last migration (use with caution)
If something went wrong, you can revert the last migration:
alembic downgrade -1
Or go back to the base (empty schema):
alembic downgrade base
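Because downgrades can drop data, taking a copy of the SQLite file before running a migration is a cheap safeguard. This is a suggested precaution rather than a documented PanGBank step. The sketch uses the standard library's sqlite3 backup API, and `backup_sqlite` is a hypothetical helper.

```python
import sqlite3
from datetime import datetime
from pathlib import Path


def backup_sqlite(db_path, backup_dir):
    """Copy a SQLite database to a timestamped file and return its path."""
    source_path = Path(db_path)
    if not source_path.is_file():
        raise FileNotFoundError(f"no database file at {source_path}")

    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target_path = target_dir / f"{source_path.stem}-{stamp}{source_path.suffix}"

    source = sqlite3.connect(source_path)
    target = sqlite3.connect(target_path)
    try:
        # Connection.backup copies the full database into the target file.
        source.backup(target)
    finally:
        target.close()
        source.close()
    return target_path


# Example usage, before "alembic upgrade head" or "alembic downgrade -1":
#     backup_sqlite("/path/to/database.sqlite", "/path/to/backups")
```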
Note
- The SQLite database path is defined in config.py via the pangbank_db_path setting (PANGBANK_DB_PATH env var).
- Alembic is configured to read this dynamically, so no need to change alembic.ini.
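The repository's own config.py and Alembic env.py are not reproduced here, so the following is only a sketch of one common way to wire this up. It turns the PANGBANK_DB_PATH value into a SQLAlchemy SQLite URL. `sqlite_url_from_env` is a hypothetical name, and the actual project code may differ.

```python
import os


def sqlite_url_from_env(env=None):
    """Build a SQLAlchemy SQLite URL from PANGBANK_DB_PATH.

    SQLite URLs have the form "sqlite:///<path>", so an absolute POSIX
    path yields four slashes in total.
    """
    env = os.environ if env is None else env
    db_path = env.get("PANGBANK_DB_PATH")
    if not db_path:
        raise RuntimeError("PANGBANK_DB_PATH is not set")
    return f"sqlite:///{db_path}"


# In an Alembic env.py, a URL like this can replace the value from
# alembic.ini (illustrative; the project's env.py may do this differently):
#     config.set_main_option("sqlalchemy.url", sqlite_url_from_env())
```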
Contributing
- Fork the repository.
- Create a feature branch (git checkout -b feature-name).
- Commit your changes (git commit -m 'Add new feature').
- Push to the branch (git push origin feature-name).
- Open a pull request.
Contact
For any inquiries or issues, open an issue on the GitHub repository.