Reconstruct Abusua Pedigree Studio session files from GA4GH Pedigree and Phenopackets Family inputs.
ga4gh2abusua
Reconstruct an Abusua Pedigree Studio session file (.json) from GA4GH inputs:
- a GA4GH Pedigree Standard message (KIN-relationship graph), and/or
- a Phenopackets v2 Family (proband + relatives + native PED-style pedigree).
This is the inverse of abusua2ga4gh. Pure Python, no runtime dependencies, Python ≥ 3.8.
What it does
Given either or both GA4GH artefacts, it rebuilds an Abusua session that the tool can open directly — restoring family topology, sex, affected status, conditions, carrier state, deceased status, and the proband, and mapping parentage back into Abusua's dual-layer model.
| Source field | → | Abusua field |
|---|---|---|
| KIN:027 isBiologicalMotherOf | → | bioMotherId |
| KIN:028 isBiologicalFatherOf | → | bioFatherId (+ paternity = reported if the edge says so, else confirmed) |
| KIN:022 isAdoptiveParentOf | → | socialMotherId/socialFatherId + fosteredIn |
| Family native pedigree maternalId/paternalId (0 = none) | → | bioMotherId/bioFatherId (a 0 father with a known mother ⇒ paternity = unknown) |
| native pedigree sex, affectedStatus | → | sex, affected |
| phenopacket diseases[].term.label | → | condition (free text) |
| phenopacket feature HP:0032500 | → | carrier token in condition |
| subject vital_status = DECEASED | → | deceased |
| Family.proband | → | proband |
| Family.consanguinousParents = true | → | a note on the proband |
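The native-pedigree rows of the table can be illustrated with a minimal sketch. The function name `person_to_abusua`, the plain-dict input, and the exact output values (a boolean `affected`, the source `sex` string passed through unchanged) are assumptions made for illustration; they are not taken from the converter's code.

```python
def person_to_abusua(person):
    """Map one PED-style pedigree person (a dict) to Abusua-style fields.

    Hypothetical helper: the real converter may shape its records differently.
    """
    def parent(value):
        # "0" (or a missing value) means "no parent recorded".
        return None if value in (None, "", "0", 0) else value

    mother = parent(person.get("maternalId"))
    father = parent(person.get("paternalId"))
    record = {
        "id": person["individualId"],
        "bioMotherId": mother,
        "bioFatherId": father,
        "sex": person.get("sex", "UNKNOWN_SEX"),
        "affected": person.get("affectedStatus") == "AFFECTED",
    }
    # A known mother with no recorded father is read as unknown paternity.
    if mother is not None and father is None:
        record["paternity"] = "unknown"
    return record


child = person_to_abusua({
    "individualId": "III-2", "maternalId": "II-1", "paternalId": "0",
    "sex": "FEMALE", "affectedStatus": "AFFECTED",
})
founder = person_to_abusua({
    "individualId": "I-1", "maternalId": "0", "paternalId": "0",
    "sex": "MALE", "affectedStatus": "UNAFFECTED",
})
```

In this sketch a founder (both parents `0`) gets no `paternity` entry at all, so only the "known mother, missing father" case is flagged.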
When both inputs are supplied, the Family is applied first (topology, sex, affected status, clinical detail) and the GA4GH Pedigree is layered on to refine the biological-vs-social edge distinction and paternity certainty.
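The layering step might look roughly like the sketch below, which refines Family-derived records with KIN edges. The edge shape (a simplified `(parent_id, kin_code, child_id, reported)` tuple), the function name `layer_kin_edges`, and the rule of choosing the social slot from the adoptive parent's recorded sex are all assumptions for illustration, not the converter's actual representation.

```python
def layer_kin_edges(people, edges):
    """Refine Family-derived records (dict of id -> record) with KIN edges.

    Hypothetical sketch: each edge is a simplified
    (parent_id, kin_code, child_id, reported) tuple.
    """
    for parent_id, kin_code, child_id, reported in edges:
        child = people[child_id]
        if kin_code == "KIN:027":      # isBiologicalMotherOf
            child["bioMotherId"] = parent_id
        elif kin_code == "KIN:028":    # isBiologicalFatherOf
            child["bioFatherId"] = parent_id
            child["paternity"] = "reported" if reported else "confirmed"
        elif kin_code == "KIN:022":    # isAdoptiveParentOf
            # Assumed rule: pick the social slot from the parent's sex.
            is_female = people[parent_id].get("sex") == "FEMALE"
            slot = "socialMotherId" if is_female else "socialFatherId"
            child[slot] = parent_id
            child["fosteredIn"] = True
    return people


people = {
    "F1": {"sex": "FEMALE"},
    "M1": {"sex": "MALE"},
    "A1": {"sex": "FEMALE"},
    "C1": {"sex": "MALE", "bioMotherId": "F1", "bioFatherId": "M1"},
}
edges = [
    ("F1", "KIN:027", "C1", False),
    ("M1", "KIN:028", "C1", True),
    ("A1", "KIN:022", "C1", False),
]
layer_kin_edges(people, edges)
```

Because the Family has already supplied topology, the edges here only add detail: paternity certainty and the social (adoptive) layer.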
Install
pip install -e .
Command line
# Most complete: both inputs
ga4gh2abusua --family fam.family.json --pedigree fam.ga4gh-pedigree.json -o fam.json
# From a Family alone
ga4gh2abusua --family fam.family.json -o fam.json
# From a GA4GH Pedigree alone, to stdout
ga4gh2abusua --pedigree fam.ga4gh-pedigree.json --stdout
Python API
import json

from ga4gh2abusua import to_abusua_session

# Load both GA4GH inputs.
with open("fam.family.json") as fh:
    family = json.load(fh)
with open("fam.ga4gh-pedigree.json") as fh:
    pedigree = json.load(fh)
# Rebuild the session; warnings report notable losses.
session, warnings = to_abusua_session(ga4gh_pedigree=pedigree, family=family)
with open("fam.json", "w") as fh:
    json.dump(session, fh, indent=2)
for w in warnings:
    print("note:", w)
The reconstructed session opens directly in Abusua Pedigree Studio (load via the Open / Load button).
Round-trip fidelity and limitations
The conversion preserves the genetically and clinically meaningful content. We verified that Abusua → abusua2ga4gh → ga4gh2abusua reproduces, for the bundled examples, the same number of individuals and unions, the same affected/carrier/deceased/proband sets, the same unknown-paternity cases, and a session that loads, lays out, renders, and compiles to a valid PED in the tool.
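A comparison of this kind could be sketched as follows. The session layout used here (a top-level `people` list whose entries carry `id`, `affected`, `deceased`, `proband`, and `paternity`) is an assumption for illustration only; the real Abusua session schema and the project's own test helpers may differ.

```python
def summarise(session):
    """Reduce a session to the facts a round trip should preserve.

    Hypothetical: assumes session["people"] is a list of dict records.
    """
    people = session["people"]

    def ids_where(flag):
        return frozenset(p["id"] for p in people if p.get(flag))

    return {
        "individuals": len(people),
        "affected": ids_where("affected"),
        "deceased": ids_where("deceased"),
        "proband": ids_where("proband"),
        "unknown_paternity": frozenset(
            p["id"] for p in people if p.get("paternity") == "unknown"
        ),
    }


def round_trip_differences(original, rebuilt):
    """Return the names of summary fields that differ between two sessions."""
    before, after = summarise(original), summarise(rebuilt)
    return sorted(key for key in before if before[key] != after[key])


original = {"people": [
    {"id": "I-1", "affected": False, "deceased": True},
    {"id": "II-1", "affected": True, "proband": True, "paternity": "unknown"},
]}
same = {"people": [dict(p) for p in original["people"]]}
changed = {"people": [
    {"id": "I-1", "affected": False, "deceased": False},
    {"id": "II-1", "affected": True, "proband": True, "paternity": "unknown"},
]}
```

An empty result from `round_trip_differences` means the two sessions agree on every summarised field.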
Some information is not representable in the GA4GH artefacts and therefore cannot survive a round-trip:
- Abusua-specific lineage overrides (abusuaManual, ntoroManual). Neither standard has a place for a manually pinned matriclan or ntoro, so these are not exported by abusua2ga4gh and cannot be restored here. On load, Abusua re-derives clan from the maternal chain, so a derived clan is unaffected; only a founder's manually set clan is lost.
- The fosteredIn flag when the foster parents are unknown. The native PED pedigree correctly represents an unknown father as paternalId = 0 (so paternity = unknown is restored), but the boolean "this child was fostered in" is only recoverable when the source carried an explicit KIN:022 adoptive edge in the GA4GH Pedigree. A fostered child with unrecorded social parents will come back with the correct biology but without the fosteredIn flag set.
- Layout coordinates and free-text notes are not part of the GA4GH artefacts; the tool re-lays-out on load, and notes are regenerated only where this converter adds them (e.g. the consanguinity note).
- Condition ontology ids are reduced to their free-text labels, matching how Abusua stores conditions; the MONDO/HPO ids in the source are not retained in the session.
All such losses are inherent to what the standards model, not to this converter, and the converter reports the notable ones in its warnings.
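The reduction of conditions to free text, including the carrier token, can be illustrated with a small sketch. The function name `condition_text`, the `"; "` separator, the literal token `carrier`, and the placeholder disease id are assumptions for illustration; only the source fields (`diseases[].term.label` and the HP:0032500 feature) come from the mapping table.

```python
CARRIER_HPO_ID = "HP:0032500"  # feature id the mapping table treats as carrier state


def condition_text(phenopacket):
    """Build a free-text condition string from one phenopacket (a dict).

    Hypothetical helper: the ontology ids are dropped, only labels survive.
    """
    labels = [
        d["term"]["label"]
        for d in phenopacket.get("diseases", [])
        if d.get("term", {}).get("label")
    ]
    is_carrier = any(
        f.get("type", {}).get("id") == CARRIER_HPO_ID
        and not f.get("excluded", False)
        for f in phenopacket.get("phenotypicFeatures", [])
    )
    if is_carrier:
        labels.append("carrier")  # assumed spelling of the carrier token
    return "; ".join(labels)


packet = {
    # "MONDO:XXXXXXX" is a placeholder id, not a real term.
    "diseases": [{"term": {"id": "MONDO:XXXXXXX", "label": "sickle cell disease"}}],
    "phenotypicFeatures": [{"type": {"id": "HP:0032500"}}],
}
text = condition_text(packet)
```

Since only the label is carried over, two source terms with the same label but different ids would become indistinguishable in the session, which is the loss described in the last bullet above.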
Tests
pytest
The suite includes a round-trip from the original Abusua example sessions through the forward GA4GH outputs and back.
Layout
src/ga4gh2abusua/
    convert.py   # GA4GH Pedigree / Phenopackets Family -> Abusua session
    session.py   # internal builder + Abusua .json serialisation
    cli.py       # command-line interface
examples/        # a sample family + ga4gh-pedigree pair
tests/           # pytest suite (incl. round-trip)
References
- GA4GH Pedigree Standard — https://pedigree.readthedocs.io/
- GA4GH Phenopacket Schema v2 (Family) — https://phenopacket-schema.readthedocs.io/
License
MIT.