Reconstruct Abusua Pedigree Studio session files from GA4GH Pedigree and Phenopackets Family inputs.

ga4gh2abusua

Reconstruct an Abusua Pedigree Studio session file (.json) from GA4GH inputs:

  • a GA4GH Pedigree Standard message (KIN-relationship graph), and/or
  • a Phenopackets v2 Family (proband + relatives + native PED-style pedigree).

This is the inverse of abusua2ga4gh. It is pure Python with no runtime dependencies and requires Python ≥ 3.8.


What it does

Given either or both GA4GH artefacts, it rebuilds an Abusua session that the tool can open directly. The rebuild restores family topology, sex, affected status, conditions, carrier state, deceased status, and the proband, and maps parentage back into Abusua's dual-layer (biological and social) model.

Source field                                                  | Abusua field
--------------------------------------------------------------|-------------------------------------------------------------------------------
KIN:027 isBiologicalMotherOf                                  | bioMotherId
KIN:028 isBiologicalFatherOf                                  | bioFatherId (+ paternity = reported if the edge says so, else confirmed)
KIN:022 isAdoptiveParentOf                                    | socialMotherId/socialFatherId + fosteredIn
Family native pedigree maternalId/paternalId (0 = none)       | bioMotherId/bioFatherId (a 0 father with a known mother ⇒ paternity = unknown)
native pedigree sex, affectedStatus                           | sex, affected
phenopacket diseases[].term.label                             | condition (free text)
phenopacket feature HP:0032500                                | carrier token in condition
subject vital_status = DECEASED                               | deceased
Family.proband                                                | proband
Family.consanguinousParents = true                            | a note on the proband
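The three KIN rows of the table can be sketched as a small dispatch over relationship edges. This is an illustration only, not the converter's actual code. The edge shape (`individual` = parent, `relative` = child, `relation` as a bare KIN code), the `reported` flag, the `"F"`/`"M"` sex values, and the function name `apply_kin_edges` are all assumptions made for the example.

```python
from typing import Dict, List

KIN_BIO_MOTHER = "KIN:027"  # isBiologicalMotherOf
KIN_BIO_FATHER = "KIN:028"  # isBiologicalFatherOf
KIN_ADOPTIVE = "KIN:022"    # isAdoptiveParentOf


def apply_kin_edges(people: Dict[str, dict], edges: List[dict]) -> None:
    """Write parent links onto the child records in `people`, in place.

    ASSUMED edge shape (hypothetical, for illustration):
    {"individual": parent_id, "relation": "KIN:0xx",
     "relative": child_id, "reported": bool}
    """
    for edge in edges:
        parent_id, child_id = edge["individual"], edge["relative"]
        child = people[child_id]
        code = edge["relation"]
        if code == KIN_BIO_MOTHER:
            child["bioMotherId"] = parent_id
        elif code == KIN_BIO_FATHER:
            child["bioFatherId"] = parent_id
            # "reported" only if the edge says so, otherwise "confirmed".
            child["paternity"] = "reported" if edge.get("reported") else "confirmed"
        elif code == KIN_ADOPTIVE:
            # Assumption: the adoptive parent's sex selects the social slot.
            is_female = people[parent_id].get("sex") == "F"
            slot = "socialMotherId" if is_female else "socialFatherId"
            child[slot] = parent_id
            child["fosteredIn"] = True


people = {
    "m": {"sex": "F"},
    "f": {"sex": "M"},
    "aunt": {"sex": "F"},
    "c": {"sex": "F"},
}
edges = [
    {"individual": "m", "relation": "KIN:027", "relative": "c"},
    {"individual": "f", "relation": "KIN:028", "relative": "c", "reported": True},
    {"individual": "aunt", "relation": "KIN:022", "relative": "c"},
]
apply_kin_edges(people, edges)
print(people["c"])
```

Keeping the biological and social links in separate fields is what lets one child carry both a biological mother and a social mother at once.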

When both inputs are supplied, the Family is applied first (topology, sex, affected status, clinical detail) and the GA4GH Pedigree is layered on to refine the biological-vs-social edge distinction and paternity certainty.
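The two-pass order described above can be sketched as follows: build each person from the Family's native pedigree rows, then overlay whatever the GA4GH Pedigree adds. This is a minimal sketch under stated assumptions, not the package's implementation. The `id` key, the helper names, and the idea of passing the second layer as a ready-made dict of per-person fields are hypothetical. The row keys `individualId`, `maternalId`, `paternalId`, `sex`, and `affectedStatus` follow the Phenopackets pedigree schema.

```python
from typing import Dict, List, Optional


def _parent(value: Optional[str]) -> Optional[str]:
    """PED convention: "0" (or a missing value) means no parent recorded."""
    return None if value in (None, "", "0") else value


def person_from_ped(row: dict) -> dict:
    """Map one native-pedigree row onto Abusua-style fields."""
    mother = _parent(row.get("maternalId"))
    father = _parent(row.get("paternalId"))
    person = {
        "id": row["individualId"],  # hypothetical key name
        "bioMotherId": mother,
        "bioFatherId": father,
        "sex": row.get("sex"),
        "affected": row.get("affectedStatus") == "AFFECTED",
    }
    # A 0 father with a known mother means paternity is unknown.
    if mother and not father:
        person["paternity"] = "unknown"
    return person


def build_people(rows: List[dict], refinements: Dict[str, dict]) -> Dict[str, dict]:
    # Pass 1: the Family supplies topology, sex and affected status.
    people = {r["individualId"]: person_from_ped(r) for r in rows}
    # Pass 2: fields derived from the GA4GH Pedigree refine those records.
    for pid, fields in refinements.items():
        people.setdefault(pid, {"id": pid}).update(fields)
    return people


rows = [
    {"individualId": "m", "maternalId": "0", "paternalId": "0",
     "sex": "FEMALE", "affectedStatus": "UNAFFECTED"},
    {"individualId": "u", "maternalId": "0", "paternalId": "0",
     "sex": "MALE", "affectedStatus": "UNAFFECTED"},
    {"individualId": "c", "maternalId": "m", "paternalId": "0",
     "sex": "MALE", "affectedStatus": "AFFECTED"},
]
refinements = {"c": {"socialFatherId": "u", "fosteredIn": True}}
people = build_people(rows, refinements)
print(people["c"])
```

Applying the Family first means the second pass only has to touch the fields the GA4GH Pedigree is better placed to know, such as social parentage.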


Install

pip install -e .

Command line

# Most complete: both inputs
ga4gh2abusua --family fam.family.json --pedigree fam.ga4gh-pedigree.json -o fam.json

# From a Family alone
ga4gh2abusua --family fam.family.json -o fam.json

# From a GA4GH Pedigree alone, to stdout
ga4gh2abusua --pedigree fam.ga4gh-pedigree.json --stdout

Python API

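import json
from ga4gh2abusua import to_abusua_session

# Load the two GA4GH inputs; context managers close the files promptly.
with open("fam.family.json") as f:
    family = json.load(f)
with open("fam.ga4gh-pedigree.json") as f:
    pedigree = json.load(f)

session, warnings = to_abusua_session(ga4gh_pedigree=pedigree, family=family)

# Write the reconstructed session and report anything the converter flagged.
with open("fam.json", "w") as f:
    json.dump(session, f, indent=2)
for w in warnings:
    print("note:", w)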

The reconstructed session opens directly in Abusua Pedigree Studio (load via the Open / Load button).


Round-trip fidelity and limitations

The conversion preserves the genetically and clinically meaningful content. We verified that Abusua → abusua2ga4gh → ga4gh2abusua reproduces, for the bundled examples, the same number of individuals and unions, the same affected/carrier/deceased/proband sets, the same unknown-paternity cases, and a session that loads, lays out, renders, and compiles to a valid PED in the tool.
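A round-trip check of this kind can be sketched by reducing each session to the properties listed above and comparing the summaries. This is not the project's test code. The top-level keys `individuals` and `unions`, the per-person `id` key, and detecting carriers by the word "carrier" in `condition` are assumptions about the session layout made for illustration.

```python
def session_summary(session: dict) -> dict:
    """Reduce a session to the round-trip properties worth comparing."""
    people = session.get("individuals", [])  # ASSUMED top-level key

    def ids_where(flag: str) -> frozenset:
        return frozenset(p["id"] for p in people if p.get(flag))

    return {
        "n_individuals": len(people),
        "n_unions": len(session.get("unions", [])),  # ASSUMED top-level key
        "affected": ids_where("affected"),
        "deceased": ids_where("deceased"),
        "proband": ids_where("proband"),
        # Assumption: carrier state appears as the word "carrier" in condition.
        "carrier": frozenset(
            p["id"] for p in people
            if "carrier" in (p.get("condition") or "").lower()
        ),
        "unknown_paternity": frozenset(
            p["id"] for p in people if p.get("paternity") == "unknown"
        ),
    }


original = {
    "individuals": [
        {"id": "a", "affected": True, "proband": True, "x": 10, "y": 20},
        {"id": "b", "condition": "carrier", "paternity": "unknown", "x": 50, "y": 20},
    ],
    "unions": [{"partners": ["a", "b"]}],
}
# Same content after a round-trip: layout coordinates are gone or different.
restored = {
    "individuals": [
        {"id": "a", "affected": True, "proband": True},
        {"id": "b", "condition": "carrier", "paternity": "unknown", "x": 0, "y": 0},
    ],
    "unions": [{"partners": ["a", "b"]}],
}
print(session_summary(original) == session_summary(restored))
```

Comparing summaries rather than whole files keeps fields that are expected to change, such as layout coordinates, out of the comparison.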

Some information is not representable in the GA4GH artefacts and therefore cannot survive a round-trip:

  • Abusua-specific lineage overrides (abusuaManual, ntoroManual). Neither standard has a place for a manually pinned matriclan or ntoro, so these are not exported by abusua2ga4gh and cannot be restored here. On load, Abusua re-derives clan from the maternal chain, so a derived clan is unaffected; only a founder's manually set clan is lost.
  • The fosteredIn flag when the foster parents are unknown. The native PED pedigree correctly represents an unknown father as paternalId = 0 (so paternity = unknown is restored), but the boolean "this child was fostered in" is only recoverable when the source carried an explicit KIN:022 adoptive edge in the GA4GH Pedigree. A fostered child with unrecorded social parents will come back with the correct biology but without the fosteredIn flag set.
  • Layout coordinates and free-text notes are not part of the GA4GH artefacts; the tool re-lays-out on load, and notes are regenerated only where this converter adds them (e.g. the consanguinity note).
  • Condition ontology ids are reduced to their free-text labels, matching how Abusua stores conditions; the MONDO/HPO ids in the source are not retained in the session.

All such losses are inherent to what the standards model, not to this converter, and the converter reports the notable ones in its warnings.
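The reduction of ontology ids to labels can be illustrated with a sketch of how one phenopacket's clinical detail might collapse into Abusua's free-text fields. It is an assumption-laden illustration rather than the converter's code. The function name, the `"; "` separator, and the literal carrier token `"carrier"` are hypothetical. The JSON keys (`diseases`, `phenotypicFeatures`, `subject.vitalStatus.status`) follow the Phenopackets v2 JSON form.

```python
CARRIER_HPO = "HP:0032500"  # the HPO feature the table maps to carrier state


def clinical_fields(phenopacket: dict) -> dict:
    """Collapse disease terms, carrier feature and vital status to flat fields."""
    # Only the label is kept; the ontology id is dropped.
    labels = [
        d["term"]["label"]
        for d in phenopacket.get("diseases", [])
        if d.get("term", {}).get("label")
    ]
    is_carrier = any(
        f.get("type", {}).get("id") == CARRIER_HPO and not f.get("excluded", False)
        for f in phenopacket.get("phenotypicFeatures", [])
    )
    if is_carrier:
        labels.append("carrier")  # hypothetical token spelling
    status = phenopacket.get("subject", {}).get("vitalStatus", {}).get("status")
    return {"condition": "; ".join(labels), "deceased": status == "DECEASED"}


phenopacket = {
    "subject": {"id": "c", "vitalStatus": {"status": "DECEASED"}},
    # "MONDO:XXXXXXX" is a placeholder id, not a real term.
    "diseases": [{"term": {"id": "MONDO:XXXXXXX", "label": "sickle cell disease"}}],
    "phenotypicFeatures": [{"type": {"id": "HP:0032500"}}],
}
fields = clinical_fields(phenopacket)
print(fields)
```

Because the id never reaches the output, a later export back to GA4GH would have to re-resolve the label against an ontology to recover it.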

Tests

pytest

The suite includes a round-trip from the original Abusua example sessions through the forward GA4GH outputs and back.

Layout

src/ga4gh2abusua/
  convert.py    # GA4GH Pedigree / Phenopackets Family -> Abusua session
  session.py    # internal builder + Abusua .json serialisation
  cli.py        # command-line interface
examples/       # a sample family + ga4gh-pedigree pair
tests/          # pytest suite (incl. round-trip)

License

MIT.
