Gedcom-X tools RC1
Project description
gedcomtools
A comprehensive Python toolkit for parsing, converting, validating, and analyzing genealogical data using the GEDCOM 5.x, GEDCOM 7, and GEDCOM X data models.
gedcomtools provides:
- GEDCOM 5.x parser and high-level facade
- GEDCOM 7 parser, 18-phase validator, serializer, and interactive CLI
- GEDCOM X structured object model
- GEDCOM 5.x → GEDCOM X conversion
- CLI tooling (gxcli, g7cli, validate7)
- Advanced logging via loggingkit
- Graph export utilities
Designed for historical records processing, genealogy research, and archival data pipelines.
Features
- ✅ GEDCOM 5.x parser (gedcom5)
- ✅ GEDCOM 7 parser, validator, and serializer (gedcom7)
- 🚧 GEDCOM X object model (gedcomx), in progress
- 🚧 Converter (GEDCOM 5.x → GEDCOM X), in progress
- ✅ CLI tools (gxcli, g7cli, validate7)
- ✅ Structured logging (loggingkit)
- ✔ Sub-loggers (conversion, parser, io, etc.)
- ✔ Runtime log inspection
- ✔ Extensible schema system
- ✔ Source, person, family, relationship modeling
- ✔ Place and event normalization
- ✔ Metadata and attribution handling
- 🚧 Graph database export (ArangoDB), in progress
Project Structure
gedcomtools/
├── gedcom5/                 # GEDCOM 5.x parsing layer
│   ├── gedcom5.py           # High-level facade (Gedcom5)
│   ├── parser.py            # Low-level parser engine (Gedcom5x)
│   ├── elements.py          # Typed element/record classes
│   ├── helpers.py           # Element query helpers
│   ├── tags.py              # GEDCOM 5.x tag constants
│   └── source.py            # Source record helpers
├── gedcom7/                 # GEDCOM 7 parsing, validation, and serialization
│   ├── gedcom7.py           # Parser + Gedcom7 class
│   ├── structure.py         # In-memory tree node (GedcomStructure)
│   ├── validator.py         # 18-phase structural/semantic validator
│   ├── writer.py            # GEDCOM 7 serializer
│   ├── models.py            # High-level detail dataclasses
│   ├── specification.py     # Tag rules, cardinality, enumerations
│   ├── g7interop.py         # Tag/URI mapping
│   ├── exceptions.py        # Exception hierarchy
│   ├── g7cli.py             # Interactive browser/editor shell
│   └── validate7.py         # validate7 CLI entry point
├── gedcomx/                 # GEDCOM X object model and conversion (in progress)
├── graph.py                 # Graph export (persons, relationships)
├── loggingkit.py            # Structured logging framework
└── utils/                   # Shared utilities
Installation
pip install gedcomtools
Or from source:
git clone https://github.com/cartwrightdj/gedcomtools.git
cd gedcomtools
pip install -e .
Quick Start
Parse GEDCOM 5.x
from gedcomtools.gedcom5 import Gedcom5

g = Gedcom5("family.ged")

for person in g.individual_details():
    print(person.full_name, person.birth_year, person.death_year)

for family in g.family_details():
    print(family.husband_xref, family.wife_xref, family.marriage_year)
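For orientation, GEDCOM 5.x files are line-based: each line carries a level number, an optional cross-reference identifier such as @I1@, a tag, and an optional value. The sketch below splits such lines into their fields. It illustrates the file format only and is not how gedcomtools parses files; `GedLine` and `parse_line` are hypothetical names, and continuation lines (CONT/CONC) and character-set handling are deliberately left out.

```python
from typing import NamedTuple, Optional


class GedLine(NamedTuple):
    """One GEDCOM line split into fields (hypothetical helper type)."""
    level: int
    xref: Optional[str]
    tag: str
    value: Optional[str]


def parse_line(line: str) -> GedLine:
    """Split 'LEVEL [@XREF@] TAG [VALUE]' into its fields (simplified)."""
    text = line.rstrip("\r\n")
    level_str, _, rest = text.partition(" ")
    level = int(level_str)
    xref = None
    if rest.startswith("@"):
        # Record lines carry their identifier between the level and the tag.
        xref, _, rest = rest.partition(" ")
    tag, _, value = rest.partition(" ")
    return GedLine(level, xref, tag, value or None)


# Made-up sample lines for illustration.
sample = ["0 @I1@ INDI", "1 NAME John /Smith/", "1 BIRT", "2 DATE 12 JAN 1900"]
for raw in sample:
    print(parse_line(raw))
```

The nesting of records follows from the level numbers: a line at level 2 belongs to the nearest preceding line at level 1.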
Parse and validate GEDCOM 7
from gedcomtools.gedcom7 import Gedcom7

g = Gedcom7("family.ged")
issues = g.validate()

for issue in issues:
    print(f"[{issue.severity}] {issue.code}: {issue.message}")

# Write back out
g.write("family_out.ged")
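A common follow-up is to summarize the returned issues before deciding whether to write the file. The sketch below counts issues per severity and checks for errors. To stay runnable without the library it uses a stand-in `Issue` dataclass that models only the three attributes used above; the severity labels "error" and "warning", the issue codes, and the helper names `summarize` and `has_errors` are assumptions made for illustration. The real severity values may differ (for example, they could be an enum).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Issue:
    """Stand-in for a validation issue (hypothetical; models only the
    severity, code, and message attributes shown in the quick start)."""
    severity: str
    code: str
    message: str


def summarize(issues: Iterable[Issue]) -> Counter:
    """Count issues per severity label."""
    return Counter(issue.severity for issue in issues)


def has_errors(issues: Iterable[Issue], error_label: str = "error") -> bool:
    """True if any issue carries the (assumed) error severity label."""
    return any(issue.severity == error_label for issue in issues)


# Made-up sample data for illustration only.
sample = [
    Issue("error", "E001", "example error message"),
    Issue("warning", "W001", "example warning message"),
    Issue("warning", "W002", "another example warning"),
]

counts = summarize(sample)
print(dict(counts))
print("errors present:", has_errors(sample))
```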
Convert GEDCOM 5.x → GEDCOM X (in progress)
The GEDCOM X object model and converter are under active development. The core object model (persons, families, relationships, sources, events, places, names, facts) is implemented. Conversion and CLI tooling are still being worked on.
from gedcomtools.gedcomx import GedcomX, GedcomConverter

# `ged` is a GEDCOM 5.x object parsed beforehand (its construction is not shown here).
converter = GedcomConverter()
gx = converter.Gedcom5x_GedcomX(ged)
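To give a sense of the target format, GEDCOM X has a JSON serialization in which a document holds a list of persons, each with names and facts, plus a list of relationships that point at persons by reference. The sketch below builds a small document of that shape by hand with plain dictionaries. It does not use gedcomtools and says nothing about how the converter works; `make_person` is a hypothetical helper, the people are made up, and only a small subset of the format is shown.

```python
import json
from typing import Optional


def make_person(person_id: str, full_name: str,
                birth_date: Optional[str] = None) -> dict:
    """Build a minimal GEDCOM X-style person dictionary (hypothetical helper)."""
    person = {
        "id": person_id,
        "names": [{"nameForms": [{"fullText": full_name}]}],
    }
    if birth_date is not None:
        # Facts are typed by URI; the date keeps its original text.
        person["facts"] = [
            {"type": "http://gedcomx.org/Birth", "date": {"original": birth_date}}
        ]
    return person


# Made-up sample data for illustration only.
doc = {
    "persons": [
        make_person("P1", "John Smith", "12 JAN 1900"),
        make_person("P2", "Jane Smith"),
    ],
    "relationships": [
        {
            "type": "http://gedcomx.org/ParentChild",
            "person1": {"resource": "#P1"},  # parent
            "person2": {"resource": "#P2"},  # child
        }
    ],
}

print(json.dumps(doc, indent=2))
```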
CLI Tools
validate7: GEDCOM 7 validator
validate7 family.ged
validate7 --lenient family.ged # suppress undeclared extension tag errors
| Exit code | Meaning |
|---|---|
| 0 | Clean (warnings may still be printed) |
| 1 | One or more validation errors |
| 2 | Not a GEDCOM 7 file |
| 3 | File not found or cannot be read |
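The exit codes make validate7 easy to call from scripts or CI jobs. The sketch below wraps the command with Python's subprocess module and translates the exit code using the table above. It assumes validate7 is installed and on PATH; `run_validate7` and `describe_exit_code` are hypothetical helper names, not part of gedcomtools.

```python
import subprocess

# Exit code meanings copied from the table above.
EXIT_MEANINGS = {
    0: "clean (warnings may still be printed)",
    1: "one or more validation errors",
    2: "not a GEDCOM 7 file",
    3: "file not found or cannot be read",
}


def describe_exit_code(code: int) -> str:
    """Translate a validate7 exit code into a short description."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_validate7(path: str, lenient: bool = False) -> int:
    """Run validate7 on `path` and return its exit code.

    Assumes the validate7 command is available on PATH.
    """
    cmd = ["validate7"]
    if lenient:
        cmd.append("--lenient")
    cmd.append(path)
    return subprocess.run(cmd, check=False).returncode


# Print the mapping without invoking the CLI.
for code in (0, 1, 2, 3):
    print(code, describe_exit_code(code))
```

In a pipeline, `describe_exit_code(run_validate7("family.ged"))` gives a readable status, and any nonzero code can be used to fail the job.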
g7cli: interactive GEDCOM 7 browser/editor
g7cli family.ged
Commands: load, reload, write, validate, info, ls, cd, pwd,
show, find, set, add, rm, help, quit.
gxcli: GEDCOM X CLI
gxcli convert input.ged output.json
Logging
The project uses loggingkit for structured logging.
from gedcomtools.loggingkit import setup_logging, LoggerSpec
mgr = setup_logging("gedcomtools")
mgr.get_sublogger(LoggerSpec(name="conversion"))
mgr.get_sublogger(LoggerSpec(name="parser"))
Library modules use:
from gedcomtools.loggingkit import get_log
log = get_log("conversion")
log.info("Starting conversion")
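The internals of loggingkit are not shown here. As a rough analogy using only the Python standard library, the same pattern (one configured logger for the package, named sub-loggers per subsystem, and no handler setup inside library modules) can be sketched as follows. `setup_package_logging` and `get_sublogger` are hypothetical names and are not loggingkit's API; the logger name "gedcomtools_demo" is chosen so the sketch does not touch the real package logger.

```python
import io
import logging


def setup_package_logging(name: str, stream, level: int = logging.INFO) -> logging.Logger:
    """Attach one handler to the package logger; called once by the application."""
    package_logger = logging.getLogger(name)
    package_logger.setLevel(level)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
    package_logger.addHandler(handler)
    # Keep package output out of the global root logger.
    package_logger.propagate = False
    return package_logger


def get_sublogger(package: str, sub: str) -> logging.Logger:
    """Return a child logger such as 'pkg.conversion'; adds no handlers."""
    return logging.getLogger(f"{package}.{sub}")


# Application side: configure once.
buffer = io.StringIO()
setup_package_logging("gedcomtools_demo", buffer)

# Library side: only fetch a named logger and use it.
log = get_sublogger("gedcomtools_demo", "conversion")
log.info("Starting conversion")
log.debug("filtered out at INFO level")

print(buffer.getvalue(), end="")
```

Because child loggers inherit the level and handlers of their parent, changing the package logger's level adjusts every sub-logger at once, which is the centralized control the design goals below describe.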
Design Goals
- Centralized logging control
- Library-safe imports (no logging side effects)
- Extensible schema support
- Accurate GEDCOM X modeling
- Robust error reporting
- CLI + API parity
- Clear separation of concerns
Roadmap
- GEDCOM 5.x → GEDCOM 7 converter
- GEDCOM X → GEDCOM 7 converter
- JSON-LD export
- RAG pipeline integration
- Full test suite
License
MIT License
Author
David J. Cartwright
Build genealogy tooling like infrastructure: structured, observable, extensible.
File details
Details for the file gedcomtools-0.6.0.tar.gz.
File metadata
- Download URL: gedcomtools-0.6.0.tar.gz
- Upload date:
- Size: 213.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5b01c29430a3b02887dd897381366afcfebf88cb19288ff8e8a5abee274b7d02 |
| MD5 | 0aa70a498715e090639e5c9fa2e377f3 |
| BLAKE2b-256 | d5794f972141666af28d0383f14a367a044312ff88318ec0480e38eebe9baf58 |
File details
Details for the file gedcomtools-0.6.0-py3-none-any.whl.
File metadata
- Download URL: gedcomtools-0.6.0-py3-none-any.whl
- Upload date:
- Size: 237.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b629442f2f29b3e80d32961b1e09c7730bff0fb091d4203905483e17bce494db |
| MD5 | 514f53c928490484fb72f3d020c58ece |
| BLAKE2b-256 | 96b4065fa2da3c694e23e688e5bd9250bda773cbbc8a326cdd748dc2f4cbf48e |