A Python module for parsing and analyzing GEDCOM files.

Simple GEDCOM Parser

A simplified Python library for extracting genealogy data from GEDCOM files, focused on two primary use cases:

  1. Extract basic person data: get a clean list of people with vital information and family relationships
  2. Extract person-source relationships: map people to their documentary sources

Features

  • Parse GEDCOM 5.5 files
  • Extract person data: names, birth/death dates and places, parents
  • Extract source citations linked to individuals
  • Simple, clean API designed for data analysis
  • Easy integration with pandas DataFrames

Quick Start

from gedcom import GedcomParser
import pandas as pd

# Parse GEDCOM file
parser = GedcomParser()
parser.parse_file('family_tree.ged')

# Get list of all people
people = parser.get_person_list()
people_df = pd.DataFrame(people)

# Get person-source relationships
person_sources = parser.get_person_sources()
sources_df = pd.DataFrame(person_sources)

API Reference

GedcomParser

parse_file(file_path, strict=False)

Parse a GEDCOM file.

Parameters:

  • file_path (str): Path to the GEDCOM file
  • strict (bool): If True, raise exceptions on parse errors

get_person_list()

Returns a list of dictionaries containing person data:

[
    {
        'Person ID': '@I1@',
        'First Name': 'John',
        'Last Name': 'Doe',
        'Birth Date': '1990',
        'Birth Place': 'New York',
        'Death Date': '',
        'Death Place': '',
        'Father First Name': 'Robert',
        'Father Last Name': 'Doe',
        'Mother First Name': 'Jane',
        'Mother Last Name': 'Smith'
    },
    # ... more people
]
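
Date values appear to be returned as plain strings, so analysis by year needs a small conversion step. The following is a minimal sketch, assuming that 'Birth Date' holds the raw GEDCOM date text (for example '1990', '12 JAN 1900' or 'ABT 1960') and that the first four-digit number in it is the year. The helper name birth_year is hypothetical and not part of the library, and the sample records are shortened to the fields the helper reads.

```python
import re
from typing import Optional

# Matches the first standalone four-digit number in a date string.
YEAR_PATTERN = re.compile(r"\b(\d{4})\b")


def birth_year(person: dict) -> Optional[int]:
    """Return the birth year as an int, or None if no year is present."""
    match = YEAR_PATTERN.search(person.get('Birth Date') or '')
    return int(match.group(1)) if match else None


# Sample records in the shape shown above (other keys omitted for brevity).
people = [
    {'Person ID': '@I1@', 'First Name': 'John', 'Last Name': 'Doe', 'Birth Date': '1990'},
    {'Person ID': '@I2@', 'First Name': 'Robert', 'Last Name': 'Doe', 'Birth Date': 'ABT 1960'},
    {'Person ID': '@I3@', 'First Name': 'Jane', 'Last Name': 'Smith', 'Birth Date': ''},
]

years = {p['Person ID']: birth_year(p) for p in people}
print(years)
```

For a date range such as 'BET 1800 AND 1810' this sketch returns the first year only, which may or may not be what an analysis needs.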

get_person_sources()

Returns a list of dictionaries with person-source relationships:

[
    {
        'Person ID': '@I1@',
        'Source ID': '@S1@',
        'Source Title': 'Birth Certificate',
        'Source Author': '',
        'Source Publication': 'City Records',
        'Source Repository': 'City Hall'
    },
    # ... more person-source combinations
]
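
Each row pairs one person with one source, so a person with several sources appears several times. As an illustration, the sketch below groups source titles by 'Person ID' and lists the people who have no source at all. It uses only the standard library and works on records shaped like the ones documented above; the function names and the sample data are hypothetical.

```python
from collections import defaultdict


def sources_by_person(person_sources):
    """Map each 'Person ID' to the list of source titles that cite it."""
    grouped = defaultdict(list)
    for row in person_sources:
        grouped[row['Person ID']].append(row['Source Title'])
    return dict(grouped)


def people_without_sources(people, person_sources):
    """Return the IDs of people who never appear in person_sources."""
    cited = {row['Person ID'] for row in person_sources}
    return [p['Person ID'] for p in people if p['Person ID'] not in cited]


# Sample records (only the keys used here are included).
people = [{'Person ID': '@I1@'}, {'Person ID': '@I2@'}]
person_sources = [
    {'Person ID': '@I1@', 'Source ID': '@S1@', 'Source Title': 'Birth Certificate'},
    {'Person ID': '@I1@', 'Source ID': '@S2@', 'Source Title': 'Census Record'},
]

grouped = sources_by_person(person_sources)
unsourced = people_without_sources(people, person_sources)
print(grouped)
print(unsourced)
```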

Example Usage

Basic Person Data

from gedcom import GedcomParser

parser = GedcomParser()
parser.parse_file('my_family.ged')

# Get all people
people = parser.get_person_list()

# Print summary
print(f"Found {len(people)} people in the family tree")

# Look at first person
if people:
    person = people[0]
    print(f"Name: {person['First Name']} {person['Last Name']}")
    print(f"Born: {person['Birth Date']} in {person['Birth Place']}")

With Pandas

import pandas as pd
from gedcom import GedcomParser

parser = GedcomParser()
parser.parse_file('my_family.ged')

# Create DataFrames
people_df = pd.DataFrame(parser.get_person_list())
sources_df = pd.DataFrame(parser.get_person_sources())

# Analyze the data
print("Birth places:")
print(people_df['Birth Place'].value_counts())

print("\nSource titles:")
print(sources_df['Source Title'].value_counts())

Requirements

  • Python 3.6+
  • No external dependencies for core functionality
  • pandas (optional, for DataFrame examples)
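
Because pandas is optional, the records can also be exported with the standard library alone. The sketch below writes a list of person dictionaries as CSV using csv.DictWriter; the helper name write_records_csv is hypothetical, and it assumes every record has the same keys as the first one.

```python
import csv
import io


def write_records_csv(records, stream):
    """Write a list of dicts to a text stream as CSV; return the row count."""
    if not records:
        return 0
    # Column order follows the key order of the first record.
    writer = csv.DictWriter(stream, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return len(records)


# Sample records (shortened); a real call would pass parser.get_person_list().
people = [
    {'Person ID': '@I1@', 'First Name': 'John', 'Last Name': 'Doe'},
    {'Person ID': '@I2@', 'First Name': 'Jane', 'Last Name': 'Smith'},
]

buffer = io.StringIO()
count = write_records_csv(people, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with open('people.csv', 'w', newline='') in place of the StringIO buffer.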

License

This project is licensed under the GNU General Public License v2.0 - see the LICENSE file for details.

Attribution

This project is derived from python-gedcom by Nicklas Reincke and contributors. The original project provided the foundation for GEDCOM parsing, which has been simplified and focused for specific genealogy data extraction use cases.

Original Copyright (C) 2018-2019 Nicklas Reincke and contributors
Simplified version Copyright (C) 2025 [mcobtechnology]

Contributing

This is a simplified, focused library. If you need additional GEDCOM functionality, consider using the full-featured python-gedcom library.

For bug fixes and improvements to the core functionality, feel free to open issues or submit pull requests.
