Skip to main content

Extract your Garmin Connect health data to a local SQLite database

Project description

garmin-health-data

Extract your complete Garmin Connect health data to a local SQLite database.

Adapted from the Garmin pipeline in OpenETL, a comprehensive ETL framework with Apache Airflow and PostgreSQL/TimescaleDB. This standalone version of the OpenETL Garmin data pipeline provides the same data extraction and modeling scheme without requiring Airflow or PostgreSQL infrastructure. Built on python-garminconnect for Garmin Connect API usage and Garth for OAuth authentication.

Features

  • โšก Zero Configuration: Single command to get started
  • ๐Ÿฅ Comprehensive Health Data: Sleep, HRV, stress, body battery, heart rate, respiration, VO2 max, training metrics
  • ๐Ÿƒ Activity Data: FIT files with detailed time-series metrics, lap data, split data
  • ๐Ÿ’พ Local Storage: SQLite database - your data stays on your machine
  • ๐Ÿ”„ Auto-Resume: Automatically detects last update and syncs new data
  • ๐Ÿ–ฅ๏ธ Cross-Platform: Works on macOS, Linux, Windows
  • ๐Ÿ›ก๏ธ Privacy-First: No cloud services, no third parties

Requirements

  • Python 3.9 or higher
  • Garmin Connect account
  • Internet connection for data extraction

Quick Start

Installation

pip install garmin-health-data

First-Time Setup

# Authenticate with Garmin Connect (one-time setup)
garmin auth

You'll be prompted for your Garmin Connect email and password. Your credentials are used only to obtain OAuth tokens, which are stored locally in ~/.garminconnect/.

Extract Your Data

# Extract all available data
garmin extract

# View database statistics
garmin info

That's it! Your data is now in a local SQLite database (garmin_data.db).

Usage

Authentication

# Interactive authentication (one-time setup)
garmin auth

# If you have MFA enabled, you'll be prompted for your code
  • garmin auth always performs a fresh login and refreshes your OAuth tokens, even if valid tokens already exist.
  • Tokens are stored locally in ~/.garminconnect/ and are valid for approximately 1 year.
  • You typically only need to run garmin auth once initially or when tokens expire.
  • garmin extract automatically checks for existing tokens and only prompts for authentication if they're missing.
  • Recommendation: Run garmin auth once for initial setup, then just use garmin extract for regular data extraction.

Data Extraction

# Auto-detect date range (extracts from last update to today)
garmin extract

# Specify custom date range
garmin extract --start-date 2024-01-01 --end-date 2024-12-31

# Extract specific data types only
garmin extract --data-types SLEEP --data-types HEART_RATE --data-types ACTIVITY

# Use custom database location
garmin extract --db-path ~/my-garmin-data.db

Date Range Behavior

The date range parameters --start-date and --end-date define the period for data extraction:

  • --start-date: Inclusive, data from this date is included.
  • --end-date: Exclusive, data from this date is NOT included (except when start and end dates are the same, then inclusive).
  • Example: --start-date 2024-01-01 --end-date 2024-01-31 extracts Jan 1-30 (31st excluded).
  • Example: --start-date 2024-01-15 --end-date 2024-01-15 extracts Jan 15 only (same-day inclusive).

Automatic Date Detection

One of the key features of garmin-health-data is that you can run garmin extract anytime without specifying dates, and it automatically continues from where it left off:

  1. First Run (Empty Database)

    • Extracts the last 30 days of data.
    • Creates your initial database.
  2. Subsequent Runs (Existing Data)

    • Queries 10 core time-series tables (sleep, heart_rate, activity, stress, body_battery, steps, respiration, floors, intensity_minutes, training_readiness).
    • Finds the most recent (maximum) date across these tables.
    • Automatically starts from the day after this maximum date.
    • Extracts up to today.

This approach assumes that each automatic extraction covers all data types up to the maximum date, even if some specific data types have no data for certain days (e.g., no activities recorded, no training readiness calculated). Using the maximum date ensures:

  • Only new data is extracted (efficient, no redundant API calls).
  • Gaps in specific data types are automatically filled when available.
  • Simple, predictable behavior for users.

Example:

If your database has sleep data through Dec 20th but activities only through Dec 18th (you didn't exercise on Dec 19-20), the next extraction starts from Dec 21st. This is correct because:

  • Sleep data for Dec 19-20 was already extracted.
  • No activity data exists for Dec 19-20 (you didn't exercise).
  • The Dec 21st extraction will get all available data for that day.

Duplicate Prevention & Reprocessing

This package prevents duplicates through a three-tier approach:

  1. FIT Activity Time-Series: Tracks processed files with ts_data_available flag. Skips already-processed files automatically on re-run.
  2. JSON Wellness Time-Series: Uses INSERT...ON CONFLICT DO NOTHING for idempotent upserts. Reprocessing the same date won't create duplicates.
  3. Main Records (activities, sleep): Uses INSERT...ON CONFLICT DO UPDATE to update existing records with new data.

This means you can safely:

  • Reprocess dates without creating duplicate time-series points
  • Backfill missing data by re-extracting date ranges
  • Retry failed extractions without manual cleanup

GarminDB comparison: GarminDB uses SQLAlchemy session.merge() operations (via insert_or_update() methods) that handle duplicates at the ORM level. However, this behavior is not explicitly documented. garmin-health-data uses explicit SQL-level ON CONFLICT clauses that make idempotency guarantees clear and verifiable at the database level.

Data Types

You can limit extraction to specific data types using the --data-types parameter. If omitted, all data types are extracted. The --data-types parameter accepts the exact values from the "Data Type" column in the Data Types table below (e.g., SLEEP, HEART_RATE, ACTIVITY, STRESS, etc.).

View Database Info

# Show statistics and last update dates
garmin info

Last Update Dates:
   โ€ข Activity: 2024-12-18          # Haven't exercised in 2 days
   โ€ข Body Battery: 2024-12-20       # Up to date
   โ€ข Floors: 2024-12-20             # Up to date
   โ€ข Heart Rate: 2024-12-20         # Up to date
   โ€ข Sleep: 2024-12-20              # Up to date
   โ€ข Steps: 2024-12-20              # Up to date
   โ€ข Stress: 2024-12-20             # Up to date
   ...

# Check specific database
garmin info --db-path ~/my-garmin-data.db

Next garmin extract will start from 2024-12-21 (the day after the maximum date, 2024-12-20), ensuring all data types are updated.

Example Workflow

# Week 1: Initial extraction
$ garmin extract
๐Ÿ“… Using default start date: 2024-11-20 (30 days ago)
๐Ÿ“† Date range: 2024-11-20 to 2024-12-20
โœ… Extracted 1,234 files

# Week 2: Automatic resume (just run the same command!)
$ garmin extract
๐Ÿ“… Auto-detected start date: 2024-12-21 (day after last update)
๐Ÿ“† Date range: 2024-12-21 to 2024-12-27
โœ… Extracted 87 files  # Only new data!

# Week 3: Missed a few days? No problem!
$ garmin extract
๐Ÿ“… Auto-detected start date: 2024-12-28 (day after last update)
๐Ÿ“† Date range: 2024-12-28 to 2025-01-10
โœ… Extracted 156 files  # Automatically fills the gap

Data Types

Data Type Description Frequency
SLEEP Sleep stages, HRV, SpO2, restlessness, scores Per session
HEART_RATE Continuous heart rate measurements 2-min intervals
STRESS Stress levels throughout the day 3-min intervals
RESPIRATION Breathing rate measurements 2-min intervals
TRAINING_READINESS Readiness scores and factors Daily
TRAINING_STATUS VO2 max, load balance, ACWR Daily
STEPS Step counts and activity levels 15-min intervals
FLOORS Floors climbed and descended 15-min intervals
INTENSITY_MINUTES Moderate/vigorous activity minutes 15-min intervals
ACTIVITIES_LIST Detailed activity summaries Per activity
PERSONAL_RECORDS All-time bests across sports As achieved
RACE_PREDICTIONS Predicted race times Periodic updates
USER_PROFILE Demographics, fitness metrics Periodic updates
ACTIVITY Binary FIT files with detailed time-series sensor data Per activity

Database Schema

The SQLite database contains 29 tables organized by category. The complete schema is defined in garmin_health_data/models.py.

SQLite Adaptations

The database schema has been adapted from the original PostgreSQL/TimescaleDB schema in OpenETL to be fully compatible with SQLite, while preserving all relationships and data integrity. Key adaptations include:

  • Removed PostgreSQL schemas - SQLite doesn't support schemas, all tables are in the default namespace.
  • Converted SERIAL to AUTOINCREMENT - PostgreSQL SERIAL types converted to SQLite INTEGER PRIMARY KEY AUTOINCREMENT.
  • Replaced TimescaleDB hypertables - Time-series tables use regular SQLite tables with indexes on timestamp columns for efficient queries.
  • SQLite-compatible upsert syntax - Uses SQLite's INSERT ... ON CONFLICT for handling duplicate records.
  • Preserved all relationships - All foreign key relationships and table structures maintained.

These adaptations ensure the standalone application maintains complete feature parity with the OpenETL Garmin pipeline while using a zero-configuration SQLite database.

Table Structure

User & Profile (2 tables)

user (root table)
โ””โ”€โ”€ user_profile (fitness profile, physical characteristics)

Foreign keys: user_profile โ†’ user.user_id

Activities (8 tables)

activity (main activity records)
โ”œโ”€โ”€ activity_lap_metric (lap-by-lap metrics)
โ”œโ”€โ”€ activity_split_metric (split data)
โ”œโ”€โ”€ activity_ts_metric (time-series sensor data)
โ”œโ”€โ”€ cycling_agg_metrics (cycling-specific aggregates)
โ”œโ”€โ”€ running_agg_metrics (running-specific aggregates)
โ”œโ”€โ”€ swimming_agg_metrics (swimming-specific aggregates)
โ””โ”€โ”€ supplemental_activity_metric (additional activity metrics)

Foreign keys: activity โ†’ user.user_id; all child tables โ†’ activity.activity_id

Sleep Metrics (6 tables)

sleep (main sleep sessions)
โ”œโ”€โ”€ sleep_movement (movement during sleep)
โ”œโ”€โ”€ sleep_restless_moment (restless periods)
โ”œโ”€โ”€ spo2 (blood oxygen saturation)
โ”œโ”€โ”€ hrv (heart rate variability)
โ””โ”€โ”€ breathing_disruption (breathing events)

Foreign keys: sleep โ†’ user.user_id; all child tables โ†’ sleep.sleep_id

Health Time-Series (7 tables)

heart_rate (continuous heart rate measurements)
stress (stress level readings)
body_battery (energy level tracking)
respiration (breathing rate data)
steps (step counts and activity levels)
floors (floors climbed/descended)
intensity_minutes (activity intensity tracking)

Foreign keys: all tables โ†’ user.user_id

Training Metrics (4 tables)

vo2_max (VO2 max estimates)
โ”œโ”€โ”€ acclimation (heat/altitude acclimation)
โ”œโ”€โ”€ training_load (training load metrics)
โ””โ”€โ”€ training_readiness (daily readiness scores)

Foreign keys: all tables โ†’ user.user_id

Records & Predictions (2 tables)

personal_record (personal bests)
race_predictions (predicted race times)

Foreign keys: all tables โ†’ user.user_id; personal_record โ†’ activity.activity_id (optional)

Privacy & Security

  • Your credentials never leave your machine: they're only used to obtain OAuth tokens via garth, stored locally in ~/.garminconnect/.
  • All data stays on your machine: no cloud services involved.
  • No analytics or tracking: this tool doesn't send any data anywhere except querying the Garmin Connect API using the wrapper python-garminconnect.

Comparison with Other Tools

Feature garmin-health-data garmindb garminexport garmin-fetch
Interface CLI CLI CLI GUI
Setup complexity โœ… Single command โš ๏ธ Config file + 2 commands โœ… Single command โš ๏ธ Manual setup
Storage SQLite database SQLite database File export Excel export
Cross-platform โœ… โœ… โœ… โœ…
Health metrics (sleep, HRV, stress) โœ… Comprehensive โš ๏ธ Basic coverage โŒ Activities only โŒ Activities only
Sleep data granularity โœ… 6 tables, 1-min intervals โš ๏ธ 2 tables, less granular โŒ โŒ
FIT file time-series data โœ… All metrics (EAV schema) โš ๏ธ Limited (~10 core fields) โŒ โŒ
Power meter & advanced metrics โœ… Full support โŒ Not captured โŒ โŒ
Database schema quality โœ… Normalized, 29 tables โš ๏ธ ~31 tables, mixed normalization N/A N/A
Duplicate prevention โœ… Explicit SQL ON CONFLICT โš ๏ธ ORM merge (undocumented) N/A N/A
Auto-resume โœ… โœ… โœ… โŒ
Active maintenance โœ… โœ… โœ… โš ๏ธ Limited

Want the full data pipeline with Airflow, scheduled updates, and TimescaleDB? Check out OpenETL's Garmin pipeline.

Detailed Schema Comparison: garmin-health-data vs garmindb

1. Activity Time-Series Data - Major Differentiator

garmin-health-data uses a flexible EAV (Entity-Attribute-Value) schema in the activity_ts_metric table:

  • Schema: (activity_id, timestamp, name, value, units).
  • Captures ALL FIT file metrics: heart rate, power, cadence, GPS coordinates, advanced running dynamics (ground contact time, vertical oscillation, stride length), cycling power metrics (left/right balance, pedal smoothness), swimming metrics, and more.
  • Future-proof: Automatically handles any new metrics Garmin adds without requiring schema changes.
  • Example: A cycling activity with a power meter captures power, left_right_balance, left_pedal_smoothness, right_pedal_smoothness, left_torque_effectiveness, right_torque_effectiveness, etc.

garmindb uses a fixed column schema in the ActivityRecords table:

  • Only ~10 predefined columns: hr, cadence, speed, distance, altitude, temperature, position_lat, position_long, rr.
  • Missing critical data: No power data, no advanced running/cycling dynamics, no device-specific metrics.
  • Limited extensibility: Requires schema changes and code updates to add new metrics.

2. Sleep Data Granularity

garmin-health-data provides comprehensive sleep tracking with 6 tables:

  • sleep: Main sleep session with scores and metadata.
  • sleep_movement: 1-minute interval movement data throughout sleep.
  • hrv: 5-minute interval heart rate variability measurements.
  • spo2: 1-minute interval blood oxygen saturation.
  • breathing_disruption: Event-based breathing disruption timestamps.
  • sleep_restless_moment: Event-based restless moment timestamps.

garmindb uses only 2 tables:

  • Sleep: Main sleep session data.
  • SleepEvents: Sleep events (less granular than garmin-health-data's separate time-series tables).

3. Health Time-Series Organization

garmin-health-data uses separate normalized tables for each metric type:

  • Each metric type (heart_rate, stress, body_battery, respiration, steps, floors, intensity_minutes) has its own table.
  • Consistent schema: (user_id, timestamp, value) plus metric-specific fields.
  • Optimized for time-series queries and analysis.

garmindb uses a mixed approach:

  • Some monitoring tables for specific metrics.
  • Wide DailySummary table containing many aggregated metrics in a single row.
  • Less optimized for granular time-series analysis.

4. Update Strategy & Data Integrity

garmin-health-data uses explicit conflict resolution for idempotent reprocessing:

  • Updatable data (activities, user profile, training status): Uses ON CONFLICT UPDATE to refresh data when reprocessing.
  • Immutable time-series (heart rate, sleep movement, stress): Uses ON CONFLICT DO NOTHING to prevent duplicates.
  • FIT activity time-series: Uses ts_data_available flag check to skip reprocessing, preventing duplicate records entirely.
  • Latest flags: Manages latest=True flags for user_profile, personal_record, race_predictions to track most recent values.
  • Referential integrity: Explicit foreign key relationships with cascade deletes.
  • Fully idempotent: Safe to reprocess the same date range multiple times without creating duplicate data.

garmindb update strategy:

  • Uses SQLAlchemy session.merge() operations via insert_or_update() and s_insert_or_update() methods.
  • Handles duplicates at the ORM level rather than explicit SQL constraints.
  • Implementation detail not documented in README or schema documentation.
  • Idempotency behavior exists but is implicit rather than guaranteed at database level.

5. Sport-Specific Metrics

garmin-health-data provides dedicated tables for each sport:

  • running_agg_metrics: Running cadence, vertical oscillation, ground contact time, stride length, VO2 max.
  • cycling_agg_metrics: Power metrics (avg/max/normalized), cadence, pedal dynamics, FTP.
  • swimming_agg_metrics: Stroke count, SWOLF, pool length, stroke type.

garmindb uses activity-type tables:

  • StepsActivities, PaddleActivities, CycleActivities, ClimbingActivities
  • Less comprehensive sport-specific metrics

Bottom Line: If you use power meters, care about advanced running/cycling metrics, or want comprehensive sleep analysis, garmin-health-data's superior schema design captures significantly more data at higher granularity.

Contributing

Contributions are welcome! Please note:

  • Data extraction and processing logic is synchronized with the openetl Garmin pipeline
  • For changes to extraction/processing logic, please contribute to openetl first, as this application is a wrapper that provides a standalone CLI
  • For CLI-specific features, documentation, or packaging improvements, feel free to contribute directly here

Please feel free to submit a Pull Request.

Support

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

garmin_health_data-1.0.0.tar.gz (64.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

garmin_health_data-1.0.0-py3-none-any.whl (56.8 kB view details)

Uploaded Python 3

File details

Details for the file garmin_health_data-1.0.0.tar.gz.

File metadata

  • Download URL: garmin_health_data-1.0.0.tar.gz
  • Upload date:
  • Size: 64.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for garmin_health_data-1.0.0.tar.gz
Algorithm Hash digest
SHA256 2e008a65bd3abd423a5183c6ae8602cfa0d8932a55115212632983073124f422
MD5 42844abf487f76a86e153abf072a0bd5
BLAKE2b-256 bb505f459f982e07e943b24607a89c1175316c41f1f08c8a6f14f8fe6822e8a0

See more details on using hashes here.

File details

Details for the file garmin_health_data-1.0.0-py3-none-any.whl.

File metadata

File hashes

Hashes for garmin_health_data-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 81b60f08a249a3ff7fc40379450e37d3075433449d0cb93d0fac5386eb7cb166
MD5 20e787793e14a7ec00c0a078f2b4c176
BLAKE2b-256 3dbd8b3c69d915dd14dcb86664b1a156625328509ce77b665b5f9c07ea3457d4

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page