Skip to main content

Extract your Garmin Connect health data to a local SQLite database

Project description

Extract your complete Garmin Connect health and activity data to a local SQLite database.

Adapted from the Garmin pipeline in OpenETL, a comprehensive ETL framework with Apache Airflow and PostgreSQL/TimescaleDB. This standalone version of the OpenETL Garmin data pipeline provides the same data extraction and modeling scheme without requiring Airflow or PostgreSQL infrastructure. Built on python-garminconnect for Garmin Connect API usage and Garth for OAuth authentication.

Features

  • โšก Zero Configuration: Single command to get started.
  • ๐Ÿ–ฅ๏ธ Cross-Platform: Works on macOS, Linux, Windows.
  • ๐Ÿ’พ Local Storage: SQLite database - your data stays on your machine.
  • ๐Ÿฅ Comprehensive Health Data: Sleep, HRV, stress, body battery, heart rate, respiration, VO2 max, training metrics.
  • ๐Ÿƒ Activity Data: FIT files with detailed time-series metrics, lap data, split data.
  • ๐Ÿ”„ Auto-Resume: Automatically detects last update and syncs new data.

Requirements

  • Python 3.9 or higher
  • Garmin Connect account
  • Internet connection for data extraction

Quick Start

Installation

pip install garmin-health-data

First-Time Setup

# Authenticate with Garmin Connect (one-time setup)
garmin auth

You'll be prompted for your Garmin Connect email and password. Your credentials are used only to obtain OAuth tokens, which are stored locally in ~/.garminconnect/.

Extract Your Data

# Extract all available data
garmin extract

# View database statistics
garmin info

That's it! Your data is now in a local SQLite database (garmin_data.db).

Usage

Authentication

# Interactive authentication (one-time setup)
garmin auth

# If you have MFA enabled, you'll be prompted for your code
  • garmin auth always performs a fresh login and refreshes your OAuth tokens, even if valid tokens already exist.
  • Tokens are stored locally in ~/.garminconnect/ and are valid for approximately 1 year.
  • You typically only need to run garmin auth once initially or when tokens expire.
  • garmin extract automatically checks for existing tokens and only prompts for authentication if they're missing.
  • Recommendation: Run garmin auth once for initial setup, then just use garmin extract for regular data extraction.

Data Extraction

# Auto-detect date range (extracts from last update to today)
garmin extract

# Specify custom date range
garmin extract --start-date 2024-01-01 --end-date 2024-12-31

# Extract specific data types only
garmin extract --data-types SLEEP --data-types HEART_RATE --data-types ACTIVITY

# Use custom database location
garmin extract --db-path ~/my-garmin-data.db

Date Range Behavior

The date range parameters --start-date and --end-date define the period for data extraction:

  • --start-date: Inclusive, data from this date is included.
  • --end-date: Exclusive, data from this date is NOT included (except when start and end dates are the same, then inclusive).
  • Example: --start-date 2024-01-01 --end-date 2024-01-31 extracts Jan 1-30 (31st excluded).
  • Example: --start-date 2024-01-15 --end-date 2024-01-15 extracts Jan 15 only (same-day inclusive).

Automatic Date Detection

One of the key features of garmin-health-data is that you can run garmin extract anytime without specifying dates, and it automatically continues from where it left off:

  1. First Run (Empty Database)

    • Extracts the last 30 days of data.
    • Creates your initial database.
  2. Subsequent Runs (Existing Data)

    • Queries 10 core time-series tables (sleep, heart_rate, activity, stress, body_battery, steps, respiration, floors, intensity_minutes, training_readiness).
    • Finds the most recent (maximum) date across these tables.
    • Automatically starts from the day after this maximum date.
    • Extracts up to today.

This approach assumes that each automatic extraction covers all data types up to the maximum date, even if some specific data types have no data for certain days (e.g., no activities recorded, no training readiness calculated). Using the maximum date ensures:

  • Only new data is extracted (efficient, no redundant API calls).
  • Gaps in specific data types are automatically filled when available.
  • Simple, predictable behavior for users.

Example:

If your database has sleep data through Dec 20th but activities only through Dec 18th (you didn't exercise on Dec 19-20), the next extraction starts from Dec 21st. This is correct because:

  • Sleep data for Dec 19-20 was already extracted.
  • No activity data exists for Dec 19-20 (you didn't exercise).
  • The Dec 21st extraction will get all available data for that day.

Duplicate Prevention & Reprocessing

This package prevents duplicates through a three-tier approach:

  1. FIT Activity Time-Series: Tracks processed files with ts_data_available flag. Skips already-processed files automatically on re-run.
  2. JSON Wellness Time-Series: Uses INSERT...ON CONFLICT DO NOTHING for idempotent upserts. Reprocessing the same date won't create duplicates.
  3. Main Records (activities, sleep): Uses INSERT...ON CONFLICT DO UPDATE to update existing records with new data.

This means you can safely:

  • Reprocess dates without creating duplicate time-series points
  • Backfill missing data by re-extracting date ranges
  • Retry failed extractions without manual cleanup

GarminDB comparison: GarminDB uses SQLAlchemy session.merge() operations (via insert_or_update() methods) that handle duplicates at the ORM level. However, this behavior is not explicitly documented. garmin-health-data uses explicit SQL-level ON CONFLICT clauses that make idempotency guarantees clear and verifiable at the database level.

Data Types

You can limit extraction to specific data types using the --data-types parameter. If omitted, all data types are extracted. The --data-types parameter accepts the exact values from the "Data Type" column in the Data Types table below (e.g., SLEEP, HEART_RATE, ACTIVITY, STRESS, etc.).

View Database Info

# Show statistics and last update dates
garmin info

Last Update Dates:
   โ€ข Activity: 2024-12-18          # Haven't exercised in 2 days
   โ€ข Body Battery: 2024-12-20       # Up to date
   โ€ข Floors: 2024-12-20             # Up to date
   โ€ข Heart Rate: 2024-12-20         # Up to date
   โ€ข Sleep: 2024-12-20              # Up to date
   โ€ข Steps: 2024-12-20              # Up to date
   โ€ข Stress: 2024-12-20             # Up to date
   ...

# Check specific database
garmin info --db-path ~/my-garmin-data.db

Next garmin extract will start from 2024-12-21 (the day after the maximum date, 2024-12-20), ensuring all data types are updated.

Example Workflow

# Week 1: Initial extraction
$ garmin extract
๐Ÿ“… Using default start date: 2024-11-20 (30 days ago)
๐Ÿ“† Date range: 2024-11-20 to 2024-12-20
โœ… Extracted 1,234 files

# Week 2: Automatic resume (just run the same command!)
$ garmin extract
๐Ÿ“… Auto-detected start date: 2024-12-21 (day after last update)
๐Ÿ“† Date range: 2024-12-21 to 2024-12-27
โœ… Extracted 87 files  # Only new data!

# Week 3: Missed a few days? No problem!
$ garmin extract
๐Ÿ“… Auto-detected start date: 2024-12-28 (day after last update)
๐Ÿ“† Date range: 2024-12-28 to 2025-01-10
โœ… Extracted 156 files  # Automatically fills the gap

Data Types

Data Type Description Frequency
SLEEP Sleep stages, HRV, SpO2, restlessness, scores Per session
HEART_RATE Continuous heart rate measurements 2-min intervals
STRESS Stress levels throughout the day 3-min intervals
RESPIRATION Breathing rate measurements 2-min intervals
TRAINING_READINESS Readiness scores and factors Daily
TRAINING_STATUS VO2 max, load balance, ACWR Daily
STEPS Step counts and activity levels 15-min intervals
FLOORS Floors climbed and descended 15-min intervals
INTENSITY_MINUTES Moderate/vigorous activity minutes 15-min intervals
ACTIVITIES_LIST Detailed activity summaries Per activity
PERSONAL_RECORDS All-time bests across sports As achieved
RACE_PREDICTIONS Predicted race times Periodic updates
USER_PROFILE Demographics, fitness metrics Periodic updates
ACTIVITY Binary FIT files with detailed time-series sensor data Per activity

Database Schema

The SQLite database contains 29 tables organized by category. The complete schema is defined in garmin_health_data/tables.ddl following the same pattern as the openetl project. The schema includes inline documentation comments for all tables and columns, which are preserved in the SQLite database.

Viewing inline documentation:

# View schema for a specific table
sqlite3 ~/garmin_data.db "SELECT sql FROM sqlite_master WHERE type='table' AND name='personal_record';"

# View all table schemas
sqlite3 ~/garmin_data.db "SELECT sql FROM sqlite_master WHERE type='table';"

The schema is automatically created when you initialize the database.

SQLite Adaptations

The database schema has been adapted from the original PostgreSQL/TimescaleDB schema in OpenETL to be fully compatible with SQLite, while preserving all relationships and data integrity. Key adaptations include:

  • Removed PostgreSQL schemas - SQLite doesn't support schemas, all tables are in the default namespace.
  • Converted SERIAL to AUTOINCREMENT - PostgreSQL SERIAL types converted to SQLite INTEGER PRIMARY KEY AUTOINCREMENT.
  • Replaced TimescaleDB hypertables - Time-series tables use regular SQLite tables with indexes on timestamp columns for efficient queries.
  • SQLite-compatible upsert syntax - Uses SQLite's INSERT ... ON CONFLICT for handling duplicate records.
  • Preserved all relationships - All foreign key relationships and table structures maintained.

These adaptations ensure the standalone application maintains complete feature parity with the OpenETL Garmin pipeline while using a zero-configuration SQLite database.

Table Structure

User & Profile (2 tables)

user (root table)
โ””โ”€โ”€ user_profile (fitness profile, physical characteristics)

Foreign keys: user_profile โ†’ user.user_id

Activities (8 tables)

activity (main activity records)
โ”œโ”€โ”€ activity_lap_metric (lap-by-lap metrics)
โ”œโ”€โ”€ activity_split_metric (split data)
โ”œโ”€โ”€ activity_ts_metric (time-series sensor data)
โ”œโ”€โ”€ cycling_agg_metrics (cycling-specific aggregates)
โ”œโ”€โ”€ running_agg_metrics (running-specific aggregates)
โ”œโ”€โ”€ swimming_agg_metrics (swimming-specific aggregates)
โ””โ”€โ”€ supplemental_activity_metric (additional activity metrics)

Foreign keys: activity โ†’ user.user_id; all child tables โ†’ activity.activity_id

Sleep Metrics (6 tables)

sleep (main sleep sessions)
โ”œโ”€โ”€ sleep_movement (movement during sleep)
โ”œโ”€โ”€ sleep_restless_moment (restless periods)
โ”œโ”€โ”€ spo2 (blood oxygen saturation)
โ”œโ”€โ”€ hrv (heart rate variability)
โ””โ”€โ”€ breathing_disruption (breathing events)

Foreign keys: sleep โ†’ user.user_id; all child tables โ†’ sleep.sleep_id

Health Time-Series (7 tables)

heart_rate (continuous heart rate measurements)
stress (stress level readings)
body_battery (energy level tracking)
respiration (breathing rate data)
steps (step counts and activity levels)
floors (floors climbed/descended)
intensity_minutes (activity intensity tracking)

Foreign keys: all tables โ†’ user.user_id

Training Metrics (4 tables)

vo2_max (VO2 max estimates)
โ”œโ”€โ”€ acclimation (heat/altitude acclimation)
โ”œโ”€โ”€ training_load (training load metrics)
โ””โ”€โ”€ training_readiness (daily readiness scores)

Foreign keys: all tables โ†’ user.user_id

Records & Predictions (2 tables)

personal_record (personal bests)
race_predictions (predicted race times)

Foreign keys: all tables โ†’ user.user_id; personal_record โ†’ activity.activity_id (optional)

Privacy & Security

  • Your credentials never leave your machine: they're only used to obtain OAuth tokens via garth, stored locally in ~/.garminconnect/.
  • All data stays on your machine: no cloud services involved.
  • No analytics or tracking: this tool doesn't send any data anywhere except querying the Garmin Connect API using the wrapper python-garminconnect.

Comparison With Other Tools

garmin-health-data is designed for comprehensive data extraction with a well-structured relational schema that supports both human-powered analytics and LLM-powered analysis via agents querying the locally created SQLite file. It extracts complete FIT file data with per-second activity metrics, 1-minute sleep intervals, and sport-specific tables for detailed analysis. The normalized 29-table schema with explicit SQL constraints ensures data integrity and makes it easy to understand relationships for complex queries, power zone analysis, running dynamics, and long-term trend studies.

garmy is optimized for programmatic access to the Garmin Connect API, particularly useful for AI assistant integration via its built-in MCP (Model Context Protocol) server. It enables real-time interaction with Claude Desktop or custom chatbots for quick daily insights and summaries. However, it's limited to API-provided metrics (daily aggregates only, no FIT file access), making deep analytics or granular time-series analysis impossible. Best suited for lightweight health monitoring apps that prioritize AI integration over comprehensive data collection.

garmindb is a mature and well-documented tool, but has been functionally superseded by garmin-health-data. While it pioneered local Garmin data extraction, it offers less comprehensive schemas (missing power meter data, limited FIT metrics) and uses implicit duplicate handling at the ORM level rather than explicit database constraints. For new projects requiring detailed data extraction and analysis, garmin-health-data is the recommended choice.

Want the full data pipeline with Airflow, scheduled updates, and TimescaleDB? Check out OpenETL's Garmin pipeline.

Feature garmin-health-data garmindb garmy garminexport garmin-fetch
Interface CLI CLI CLI + Python API + MCP CLI GUI
Setup complexity โœ… Single command โš ๏ธ Config file + 2 commands โœ… Single command โœ… Single command โš ๏ธ Manual setup
Storage SQLite database SQLite database SQLite (optional) File export Excel export
Cross-platform โœ… โœ… โœ… โœ… โœ…
Health metrics (sleep, HRV, stress) โœ… Comprehensive โš ๏ธ Basic coverage โš ๏ธ Basic coverage โŒ Activities only โŒ Activities only
Sleep data granularity โœ… 6 tables, 1-min intervals โš ๏ธ 2 tables, less granular โš ๏ธ 1 table, daily aggregate โŒ โŒ
FIT file time-series data โœ… All metrics (EAV schema) โš ๏ธ Limited (~10 core fields) โŒ API-only (no FIT files) โŒ โŒ
Power meter & advanced metrics โœ… Full support โŒ Not captured โŒ API limitations โŒ โŒ
Database schema quality โœ… Normalized, 29 tables โš ๏ธ ~31 tables, mixed normalization โŒ Very simple N/A N/A
Duplicate prevention โœ… Explicit SQL ON CONFLICT โš ๏ธ ORM merge (undocumented) โœ… ORM merge + sync tracking N/A N/A
Auto-resume โœ… โœ… โœ… โœ… โŒ
Active maintenance โœ… โœ… โœ… โœ… โš ๏ธ Limited

Schema Comparison: garmin-health-data vs garmindb vs garmy

Activity Time-Series Data

garmin-health-data uses a flexible EAV (Entity-Attribute-Value) schema in the activity_ts_metric table:

  • Schema: (activity_id, timestamp, name, value, units).
  • Captures ALL FIT file metrics: heart rate, power, cadence, GPS coordinates, advanced running dynamics (ground contact time, vertical oscillation, stride length), cycling power metrics (left/right balance, pedal smoothness), swimming metrics, and more.
  • Future-proof: Automatically handles any new metrics Garmin adds without requiring schema changes.
  • Example: A cycling activity with a power meter captures power, left_right_balance, left_pedal_smoothness, right_pedal_smoothness, left_torque_effectiveness, right_torque_effectiveness, etc.

garmindb uses a fixed column schema in the ActivityRecords table:

  • Only ~10 predefined columns: hr, cadence, speed, distance, altitude, temperature, position_lat, position_long, rr.
  • Missing critical data: No power data, no advanced running/cycling dynamics, no device-specific metrics.
  • Limited extensibility: Requires schema changes and code updates to add new metrics.

garmy (API-only approach):

  • No per-second activity data: API provides only aggregated summaries (avg/max HR, duration, training load).
  • No FIT file access: Cannot capture detailed time-series metrics that exist only in device files.

Sport-Specific Metrics

garmin-health-data provides dedicated tables for each sport:

  • running_agg_metrics: Running cadence, vertical oscillation, ground contact time, stride length, VO2 max.
  • cycling_agg_metrics: Power metrics (avg/max/normalized), cadence, pedal dynamics, FTP.
  • swimming_agg_metrics: Stroke count, SWOLF, pool length, stroke type.

garmindb uses activity-type tables:

  • StepsActivities, PaddleActivities, CycleActivities, ClimbingActivities
  • Less comprehensive sport-specific metrics

garmy uses basic activity records:

  • activities: Simple table with activity name, duration, avg HR, training load.
  • No sport-specific metrics: API doesn't provide detailed power/cadence/dynamics data.

Sleep Data Granularity

garmin-health-data provides comprehensive sleep tracking with 6 tables:

  • sleep: Main sleep session with scores and metadata.
  • sleep_movement: 1-minute interval movement data throughout sleep.
  • hrv: 5-minute interval heart rate variability measurements.
  • spo2: 1-minute interval blood oxygen saturation.
  • breathing_disruption: Event-based breathing disruption timestamps.
  • sleep_restless_moment: Event-based restless moment timestamps.

garmindb uses only 2 tables:

  • Sleep: Main sleep session data.
  • SleepEvents: Sleep events (less granular than garmin-health-data's separate time-series tables).

garmy uses 1 table with daily aggregates:

  • daily_health_metrics: Single row per day with summary columns (total hours, deep/light/REM percentages).
  • No per-minute data: Cannot analyze sleep cycles, movement patterns, or SpO2 fluctuations throughout the night.

Health Time-Series Organization

garmin-health-data uses separate normalized tables for each metric type:

  • Each metric type (heart_rate, stress, body_battery, respiration, steps, floors, intensity_minutes) has its own table.
  • Consistent schema: (user_id, timestamp, value) plus metric-specific fields.
  • Optimized for time-series queries and analysis.

garmindb uses a mixed approach:

  • Some monitoring tables for specific metrics.
  • Wide DailySummary table containing many aggregated metrics in a single row.
  • Less optimized for granular time-series analysis.

garmy uses normalized tables optimized for API sync:

  • daily_health_metrics: Wide table (~50 columns) for daily summaries.
  • timeseries: High-frequency data when available from API (heart rate, stress, body battery).
  • sync_status: Tracks which metrics have been synced for each date.

Update Strategy & Data Integrity

garmin-health-data uses explicit conflict resolution for idempotent reprocessing:

  • Updatable data (activities, user profile, training status): Uses ON CONFLICT UPDATE to refresh data when reprocessing.
  • Immutable time-series (heart rate, sleep movement, stress): Uses ON CONFLICT DO NOTHING to prevent duplicates.
  • FIT activity time-series: Uses ts_data_available flag check to skip reprocessing, preventing duplicate records entirely.
  • Latest flags: Manages latest=True flags for user_profile, personal_record, race_predictions to track most recent values.
  • Referential integrity: Explicit foreign key relationships with cascade deletes.
  • Fully idempotent: Safe to reprocess the same date range multiple times without creating duplicate data.

garmindb update strategy:

  • Uses SQLAlchemy session.merge() operations via insert_or_update() and s_insert_or_update() methods.
  • Handles duplicates at the ORM level rather than explicit SQL constraints.
  • Implementation detail not documented in README or schema documentation.
  • Idempotency behavior exists but is implicit rather than guaranteed at database level.

garmy update strategy:

  • Uses SQLAlchemy session.merge() for upserts + sync_status table for tracking.
  • Sync-aware: Tracks which metrics have been synced for each date to avoid redundant API calls.
  • Status tracking: Records pending, completed, failed, or skipped status per metric/date.

Contributing

Contributions are welcome! Please note:

  • Data extraction and processing logic is synchronized with the openetl Garmin pipeline
  • For changes to extraction/processing logic, please contribute to openetl first, as this application is a wrapper that provides a standalone CLI
  • For CLI-specific features, documentation, or packaging improvements, feel free to contribute directly here

Please feel free to submit a Pull Request.

Support

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

garmin_health_data-2.0.0.tar.gz (77.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

garmin_health_data-2.0.0-py3-none-any.whl (67.1 kB view details)

Uploaded Python 3

File details

Details for the file garmin_health_data-2.0.0.tar.gz.

File metadata

  • Download URL: garmin_health_data-2.0.0.tar.gz
  • Upload date:
  • Size: 77.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for garmin_health_data-2.0.0.tar.gz
Algorithm Hash digest
SHA256 2f3b190a9cf933e9fb055f6afadcfad58484934bf65bccaac71727617c571750
MD5 f3aaf05d9d0b95cd33eedc7e25307d35
BLAKE2b-256 8b11ce5b758a7335da8d38cca4b65473f9e927f8ff3d33a7c0e9a458e5dbbce4

See more details on using hashes here.

File details

Details for the file garmin_health_data-2.0.0-py3-none-any.whl.

File metadata

File hashes

Hashes for garmin_health_data-2.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0e71c904993370240092fd9c5ae39cf0633e03b022fb1052bb9c49a282094f55
MD5 6986d157d799f55110ca9aa6825d8a15
BLAKE2b-256 ba2295f5cc36df492e7d2c66bf37caf1c0fca0be324286129a8db9514fdd974b

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page