Python toolkit for CGM analysis with individually callable glycemic metrics and ML-ready feature extraction.

GlycoSignal

Analyze continuous glucose monitor (CGM) data in Python with individually callable glycemic metrics, sliding-window pipelines, and ML-ready feature matrices.


Installation

pip install GlycoSignal

Optional extras:

pip install "GlycoSignal[report]"  # HTML reports (adds Jinja2)

Input Data Format

GlycoSignal reads CSV files. The only required columns are a timestamp and a glucose value.

| Column | Type | Required | Description |
| --- | --- | --- | --- |
| Timestamp | datetime | Yes | Reading timestamp (any format pandas can parse) |
| Glucose | float | Yes | Glucose value in mg/dL |
| subject | string | Multi-subject only | Subject or patient identifier |

Column names are auto-detected (case-insensitive). Recognized alternatives:

  • Timestamp: Timestamp, time, datetime, date_time, date
  • Glucose: Glucose, Glucose Value (mg/dL), gl, sgv, glucose_mg_dl, bg, blood_glucose
  • Subject: subject, id, ptid, patient_id, subjectid
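
As an illustration of what case-insensitive auto-detection amounts to, here is a minimal sketch using only the standard library. The `ALIASES` table and the `detect_columns` helper are hypothetical names built from the list above; they are not GlycoSignal's internal implementation.

```python
# Hypothetical alias table copied from the list above (lowercased for matching).
ALIASES = {
    "timestamp": ["timestamp", "time", "datetime", "date_time", "date"],
    "glucose": ["glucose", "glucose value (mg/dl)", "gl", "sgv",
                "glucose_mg_dl", "bg", "blood_glucose"],
    "subject": ["subject", "id", "ptid", "patient_id", "subjectid"],
}


def detect_columns(header):
    """Map each role to the first header name matching an alias, ignoring case."""
    lowered = {name.strip().lower(): name for name in header}
    found = {}
    for role, names in ALIASES.items():
        for alias in names:
            if alias in lowered:
                found[role] = lowered[alias]  # keep the original spelling
                break
    return found


detected = detect_columns(["Time", "SGV", "PtID"])
print(detected)
```

A role that has no matching header is simply absent from the result, which is the situation where you would pass the column names explicitly.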

If your column names are not recognized, pass them explicitly:

df = glycosignal.load_csv("data.csv", timestamp_col="time_utc", glucose_col="bg_mg_dl")

Single subject

Timestamp,Glucose
2024-01-15 08:00:00,123
2024-01-15 08:05:00,121
2024-01-15 08:10:00,125

Multiple subjects

One file with a subject column:

Timestamp,Glucose,subject
2024-01-15 08:00:00,123,P001
2024-01-15 08:00:00,135,P002
2024-01-15 08:05:00,121,P001

from glycosignal import io

df = io.load_cgm_file("all_subjects.csv", subject_col="ptid")

One CSV per subject in a folder (subject column derived from filename):

df = io.load_cgm_folder("data/subjects/")

When windowing or building feature maps from multi-subject data, pass group_col:

from glycosignal import windows, features

result = windows.create_sliding_windows(df, window_hours=24, group_col="subject")
X = features.build_feature_map(result.windows)

Unit conversion

GlycoSignal expects glucose in mg/dL. If your data is in mmol/L, convert it first:

from glycosignal import preprocessing

df = preprocessing.convert_units(df, from_unit="mmol/L", to_unit="mg/dL")
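
For reference, the arithmetic behind the conversion can be sketched as follows. The factor 18.016 is an approximation derived from the molar mass of glucose (about 180.16 g/mol); the exact constant used by `convert_units` is not stated here, so treat the value and the helper names as assumptions.

```python
# Approximate conversion factor: 1 mmol/L of glucose is about 18.016 mg/dL.
# Assumption: GlycoSignal may use a slightly different constant.
MGDL_PER_MMOLL = 18.016


def mmol_to_mgdl(value):
    """Convert a glucose value from mmol/L to mg/dL."""
    return value * MGDL_PER_MMOLL


def mgdl_to_mmol(value):
    """Convert a glucose value from mg/dL to mmol/L."""
    return value / MGDL_PER_MMOLL


converted = mmol_to_mgdl(5.5)
print(round(converted, 1))
```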

Device-specific loaders

from glycosignal import io

df = io.load_dexcom("dexcom_export.csv")   # skips Dexcom header row
df = io.load_libre("libre_export.csv")     # skips Libre 2-row header

Quickstart

import glycosignal

df = glycosignal.load_csv("examples/sample_cgm.csv")  # sample file included in repo
df = glycosignal.clean_cgm(df)

# One metric
print(glycosignal.mean_glucose(df))                          # 138.5
print(glycosignal.time_in_range_percent(df, low=70, high=180))  # 93.1

# Full feature matrix (32 features, one row per 24h window)
result = glycosignal.create_sliding_windows(df, window_hours=24)
X = glycosignal.build_feature_map(result.windows)
print(X.shape)

Computing Glycemic Metrics

Individual metrics

Every metric is a standalone function that you can call directly on any cleaned DataFrame:

from glycosignal import metrics

metrics.mean_glucose(df)                             # 138.5
metrics.cv(df)                                       # 17.7 (%)
metrics.time_in_range_percent(df, low=70, high=180)  # 93.1
metrics.lbgi(df)                                     # 0.01
metrics.mage(df)                                     # 27.0
metrics.gri(df)                                      # 7.2

Grouped summaries

metrics.basic_stats(df)
# {'mean': 138.5, 'median': 132.5, 'min': 102.0, 'max': 193.0, 'q1': 117.0, 'q3': 158.25}

metrics.variability_metrics(df)
# {'sd': 24.5, 'cv': 17.7, 'j_index': 26.6, 'mage': 27.0}

metrics.risk_indices(df)
# {'lbgi': 0.01, 'hbgi': 2.3, 'adrr': 10.5, 'gri': 7.2}

metrics.summary_dict(df)    # all of the above in one dict
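
To show how the variability values relate to one another, here is a minimal plain-Python sketch of the SD, CV, and J-index formulas from the metric reference (population SD, dividing by N). The `variability` helper is a hypothetical name for illustration, not the library's code.

```python
import math


def variability(values):
    """Compute mean, population SD, CV (%), and J-index for glucose values in mg/dL."""
    n = len(values)
    mean = sum(values) / n
    # Population SD: divide by N, as in the metric reference table.
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    cv = sd / mean * 100
    j_index = 0.001 * (mean + sd) ** 2
    return {"mean": mean, "sd": sd, "cv": cv, "j_index": j_index}


stats = variability([100, 120, 140, 160, 180])
print({name: round(value, 2) for name, value in stats.items()})
```

The sample outputs shown above are consistent with these formulas: 24.5 / 138.5 × 100 ≈ 17.7 for `cv`, and 0.001 × (138.5 + 24.5)² ≈ 26.6 for `j_index`.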

Performance tip: Call prepare() once when computing many metrics on the same data:

from glycosignal.schemas import prepare

p = prepare(df)
metrics.mean_glucose(p)
metrics.cv(p)
metrics.lbgi(p)

Full metric reference

Callable metric functions

All functions accept a DataFrame or PreparedCGMData object.

| Function | Description | Computation |
| --- | --- | --- |
| Basic stats | | |
| mean_glucose(data) | Mean BGL | μ = (1/N) Σ Xᵢ |
| median_glucose(data) | Median BGL | Middle value of sorted readings |
| min_glucose(data) | Minimum BGL | Min(X₁, ..., Xₙ) |
| max_glucose(data) | Maximum BGL | Max(X₁, ..., Xₙ) |
| q1_glucose(data) | First quartile of BGL | Q1 = Percentile(X, 25) |
| q3_glucose(data) | Third quartile of BGL | Q3 = Percentile(X, 75) |
| Variability | | |
| sd(data) | Standard deviation of BGL | σ = √(Σ(Xᵢ - μ)² / N) |
| cv(data) | Coefficient of variation | CV = (σ / μ) × 100 |
| j_index(data) | J-index | J = 0.001 × (μ + σ)² |
| mage(data) | Mean Amplitude of Glucose Excursions | Mean of alternating peak-nadir amplitudes exceeding σ |
| conga24(data) | Continuous Overall Net Glycemic Action | SD of {G(t) - G(t - 24h)} for all matched pairs |
| Time-in-range | | |
| time_in_range_minutes(data, low, high) | Minutes inside [low, high] | TIR = Δt × Σ(low ≤ BGL(t) ≤ high) |
| time_in_range_percent(data, low, high) | Percent time inside [low, high] | TIR% = (TIR / T) × 100 |
| time_below_range_minutes(data, threshold) | Minutes below threshold | TBR = Δt × Σ(BGL(t) ≤ threshold) |
| time_below_range_percent(data, threshold) | Percent time below threshold | TBR% = (TBR / T) × 100 |
| time_above_range_minutes(data, threshold) | Minutes above threshold | TAR = Δt × Σ(BGL(t) ≥ threshold) |
| time_above_range_percent(data, threshold) | Percent time above threshold | TAR% = (TAR / T) × 100 |
| time_outside_range_minutes(data, low, high) | Minutes outside [low, high] | TOR = Δt × Σ(BGL < low or BGL > high) |
| time_outside_range_percent(data, low, high) | Percent time outside [low, high] | TOR% = (TOR / T) × 100 |
| Risk indices | | |
| lbgi(data) | Low Blood Glucose Index | LBGI = (1/N) Σ rl(Xᵢ); f(X) = (ln X)^1.084 - 5.381; rl = 22.77 × f² if f ≤ 0, else 0 |
| hbgi(data) | High Blood Glucose Index | HBGI = (1/N) Σ rh(Xᵢ); rh = 22.77 × f² if f > 0, else 0 |
| adrr(data) | Average Daily Risk Range | ADRR = Max(rl) + Max(rh) |
| gri(data) | Glucose Risk Index | GRI = 3.0×%TBR₅₄ + 2.4×%TBR₇₀ + 1.6×%TAR₂₅₀ + 0.8×%TAR₁₈₀, capped at 100 |
| Excursions | | |
| mean_glucose_excursion(data) | Mean BGL outside mean ± SD | Mean of Xᵢ where Xᵢ < μ - σ or Xᵢ > μ + σ |
| mean_glucose_normal(data) | Mean BGL inside mean ± SD | Mean of Xᵢ where μ - σ ≤ Xᵢ ≤ μ + σ |
| Peak counts | | |
| count_peaks(data, threshold) | Episodes above threshold | Count of rising-edge crossings above threshold |
| count_peaks_in_range(data, lower, upper) | Episodes entering [lower, upper] | Count of rising-edge entries into [lower, upper] |

BGL = blood glucose level, N = number of readings, Xᵢ = glucose value of reading i, μ = mean, σ = SD, Δt = interval between readings, T = total monitoring time.
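
The LBGI and HBGI rows can be read as the following minimal sketch, which applies the transform f(X) = (ln X)^1.084 - 5.381 and the 22.77 × f² risk function exactly as written in the table. The helper names are hypothetical and the code is an illustration of the formulas, not the library's source.

```python
import math


def risk_components(glucose_mgdl):
    """Return (rl, rh) for one reading in mg/dL; exactly one of them is non-zero.

    Assumes glucose_mgdl > 1 so that ln(X) is positive.
    """
    f = math.log(glucose_mgdl) ** 1.084 - 5.381
    risk = 22.77 * f * f
    # Negative f means the low-glucose side, positive f the high-glucose side.
    return (risk, 0.0) if f <= 0 else (0.0, risk)


def lbgi_hbgi(values):
    """Average the low and high risk components over all readings."""
    pairs = [risk_components(x) for x in values]
    n = len(pairs)
    lbgi = sum(rl for rl, _ in pairs) / n
    hbgi = sum(rh for _, rh in pairs) / n
    return lbgi, hbgi


low_risk, high_risk = lbgi_hbgi([50, 112.5, 300])
print(round(low_risk, 2), round(high_risk, 2))
```

With these constants the transform crosses zero near 112.5 mg/dL, so readings around that value contribute almost nothing to either index.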


Building ML Feature Matrices

The full pipeline: load, clean, window, extract features, train.

import glycosignal
from glycosignal import windows, features
from sklearn.ensemble import RandomForestClassifier

df = glycosignal.load_csv("cgm.csv")
df = glycosignal.clean_cgm(df)

result = windows.create_sliding_windows(df, window_hours=24, overlap_hours=0)
X = features.build_feature_map(result.windows)

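# Keep only the feature columns; drop identifier columns.
feature_cols = [c for c in X.columns if c not in ("window_id", "subject", "date")]
# y is not produced by GlycoSignal: supply one label per row of X (one per window).
clf = RandomForestClassifier()
clf.fit(X[feature_cols], y)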
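
To make the windowing step concrete, here is a minimal standard-library sketch of fixed-length sliding windows with optional overlap. It is an assumption about the general technique, not a description of how `create_sliding_windows` is implemented; in particular, keeping a partial final window is a choice made for this sketch only.

```python
from datetime import datetime, timedelta


def sliding_windows(readings, window_hours=24, overlap_hours=0):
    """Split time-sorted (timestamp, glucose) pairs into fixed-length windows."""
    if not 0 <= overlap_hours < window_hours:
        raise ValueError("overlap_hours must be in [0, window_hours)")
    if not readings:
        return []
    width = timedelta(hours=window_hours)
    step = timedelta(hours=window_hours - overlap_hours)
    start, last = readings[0][0], readings[-1][0]
    windows = []
    while start <= last:
        end = start + width
        chunk = [(t, g) for t, g in readings if start <= t < end]
        if chunk:  # skip windows that fall entirely inside a data gap
            windows.append(chunk)
        start += step
    return windows


t0 = datetime(2024, 1, 15, 8, 0)
data = [(t0 + timedelta(hours=h), 100 + h) for h in range(48)]  # hourly, two days

no_overlap = sliding_windows(data, window_hours=24, overlap_hours=0)
half_overlap = sliding_windows(data, window_hours=24, overlap_hours=12)
print(len(no_overlap), len(half_overlap))
```

Each window then becomes one row of the feature matrix, which is why the label vector needs one entry per window.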

Select specific features:

X = features.build_feature_map(
    result.windows,
    feature_names=["mean_glucose", "cv", "tir_70_180_pct", "mage", "lbgi"],
)

Feature vector for a single window:

features.build_feature_vector(window_df, feature_names=["mean_glucose", "cv"])
# {'mean_glucose': 138.5, 'cv': 17.7}

Feature table from a list of DataFrames (one per subject):

features.build_feature_table(
    [df_s01, df_s02, df_s03],
    record_ids=["S01", "S02", "S03"],
)

Feature registry

GlycoSignal has 32 built-in features organized by category, pre-wired to standard clinical thresholds.

glycosignal.list_features()                         # all 32 names
glycosignal.list_features(category="risk")          # ['adrr', 'gri', 'hbgi', 'lbgi']
glycosignal.get_feature_metadata()                  # DataFrame: name | description | category
glycosignal.get_feature("gri").description          # 'Glucose Risk Index (Klonoff et al. 2023)'

| Feature name | Category | Description |
| --- | --- | --- |
| mean_glucose | basic_stats | Mean glucose (mg/dL) |
| median_glucose | basic_stats | Median glucose (mg/dL) |
| min_glucose | basic_stats | Minimum glucose (mg/dL) |
| max_glucose | basic_stats | Maximum glucose (mg/dL) |
| q1_glucose | basic_stats | 25th percentile glucose (mg/dL) |
| q3_glucose | basic_stats | 75th percentile glucose (mg/dL) |
| sd | variability | Standard deviation (mg/dL) |
| cv | variability | Coefficient of variation (%) |
| j_index | variability | J-index: 0.001 × (mean + SD)² |
| mage | variability | Mean Amplitude of Glucose Excursions (mg/dL) |
| conga24 | variability | SD of glucose differences 24h apart (mg/dL) |
| tir_70_180_min | time_in_range | Minutes in target range 70–180 mg/dL |
| tir_70_180_pct | time_in_range | Percent time in target range 70–180 mg/dL |
| tir_70_140_min | time_in_range | Minutes in tight range 70–140 mg/dL |
| tir_70_140_pct | time_in_range | Percent time in tight range 70–140 mg/dL |
| tbr_70_min | time_in_range | Minutes below 70 mg/dL (level 1 hypoglycemia) |
| tbr_70_pct | time_in_range | Percent time below 70 mg/dL |
| tbr_54_min | time_in_range | Minutes below 54 mg/dL (level 2 hypoglycemia) |
| tbr_54_pct | time_in_range | Percent time below 54 mg/dL |
| tar_180_min | time_in_range | Minutes above 180 mg/dL (level 1 hyperglycemia) |
| tar_180_pct | time_in_range | Percent time above 180 mg/dL |
| tar_250_min | time_in_range | Minutes above 250 mg/dL (level 2 hyperglycemia) |
| tar_250_pct | time_in_range | Percent time above 250 mg/dL |
| lbgi | risk | Low Blood Glucose Index |
| hbgi | risk | High Blood Glucose Index |
| adrr | risk | Average Daily Risk Range |
| gri | risk | Glucose Risk Index (Klonoff et al. 2023) |
| mean_glucose_excursion | excursion | Mean of readings outside mean ± 1SD (mg/dL) |
| mean_glucose_normal | excursion | Mean of readings inside mean ± 1SD (mg/dL) |
| peaks_above_140 | peak | Episodes above 140 mg/dL (count) |
| peaks_above_180 | peak | Episodes above 180 mg/dL (count) |
| peaks_above_250 | peak | Episodes above 250 mg/dL (count) |
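
To illustrate what the time-in-range and peak-count features measure, here is a minimal plain-Python sketch. It assumes equally spaced readings, so the share of readings inside the range equals the share of time, and it counts an episode whenever the series moves from at-or-below the threshold to above it. Both helpers are hypothetical and only approximate the definitions in the metric reference.

```python
def time_in_range_percent(values, low=70, high=180):
    """Percent of equally spaced readings with low <= value <= high."""
    inside = sum(1 for x in values if low <= x <= high)
    return 100.0 * inside / len(values)


def count_rising_edges(values, threshold):
    """Count transitions from at-or-below the threshold to above it.

    A series that starts above the threshold counts as one episode.
    """
    episodes = 0
    above = False
    for x in values:
        now_above = x > threshold
        if now_above and not above:
            episodes += 1
        above = now_above
    return episodes


readings = [110, 150, 185, 190, 170, 182, 160, 65]
tir = time_in_range_percent(readings)         # 4 of 8 readings are in 70-180
episodes = count_rising_edges(readings, 180)  # 150->185 and 170->182
print(tir, episodes)
```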

Add a custom feature:

from glycosignal.registry import DEFAULT_REGISTRY

# my_function is your own callable; define it before registering.
DEFAULT_REGISTRY.register(
    name="my_metric",
    func=my_function,
    description="Custom metric",
    category="variability",
)

Additional Capabilities

Preprocessing

All functions return cleaned copies. Nothing is modified in place.

from glycosignal import preprocessing

df = preprocessing.clean_cgm(df)                    # drop NaN, sort, enforce positive
report = preprocessing.validate_cgm(df)             # structured quality report
gaps = preprocessing.detect_gaps(df)                # DataFrame of gap intervals
df = preprocessing.resample_cgm(df, freq="5min")    # regular grid
df = preprocessing.interpolate_cgm(df, method="pchip", max_gap_points=12)
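
As a rough picture of what gap detection involves, the sketch below flags any pair of consecutive timestamps that are further apart than a tolerance around the expected sampling interval. The `find_gaps` helper, its parameters, and the 1.5× tolerance are assumptions made for illustration; `detect_gaps` may use different rules.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_minutes=5, tolerance=1.5):
    """Return (gap_start, gap_end, minutes) for spacings above the tolerance."""
    limit = timedelta(minutes=expected_minutes * tolerance)
    gaps = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        if curr - prev > limit:
            minutes = (curr - prev).total_seconds() / 60
            gaps.append((prev, curr, minutes))
    return gaps


day = datetime(2024, 1, 15)
stamps = [day.replace(hour=8, minute=m) for m in (0, 5, 10, 40, 45)]
gaps = find_gaps(stamps)
print(gaps)
```

Here the only gap is the 30-minute jump from 08:10 to 08:40; the regular 5-minute steps are not reported.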

Event detection

Each detector returns a DataFrame with the columns start_time, end_time, duration_minutes, and event_type.

from glycosignal import detect

detect.detect_hypoglycemia(df, threshold=70, min_duration_minutes=15)
detect.detect_hyperglycemia(df, threshold=180, min_duration_minutes=15)
detect.detect_nocturnal_events(df, start_hour=0, end_hour=6)
detect.detect_postprandial_excursions(df, rise_threshold=50)
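
The role of `threshold` and `min_duration_minutes` can be illustrated with a minimal sketch that groups consecutive readings below the threshold into runs and keeps only the runs that last long enough. Measuring duration from the first to the last low reading is an assumption of this sketch, and `detect_low_events` is a hypothetical helper, not the library's detector.

```python
from datetime import datetime, timedelta


def detect_low_events(readings, threshold=70, min_duration_minutes=15):
    """Return (start, end, duration_minutes) for sustained runs below the threshold."""
    events = []
    run = []
    # A sentinel entry closes a run that extends to the end of the data.
    for t, g in list(readings) + [(None, None)]:
        if g is not None and g < threshold:
            run.append(t)
            continue
        if run:
            duration = (run[-1] - run[0]).total_seconds() / 60
            if duration >= min_duration_minutes:
                events.append((run[0], run[-1], duration))
            run = []
    return events


t0 = datetime(2024, 1, 15, 8, 0)
values = [90, 68, 65, 66, 64, 80, 69, 85]
series = [(t0 + timedelta(minutes=5 * i), v) for i, v in enumerate(values)]
events = detect_low_events(series)
print(events)
```

The run from 08:05 to 08:20 lasts 15 minutes and is kept, while the single low reading at 08:30 is too short to count as an event.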

Plotting

All functions return (fig, ax) and never call plt.show().

from glycosignal import plotting

fig, ax = plotting.plot_glucose_timeseries(df, subject="P001")
fig, ax = plotting.plot_daily_overlay(df)
fig, ax = plotting.plot_agp(df)
fig, ax = plotting.plot_histogram(df)
fig.savefig("output.png", dpi=150)

Reporting

Generates a self-contained HTML report with summary metrics, TIR, risk indices, and embedded plots. Install the report extra first (pip install "GlycoSignal[report]").

from glycosignal import report

report.generate_summary_report(df, output_path="cgm_report.html")

Command-line interface

After installation, the glycosignal command is available from any terminal:

glycosignal summary data.csv
glycosignal windows data.csv --window-hours 24 --overlap-hours 0 --output windows.csv
glycosignal features windows.csv --output features.csv
glycosignal features windows.csv --features mean_glucose,cv,lbgi,gri
glycosignal report data.csv --output report.html
glycosignal list-features
glycosignal list-features --category risk

License

MIT. Copyright (c) 2024 Jiafeng Song. See LICENSE.
