GlycoSignal

A production-quality Python package for continuous glucose monitor (CGM) data analysis and ML-ready feature extraction.

CGM analysis for Python. Compute any glycemic metric individually, or generate ML-ready feature matrices from sliding windows.

Installation

pip install GlycoSignal

Optional extras:

pip install "GlycoSignal[dev]"     # pytest, black, ruff, mypy
pip install "GlycoSignal[report]"  # HTML reports (adds Jinja2)

Input Data Format

GlycoSignal reads CSV files. Your CSV needs at minimum a timestamp column and a glucose column.

Column	Type	Required	Description
`Timestamp`	datetime	Yes	Reading timestamp (any format `pandas` can parse)
`Glucose`	float	Yes	Glucose value in mg/dL
`subject`	string	For multi-subject files	Subject or patient identifier

Column names are auto-detected. You do not need to rename your columns before loading. GlycoSignal recognizes these common names (case-insensitive):

Timestamp: Timestamp, time, datetime, date_time, date
Glucose: Glucose, Glucose Value (mg/dL), gl, sgv, glucose_mg_dl, bg, blood_glucose
Subject: subject, id, ptid, patient_id, subjectid

If your column names are not recognized, pass them explicitly:

df = glycosignal.load_csv("data.csv", timestamp_col="time_utc", glucose_col="bg_mg_dl")
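The auto-detection described above can be pictured as a case-insensitive lookup against the alias lists. The following is a minimal sketch under that assumption; `ALIASES` and `detect_columns` are hypothetical names for illustration and are not part of the GlycoSignal API.

```python
# Alias lists copied from this README; the matching logic itself is an assumption.
ALIASES = {
    "Timestamp": ["timestamp", "time", "datetime", "date_time", "date"],
    "Glucose": ["glucose", "glucose value (mg/dl)", "gl", "sgv",
                "glucose_mg_dl", "bg", "blood_glucose"],
    "subject": ["subject", "id", "ptid", "patient_id", "subjectid"],
}


def detect_columns(header):
    """Map each canonical name to the first header entry that matches an alias."""
    # Lowercased lookup table so matching ignores case and stray whitespace.
    lowered = {name.strip().lower(): name for name in header}
    found = {}
    for canonical, aliases in ALIASES.items():
        for alias in aliases:
            if alias in lowered:
                found[canonical] = lowered[alias]
                break
    return found


mapping = detect_columns(["Time", "SGV", "PtID"])
print(mapping)  # {'Timestamp': 'Time', 'Glucose': 'SGV', 'subject': 'PtID'}
```

A header such as `time_utc` matches none of the aliases, which is the case where passing `timestamp_col` and `glucose_col` explicitly is needed.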

Example CSV (single subject):

Timestamp,Glucose
2024-01-15 08:00:00,123
2024-01-15 08:05:00,121
2024-01-15 08:10:00,125

Example CSV (multiple subjects in one file):

Timestamp,Glucose,subject
2024-01-15 08:00:00,123,P001
2024-01-15 08:00:00,135,P002
2024-01-15 08:05:00,121,P001
2024-01-15 08:05:00,140,P002

Working with multiple subjects

If your data has multiple subjects in one file, use load_cgm_file and specify which column identifies each subject:

from glycosignal import io

df = io.load_cgm_file("all_subjects.csv", subject_col="ptid")

If each subject is in a separate CSV file inside a folder, use load_cgm_folder. It auto-derives a subject column from each filename:

df = io.load_cgm_folder("data/subjects/")
# Adds 'subject' and 'filename' columns automatically

When building sliding windows or feature maps from multi-subject data, use group_col to process each subject independently:

from glycosignal import windows, features

result = windows.create_sliding_windows(df, window_hours=24, group_col="subject")
X = features.build_feature_map(result.windows)
# X contains rows for all subjects, with a 'subject' column preserved
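To illustrate what processing each subject independently means, here is a standard-library sketch of per-subject windowing. The function name, the tuple layout, and the choice to restart `window_id` at 0 for each subject are assumptions made for this example; `create_sliding_windows` may number or align windows differently.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def sliding_windows(rows, window_hours=24, overlap_hours=0):
    """Yield (subject, window_id, readings) with windows built per subject.

    rows: iterable of (subject, timestamp, glucose) tuples.
    """
    if not 0 <= overlap_hours < window_hours:
        raise ValueError("overlap_hours must be in [0, window_hours)")
    by_subject = defaultdict(list)
    for subject, ts, glucose in rows:
        by_subject[subject].append((ts, glucose))

    width = timedelta(hours=window_hours)
    step = timedelta(hours=window_hours - overlap_hours)
    for subject, readings in by_subject.items():
        readings.sort()
        start, last = readings[0][0], readings[-1][0]
        window_id = 0
        while start <= last:
            # Half-open interval [start, start + width) so no reading is shared
            # between adjacent non-overlapping windows.
            chunk = [(ts, g) for ts, g in readings if start <= ts < start + width]
            if chunk:
                yield subject, window_id, chunk
                window_id += 1
            start += step


t0 = datetime(2024, 1, 15, 8, 0)
rows = [("P001", t0 + timedelta(hours=i), 120.0) for i in range(48)]
rows += [("P002", t0 + timedelta(hours=i), 140.0) for i in range(24)]
summary = [(s, w, len(chunk)) for s, w, chunk in sliding_windows(rows)]
print(summary)  # [('P001', 0, 24), ('P001', 1, 24), ('P002', 0, 24)]
```

Because readings are grouped before windowing, a window never mixes values from two subjects even when their timestamps interleave in the source file.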

Unit conversion

GlycoSignal expects glucose in mg/dL. If your data is in mmol/L, convert first:

from glycosignal import preprocessing

df = preprocessing.convert_units(df, from_unit="mmol/L", to_unit="mg/dL")
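For reference, the arithmetic behind such a conversion is a single multiplication. The sketch below assumes a factor of 18.016 mg/dL per mmol/L (derived from the molar mass of glucose, about 180.16 g/mol); the factor that `convert_units` actually uses is not stated here and some tools round it to 18.

```python
# Assumed factor: molar mass of glucose (about 180.16 g/mol) divided by 10.
MGDL_PER_MMOLL = 18.016


def mmol_to_mgdl(value):
    """Convert a glucose value from mmol/L to mg/dL."""
    return value * MGDL_PER_MMOLL


def mgdl_to_mmol(value):
    """Convert a glucose value from mg/dL to mmol/L."""
    return value / MGDL_PER_MMOLL


print(round(mmol_to_mgdl(5.5), 1))  # 99.1
```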

Device-specific loaders

For Dexcom and FreeStyle Libre exports (which have non-standard headers), use the dedicated loaders:

df = io.load_dexcom("dexcom_export.csv")     # skips Dexcom header row
df = io.load_libre("libre_export.csv")       # skips Libre 2-row header

What you can do in 30 seconds

import glycosignal

df = glycosignal.load_csv("examples/sample_cgm.csv")  # included in repo
df = glycosignal.clean_cgm(df)

# One metric
print(glycosignal.mean_glucose(df))        # 128.4
print(glycosignal.time_in_range_percent(df, low=70, high=180))  # 81.2

# Full feature matrix (20+ features, one row per 24h window)
result = glycosignal.create_sliding_windows(df, window_hours=24)
X = glycosignal.build_feature_map(result.windows)
print(X.shape)  # (7, 23)

Core Workflows

A. Compute glycemic metrics

Every metric is a standalone function. Call them on any cleaned DataFrame or pass a pre-prepared object for efficiency.

from glycosignal import metrics

metrics.mean_glucose(df)                           # 128.4
metrics.cv(df)                                     # 26.6 (%)
metrics.time_in_range_percent(df, low=70, high=180)  # 81.2
metrics.lbgi(df)                                   # 0.8
metrics.mage(df)                                   # 45.3
metrics.gri(df)                                    # 12.4
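Several of these metrics reduce to simple arithmetic on the glucose values. The sketch below works on a plain list rather than a DataFrame and assumes evenly spaced readings, so the fraction of readings in range equals the fraction of time in range. The function bodies are illustrative and are not GlycoSignal's implementation.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sd(values):
    """Population standard deviation (divides by N)."""
    mu = mean(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))


def cv(values):
    """Coefficient of variation in percent."""
    return sd(values) / mean(values) * 100


def j_index(values):
    return 0.001 * (mean(values) + sd(values)) ** 2


def time_in_range_percent(values, low=70, high=180):
    """Percent of readings inside [low, high], bounds inclusive."""
    inside = sum(1 for x in values if low <= x <= high)
    return inside * 100 / len(values)


readings = [60, 100, 140, 180, 220]
print(mean(readings))                   # 140.0
print(time_in_range_percent(readings))  # 60.0
```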

Group them when you need everything at once:

metrics.basic_stats(df)
# {'mean': 128.4, 'median': 126.0, 'min': 62.0, 'max': 248.0, 'q1': 102.0, 'q3': 155.0}

metrics.variability_metrics(df)
# {'sd': 34.2, 'cv': 26.6, 'j_index': 26.7, 'mage': 45.3}

metrics.risk_indices(df)
# {'lbgi': 0.8, 'hbgi': 3.2, 'adrr': 14.1, 'gri': 12.4}

metrics.summary_dict(df)   # all of the above combined

Performance tip: Call prepare() once when computing many features on the same data:

from glycosignal.schemas import prepare

p = prepare(df)
metrics.mean_glucose(p)
metrics.cv(p)
metrics.lbgi(p)

Callable metric functions

These functions are called directly from the metrics module. Some take threshold parameters (for example low and high); all accept a DataFrame or PreparedCGMData as the data argument.

Feature	Description	Computation
Basic stats
`mean_glucose(data)`	Mean BGL	μ = (1/N) Σ Xᵢ
`median_glucose(data)`	Median BGL	Middle value of sorted readings
`min_glucose(data)`	Minimum BGL	Min(X₁, ..., Xₙ)
`max_glucose(data)`	Maximum BGL	Max(X₁, ..., Xₙ)
`q1_glucose(data)`	First quartile of BGL	Q1 = Percentile(X, 25)
`q3_glucose(data)`	Third quartile of BGL	Q3 = Percentile(X, 75)
Variability
`sd(data)`	Standard deviation of BGL	σ = √(Σ(Xᵢ - μ)² / N)
`cv(data)`	Coefficient of variation	CV = (σ / μ) × 100
`j_index(data)`	J-index (glycemic variability)	J = 0.001 × (μ + σ)²
`mage(data)`	Mean Amplitude of Glucose Excursions	Mean of alternating peak-nadir amplitudes exceeding σ
`conga24(data)`	Continuous Overall Net Glycemic Action	SD of {G(t) - G(t - 24h)} for all matched pairs
Time-in-range (parametric)
`time_in_range_minutes(data, low, high)`	BGL time inside [low, high]	TIR = Δt × Σ(low ≤ BGL(t) ≤ high)
`time_in_range_percent(data, low, high)`	Percent time inside [low, high]	TIR% = (TIR / T) × 100
`time_below_range_minutes(data, threshold)`	BGL time below threshold	TBR = Δt × Σ(BGL(t) ≤ threshold)
`time_below_range_percent(data, threshold)`	Percent time below threshold	TBR% = (TBR / T) × 100
`time_above_range_minutes(data, threshold)`	BGL time above threshold	TAR = Δt × Σ(BGL(t) ≥ threshold)
`time_above_range_percent(data, threshold)`	Percent time above threshold	TAR% = (TAR / T) × 100
`time_outside_range_minutes(data, low, high)`	BGL time outside [low, high]	TOR = Δt × Σ(BGL(t) < low or BGL(t) > high)
`time_outside_range_percent(data, low, high)`	Percent time outside [low, high]	TOR% = (TOR / T) × 100
Risk indices
`lbgi(data)`	Low Blood Glucose Index	LBGI = (1/N) Σ rl(Xᵢ); f(X) = (ln X)^1.084 - 5.381; rl = 22.77 × f² if f ≤ 0, else 0
`hbgi(data)`	High Blood Glucose Index	HBGI = (1/N) Σ rh(Xᵢ); rh = 22.77 × f² if f > 0, else 0
`adrr(data)`	Average Daily Risk Range	ADRR = mean over days of [Max(rl) + Max(rh)], each maximum taken within one day
`gri(data)`	Glucose Risk Index	GRI = 3.0×%TBR₅₄ + 2.4×%TBR₇₀ + 1.6×%TAR₂₅₀ + 0.8×%TAR₁₈₀, capped at 100
Excursions
`mean_glucose_excursion(data)`	Mean BGL outside mean ± SD	Mean of Xᵢ where Xᵢ < μ - σ or Xᵢ > μ + σ
`mean_glucose_normal(data)`	Mean BGL inside mean ± SD	Mean of Xᵢ where μ - σ ≤ Xᵢ ≤ μ + σ
Peak counts
`count_peaks(data, threshold)`	Episodes above threshold	Count of rising-edge crossings above threshold
`count_peaks_in_range(data, lower, upper)`	Episodes entering [lower, upper]	Count of rising-edge entries into [lower, upper]

Notation: N = readings, Xᵢ = glucose value, μ = mean, σ = SD, Δt = interval between readings, T = total monitoring time.
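As a worked illustration of the risk-index rows, the sketch below implements the LBGI and HBGI formulas from the table on a plain list of mg/dL values, plus a GRI helper that takes percentages as inputs. The GRI helper follows the weights in the table and assumes non-overlapping bands (below 54, 54 to 69, above 250, 181 to 250 mg/dL); how `gri()` derives those percentages internally is not confirmed here.

```python
import math


def risk_transform(glucose_mgdl):
    """f(X) = (ln X)^1.084 - 5.381; close to zero near 112.5 mg/dL."""
    # Requires glucose > 1 mg/dL so that ln(X) is positive.
    return math.log(glucose_mgdl) ** 1.084 - 5.381


def lbgi(values):
    """Mean low-side risk: 22.77 * f^2 where f <= 0, otherwise 0."""
    risks = [22.77 * f * f for f in map(risk_transform, values) if f <= 0]
    return sum(risks) / len(values)


def hbgi(values):
    """Mean high-side risk: 22.77 * f^2 where f > 0, otherwise 0."""
    risks = [22.77 * f * f for f in map(risk_transform, values) if f > 0]
    return sum(risks) / len(values)


def gri(pct_very_low, pct_low, pct_very_high, pct_high):
    """Weighted sum of band percentages, capped at 100 (bands assumed disjoint)."""
    score = 3.0 * pct_very_low + 2.4 * pct_low + 1.6 * pct_very_high + 0.8 * pct_high
    return min(100.0, score)


print(round(lbgi([50, 120, 200]), 2), round(hbgi([50, 120, 200]), 2))
print(round(gri(1, 3, 5, 20), 1))  # 34.2
```

Readings on the high side contribute nothing to LBGI and readings on the low side contribute nothing to HBGI, which is why the two indices can be read separately.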

Registry feature names

These are the 32 fixed-parameter feature names used in build_feature_map(feature_names=[...]). They are pre-wired to specific clinical thresholds and require no parameters.

glycosignal.list_features()           # all 32 names
glycosignal.list_features(category="time_in_range")  # filtered by category

Feature name	Category	Description
`mean_glucose`	basic_stats	Mean glucose (mg/dL)
`median_glucose`	basic_stats	Median glucose (mg/dL)
`min_glucose`	basic_stats	Minimum glucose (mg/dL)
`max_glucose`	basic_stats	Maximum glucose (mg/dL)
`q1_glucose`	basic_stats	25th percentile glucose (mg/dL)
`q3_glucose`	basic_stats	75th percentile glucose (mg/dL)
`sd`	variability	Standard deviation of glucose (mg/dL)
`cv`	variability	Coefficient of variation (%)
`j_index`	variability	J-index: 0.001 × (mean + SD)²
`mage`	variability	Mean Amplitude of Glucose Excursions (mg/dL)
`conga24`	variability	SD of glucose differences 24h apart (mg/dL)
`tir_70_180_min`	time_in_range	Minutes in target range 70–180 mg/dL
`tir_70_180_pct`	time_in_range	Percent time in target range 70–180 mg/dL
`tir_70_140_min`	time_in_range	Minutes in tight range 70–140 mg/dL
`tir_70_140_pct`	time_in_range	Percent time in tight range 70–140 mg/dL
`tbr_70_min`	time_in_range	Minutes below 70 mg/dL (level 1 hypoglycemia)
`tbr_70_pct`	time_in_range	Percent time below 70 mg/dL
`tbr_54_min`	time_in_range	Minutes below 54 mg/dL (level 2 hypoglycemia)
`tbr_54_pct`	time_in_range	Percent time below 54 mg/dL
`tar_180_min`	time_in_range	Minutes above 180 mg/dL (level 1 hyperglycemia)
`tar_180_pct`	time_in_range	Percent time above 180 mg/dL
`tar_250_min`	time_in_range	Minutes above 250 mg/dL (level 2 hyperglycemia)
`tar_250_pct`	time_in_range	Percent time above 250 mg/dL
`lbgi`	risk	Low Blood Glucose Index
`hbgi`	risk	High Blood Glucose Index
`adrr`	risk	Average Daily Risk Range
`gri`	risk	Glucose Risk Index (Klonoff et al. 2023)
`mean_glucose_excursion`	excursion	Mean of readings outside mean ± 1SD (mg/dL)
`mean_glucose_normal`	excursion	Mean of readings inside mean ± 1SD (mg/dL)
`peaks_above_140`	peak	Excursion episodes above 140 mg/dL (count)
`peaks_above_180`	peak	Excursion episodes above 180 mg/dL (count)
`peaks_above_250`	peak	Severe hyperglycemic episodes above 250 mg/dL (count)
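The peak features count episodes rather than individual readings. A minimal sketch of rising-edge counting follows, assuming that a series which already starts above the threshold counts as one episode; that edge case is a guess, and `count_peaks` in the package may treat it differently.

```python
def count_rising_edges(values, threshold):
    """Count transitions from at-or-below the threshold to above it."""
    count = 0
    above = False  # treat the time before the first reading as not above
    for x in values:
        now_above = x > threshold
        if now_above and not above:
            count += 1
        above = now_above
    return count


trace = [150, 190, 200, 170, 185, 160, 260, 255, 170]
print(count_rising_edges(trace, 180))  # 3
print(count_rising_edges(trace, 250))  # 1
```

Consecutive readings above the threshold belong to the same episode, so the two readings at 190 and 200 count once.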

B. Build ML feature matrices

The recommended pipeline: clean data, create windows, extract features, train.

import glycosignal
from glycosignal import windows, features
from sklearn.ensemble import RandomForestClassifier

df = glycosignal.load_csv("cgm.csv")
df = glycosignal.clean_cgm(df)

result = windows.create_sliding_windows(df, window_hours=24, overlap_hours=0)
X = features.build_feature_map(result.windows)
# X has one row per window, 20+ feature columns

feature_cols = [c for c in X.columns if c not in ("window_id", "subject", "date")]
clf = RandomForestClassifier()
# y: one label per window, supplied by you (GlycoSignal does not create labels)
clf.fit(X[feature_cols], y)

Select a specific subset of features:

X = features.build_feature_map(
    result.windows,
    feature_names=["mean_glucose", "cv", "tir_70_180_pct", "mage", "lbgi"],
)

Build a feature vector for a single window:

features.build_feature_vector(window_df, feature_names=["mean_glucose", "cv"])
# {'mean_glucose': 128.4, 'cv': 26.6}
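Conceptually, a feature vector is a dictionary built by looking up each requested name in a table of functions. The sketch below shows that pattern with a three-entry table; `FEATURES` and `build_vector` are hypothetical stand-ins, not the package's registry or `build_feature_vector`.

```python
import statistics

# Hypothetical name-to-function table standing in for a feature registry.
FEATURES = {
    "mean_glucose": lambda v: statistics.fmean(v),
    "cv": lambda v: statistics.pstdev(v) / statistics.fmean(v) * 100,
    "max_glucose": max,
}


def build_vector(values, feature_names=None):
    """Return {feature name: value} for one window of glucose readings."""
    names = list(FEATURES) if feature_names is None else list(feature_names)
    unknown = [n for n in names if n not in FEATURES]
    if unknown:
        raise KeyError(f"unknown features: {unknown}")
    return {name: FEATURES[name](values) for name in names}


vector = build_vector([100, 120, 140], ["mean_glucose", "cv"])
print(vector["mean_glucose"])  # 120.0
```

Failing on unknown names up front, rather than skipping them, keeps the columns of the resulting feature matrix predictable.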

Build a table from a list of DataFrames (one per subject):

features.build_feature_table(
    [df_s01, df_s02, df_s03],
    record_ids=["S01", "S02", "S03"],
)

Conceptual Model

CSV / DataFrame
    -> io.load_csv() / load_cgm_folder()
    -> preprocessing.standardize_columns() + clean_cgm()
    -> metrics.*()              # individual features, any time
    -> windows.create_sliding_windows()
    -> features.build_feature_map()
    -> ML model

Timestamps and glucose values are the only required columns.
Preprocessing is explicit: nothing is cleaned silently.
Metrics accept a raw DataFrame or a pre-prepared object interchangeably.
Windows output long-format rows with a window_id column.
Feature maps are plain DataFrames, ready for scikit-learn or any other tool.
Registry is inspectable: every feature has a name, description, and category.

Feature System

GlycoSignal has 32 built-in features organized by category. You can call each individually, use grouped helpers, or let the registry compute them in bulk.

import glycosignal

glycosignal.list_features()
# ['adrr', 'conga24', 'cv', 'gri', 'hbgi', 'j_index', 'lbgi', 'mage', ...]

glycosignal.list_features(category="risk")
# ['adrr', 'gri', 'hbgi', 'lbgi']

glycosignal.get_feature_metadata()
# DataFrame: name | description | category | output_type

glycosignal.get_feature("gri").description
# 'Glucose Risk Index (Klonoff et al. 2023)'

To register a custom feature:

from glycosignal.registry import DEFAULT_REGISTRY

DEFAULT_REGISTRY.register(
    name="my_metric",
    func=my_function,
    description="Custom metric",
    category="variability",
)
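The `my_function` placeholder above could be any callable that turns CGM data into a single number. The sketch below assumes the registry passes an object with a `Glucose` column (such as the cleaned DataFrame) and expects a float back; that calling convention is an assumption, and `glucose_range` is a made-up example metric.

```python
def glucose_range(data):
    """Hypothetical custom metric: max minus min glucose in mg/dL."""
    # Works with anything indexable by "Glucose", e.g. a DataFrame or a dict.
    values = [float(v) for v in data["Glucose"]]
    if not values:
        return float("nan")
    return max(values) - min(values)


print(glucose_range({"Glucose": [62, 128, 248]}))  # 186.0
```

A function of this shape could then be passed as `func=glucose_range` in the `register` call.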

Additional Capabilities

Data loading

Loads CSVs with automatic column detection. Supports per-subject folders, multi-subject files, Dexcom, and Libre exports.

from glycosignal import io

df = io.load_csv("data.csv")
df = io.load_csv("data.csv", timestamp_col="time_utc", glucose_col="bg_mg_dl")
df = io.load_cgm_folder("data/subjects/")     # adds filename + subject columns
df = io.load_cgm_file("all.csv", subject_col="ptid")
df = io.load_dexcom("dexcom_export.csv")
df = io.load_libre("libre_export.csv")

Preprocessing

All functions return cleaned copies. Nothing is modified in place.

from glycosignal import preprocessing

df = preprocessing.standardize_columns(df)          # rename "gl", "time", etc.
df = preprocessing.clean_cgm(df)                    # drop NaN, sort, enforce positive
report = preprocessing.validate_cgm(df)             # structured quality report
gaps = preprocessing.detect_gaps(df)                # DataFrame of gap intervals
df = preprocessing.resample_cgm(df, freq="5min")    # regular grid
df = preprocessing.interpolate_cgm(df, method="pchip", max_gap_points=12)
df = preprocessing.convert_units(df, from_unit="mmol/L", to_unit="mg/dL")
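To make the idea of gap intervals concrete, here is a standard-library sketch that flags any spacing between consecutive readings that exceeds the expected sampling interval by a tolerance factor. The parameter names and the 1.5 tolerance are assumptions for this example and are not taken from `detect_gaps`.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected=timedelta(minutes=5), tolerance=1.5):
    """Return (gap_start, gap_end, minutes) for spacings above expected * tolerance."""
    ordered = sorted(timestamps)
    limit = expected * tolerance
    return [
        (a, b, (b - a).total_seconds() / 60)
        for a, b in zip(ordered, ordered[1:])
        if b - a > limit
    ]


t0 = datetime(2024, 1, 15, 8, 0)
stamps = [t0 + timedelta(minutes=m) for m in (0, 5, 10, 40, 45)]
gaps = find_gaps(stamps)
print(len(gaps), gaps[0][2])  # 1 30.0
```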

Event detection

Returns a DataFrame of episodes with start_time, end_time, duration_minutes, and event_type.

from glycosignal import detect

detect.detect_hypoglycemia(df, threshold=70, min_duration_minutes=15)
detect.detect_hyperglycemia(df, threshold=180, min_duration_minutes=15)
detect.detect_nocturnal_events(df, start_hour=0, end_hour=6)
detect.detect_postprandial_excursions(df, rise_threshold=50)
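The episode logic behind a threshold detector can be sketched as a scan for consecutive runs of readings below the threshold, followed by a duration filter. The version below assumes duration is measured from the first to the last below-threshold reading of a run; the package may define duration or episode boundaries differently, and `detect_below` is a hypothetical name.

```python
from datetime import datetime, timedelta


def detect_below(readings, threshold=70, min_duration_minutes=15):
    """Find runs of readings below threshold lasting at least the minimum.

    readings: time-sorted list of (timestamp, glucose) pairs.
    """
    runs = []
    start = end = None
    for ts, glucose in readings:
        if glucose < threshold:
            if start is None:
                start = ts
            end = ts
        elif start is not None:
            runs.append((start, end))
            start = None
    if start is not None:  # run still open at the end of the data
        runs.append((start, end))

    events = []
    for s, e in runs:
        minutes = (e - s).total_seconds() / 60
        if minutes >= min_duration_minutes:
            events.append({"start_time": s, "end_time": e,
                           "duration_minutes": minutes,
                           "event_type": "hypoglycemia"})
    return events


t0 = datetime(2024, 1, 15, 2, 0)
values = [90, 68, 65, 66, 67, 80, 69, 85, 60, 58, 55, 52]
series = [(t0 + timedelta(minutes=5 * i), v) for i, v in enumerate(values)]
events = detect_below(series)
print(len(events), events[0]["duration_minutes"])  # 2 15.0
```

The single reading of 69 is dropped because a one-point run has zero duration, which is the purpose of the minimum-duration filter.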

Plotting

All functions return (fig, ax) and never call plt.show().

from glycosignal import plotting

fig, ax = plotting.plot_glucose_timeseries(df, subject="P001")
fig, ax = plotting.plot_daily_overlay(df)
fig, ax = plotting.plot_agp(df)
fig, ax = plotting.plot_histogram(df)

Reporting

Generates a self-contained HTML report with summary metrics, TIR, risk indices, and embedded plots.

from glycosignal import report

report.generate_summary_report(df, output_path="cgm_report.html")

CLI

glycosignal summary data.csv
glycosignal windows data.csv --window-hours 24 --overlap-hours 0 --output windows.csv
glycosignal features windows.csv --output features.csv
glycosignal features windows.csv --features mean_glucose,cv,lbgi,gri
glycosignal report data.csv --output report.html
glycosignal list-features
glycosignal list-features --category risk

Migration from Script Version

Old script	New location	Key change
`glycosignal.py`	`glycosignal.metrics` + `glycosignal.schemas`	Lowercase names; uppercase aliases (`LBGI`, `MAGE`, `TIR`) still work
`cgm_sliding_window.py` (loaders)	`glycosignal.io`	Column renamed to `Glucose` from `Glucose Value (mg/dL)`
`cgm_sliding_window.py` (windowing)	`glycosignal.windows`	Output is long-format; use `pivot_windows_wide()` for the old format
`cgm_feature_map.py` `create_feature_map()`	`glycosignal.features.build_feature_map_wide()`	Long-format preferred via `build_feature_map()`
`cgm_feature_map.py` `FEATURES` list	`glycosignal.registry.DEFAULT_REGISTRY`	Structured registry with metadata

Metric calls are backward-compatible:

# Old
import glycosignal as gs
gs.mean_glucose(df)

# New (identical result)
import glycosignal
glycosignal.mean_glucose(df)

Feature map migration:

# Old
from cgm_feature_map import create_feature_map
X = create_feature_map(windows_df)

# New (wide-format still works)
from glycosignal.features import build_feature_map_wide
X = build_feature_map_wide(windows_df)

# New preferred (long-format pipeline)
from glycosignal import windows, features
result = windows.create_sliding_windows(df)
X = features.build_feature_map(result.windows)

Folder loading migration:

# Old
from cgm_sliding_window import load_cgm_folder
df = load_cgm_folder("Data/Processed")

# New
from glycosignal import io, preprocessing
df = io.load_cgm_folder("Data/Processed")
df = preprocessing.standardize_columns(df)
df = preprocessing.clean_cgm(df)

Development

git clone https://github.com/glycosignal/glycosignal
cd glycosignal
pip install -e ".[dev]"
pytest

License

Release history

0.2.0

Mar 31, 2026

0.1.3

Mar 27, 2026

0.1.2

Mar 27, 2026

0.1.1

Mar 27, 2026

This version

0.1.0

Mar 27, 2026

