Python SDK for the CHARM time-series foundation model — embeddings, forecasting, and a downstream-task toolkit.

c3-charm

A Python SDK for the CHARM time-series foundation model. Provides embeddings (multivariate time series → vectors), forecast/backcast (quantile predictions), and a toolkit for downstream tasks (anomaly detection, retrieval, classification, reconstruction, forecasting).

What is CHARM?

CHARM (CHannel Aware Representation Model) is a foundation model for multivariate time series. It ingests windows of (T, C) data — T timesteps, C channels — and produces dense embeddings that capture temporal patterns and cross-channel relationships. Channel names (descriptions) are part of the input, making the model channel-aware.

No scaling required — the model handles normalization internally. Send raw data directly.

Installation

pip install c3-charm            # core SDK only (embeddings + forecast)
pip install c3-charm[toolkit]   # includes PyTorch models, datasets, trainers

Or from source:

git clone https://github.com/c3ai/c3-charm.git
cd c3-charm
poetry install                    # core SDK only
poetry install --with toolkit     # include toolkit dependencies

Core SDK

Client initialization

from charm import CharmClient

client = CharmClient(
    base_url="http://your-server:8080",
    api_key="your-api-key",      # or set CHARM_API_KEY env var
    timeout=300,
    max_retries=3,
)

Embeddings — `client.embeddings.create()`

Converts time series windows into dense vectors.

response = client.embeddings.create(
    descriptions=[["sensor_A", "sensor_B"]],  # (N, C) channel names
    ts_array=[[[1.0, 2.0], [1.1, 2.1], ...]],  # (N, T, C) values
    batch_size=32,
    return_tensors="np",       # "list", "np", or "torch"
    aggregate=True,            # True → (N, D); False → (N, T_, C, D)
    progress=True,
)
embeddings = response.embeds  # shape (N, D) when aggregate=True

aggregate parameter:

- True (default): returns flattened embeddings (N, D), one vector per series. Best for retrieval, classification, clustering.
- False: returns per-patch, per-channel embeddings (N, T_, C, D), where T_ = T / patch_size. Best for fine-grained tasks or custom heads.

Async (faster for large datasets):

response = await client.embeddings.async_create(
    descriptions=descriptions,
    ts_array=ts_array,
    max_B_per_request=32,
    concurrency_per_call=8,
    return_tensors="np",
    aggregate=True,
)

Forecast / Backcast — `client.prediction.create()`

Zero-shot quantile predictions — no training required.

response = client.prediction.create(
    descriptions=[["sensor_A", "sensor_B"]],
    ts_array=[[[1.0, 2.0], [1.1, 2.1], ...]],
    target_len=10,       # positive = forecast, negative = backcast
    return_tensors="np",
)
forecast = response.denormalized_predictions  # (N, 10, C, Q) — Q quantiles
median = response.median                      # (N, 10, C) — point forecast

Backcast (reconstruct past values):

response = client.prediction.create(
    descriptions=descriptions,
    ts_array=ts_array,
    target_len=-8,  # reconstruct last 8 steps
    return_tensors="np",
)
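
Quantile forecasts are usually judged with the pinball (quantile) loss and with interval coverage. The README does not list which quantile levels the Q axis holds, so the sketch below assumes levels 0.1, 0.5, and 0.9 and uses toy numbers rather than real model output. The helper names are illustrative and are not part of the SDK.

```python
def pinball_loss(y_true, y_pred, q):
    """Average pinball (quantile) loss for one quantile level q in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def interval_coverage(y_true, lower, upper):
    """Fraction of actual values that fall inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return inside / len(y_true)


# Toy values for one channel over three forecast steps (assumed, not model output).
actual = [10.0, 12.0, 11.0]
q10 = [8.0, 10.0, 10.0]
q50 = [10.0, 11.0, 13.0]
q90 = [12.0, 13.0, 14.0]

median_loss = pinball_loss(actual, q50, q=0.5)
coverage = interval_coverage(actual, q10, q90)
print(median_loss, coverage)
```

With real responses, the same functions would be applied per channel to slices of `response.denormalized_predictions`, once the quantile ordering along the last axis is known.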

Input constraints

Constraint	Limit
Timesteps per series	1 ≤ T < 1500
Channels per series	C < 1500
Per-request size	N × C × T ≤ 500,000
Batch consistency	All series in a request must share the same T and C
Minimum for good embeddings	T ≥ 32 (model patch size)

The SDK handles client-side batching automatically when you set batch_size (sync) or max_B_per_request (async).
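
To see how the limits in the table interact, the following sketch computes the largest N that fits in one request for a given T and C, and the request boundaries for a larger dataset. The constants come from the table above; the function names are hypothetical, and the SDK's own batching logic may differ.

```python
MAX_T = 1500             # timesteps per series: 1 <= T < 1500
MAX_C = 1500             # channels per series: C < 1500
MAX_ELEMENTS = 500_000   # per-request size: N * C * T <= 500,000


def max_series_per_request(t, c):
    """Largest N such that N * C * T stays within the per-request limit."""
    if not 1 <= t < MAX_T:
        raise ValueError(f"T must satisfy 1 <= T < {MAX_T}, got {t}")
    if not 1 <= c < MAX_C:
        raise ValueError(f"C must satisfy 1 <= C < {MAX_C}, got {c}")
    return MAX_ELEMENTS // (c * t)


def split_into_requests(n, t, c):
    """Return (start, stop) index pairs covering n series in request-sized chunks."""
    per_request = max_series_per_request(t, c)
    if per_request == 0:
        raise ValueError("a single (T, C) series already exceeds the per-request limit")
    return [(start, min(start + per_request, n)) for start in range(0, n, per_request)]


print(max_series_per_request(256, 8))    # 244
print(split_into_requests(500, 256, 8))  # [(0, 244), (244, 488), (488, 500)]
```

A value such as 244 is a reasonable upper bound for `batch_size` or `max_B_per_request` when every window is 256 timesteps by 8 channels.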

Output shapes

Method	Output field	Shape
`embeddings.create(aggregate=True)`	`response.embeds`	(N, D)
`embeddings.create(aggregate=False)`	`response.embeds`	(N, T_, C, D)
`prediction.create(target_len > 0)`	`response.denormalized_predictions`	(N, target_len, C, Q)
`prediction.create(target_len < 0)`	`response.denormalized_predictions`	(N, abs(target_len), C, Q)
`prediction.create(...)`	`response.median`	(N, abs(target_len), C)
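
The table can be turned into small shape helpers that are handy in assertions around API calls. This is only a sketch: the embedding dimension D and the number of quantiles Q are not stated in this README, so they are passed in as assumed values, and the non-aggregated case assumes T is a multiple of the patch size (32).

```python
def embeds_shape(n, t, c, d, aggregate=True, patch_size=32):
    """Expected shape of response.embeds according to the table above."""
    if aggregate:
        return (n, d)
    if t % patch_size != 0:
        raise ValueError("this sketch assumes T is a multiple of patch_size")
    return (n, t // patch_size, c, d)


def predictions_shape(n, c, q, target_len):
    """Expected shape of response.denormalized_predictions."""
    if target_len == 0:
        raise ValueError("target_len must be non-zero")
    # Forecast (positive) and backcast (negative) share the same layout.
    return (n, abs(target_len), c, q)


def median_shape(n, c, target_len):
    """Expected shape of response.median."""
    return predictions_shape(n, c, 1, target_len)[:3]


# D=256 and Q=9 are placeholder values chosen only for illustration.
print(embeds_shape(4, 128, 3, d=256))                   # (4, 256)
print(embeds_shape(4, 128, 3, d=256, aggregate=False))  # (4, 4, 3, 256)
print(predictions_shape(4, 3, q=9, target_len=-8))      # (4, 8, 3, 9)
print(median_shape(4, 3, target_len=10))                # (4, 10, 3)
```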

Channel descriptions

Descriptions are required and affect embedding quality. They tell the model what each channel represents.

Good descriptions — use meaningful, consistent names:

descriptions = [["engine_temperature", "oil_pressure", "rpm"]]

Acceptable — short but informative:

descriptions = [["temp", "pressure", "speed"]]

Avoid — generic or positional names reduce model effectiveness:

descriptions = [["col_0", "col_1", "col_2"]]  # works but suboptimal

When working with pandas DataFrames, use column names directly:

descriptions = [df.columns.tolist()] * N

Scaling

No pre-processing needed. CHARM normalizes internally. Send raw data as-is. Do not apply StandardScaler, MinMaxScaler, or log transforms before calling the API.

Error handling

from charm import CharmError, AuthenticationError, InvalidRequestError, RateLimitError

try:
    response = client.embeddings.create(...)
except AuthenticationError:
    # bad API key: retrying will not help, so re-raise
    raise
except InvalidRequestError as e:
    # shape violations, empty input
    print(f"Invalid request: {e}")
except RateLimitError:
    # back off and retry
    print("Rate limited; wait before retrying")
except CharmError as e:
    # catch-all for other SDK errors
    print(f"CHARM SDK error: {e}")
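
The "back off and retry" branch can be written as a small wrapper with exponential backoff. The sketch below is an assumption about how an application might do this on top of the client's own `max_retries`; it defines a local stand-in exception so it runs without the SDK installed. In real code, `charm.RateLimitError` would be passed as `retry_on`.

```python
import random
import time


class RateLimitError(Exception):
    """Local stand-in for charm.RateLimitError so the sketch is self-contained."""


def call_with_backoff(fn, retry_on=(RateLimitError,), max_attempts=5,
                      base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a retryable error wait 1x, 2x, 4x ... base_delay, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            jitter = 1.0 + 0.1 * random.random()  # up to 10% extra wait
            sleep(base_delay * (2 ** attempt) * jitter)


# Demo with a fake call that is rate limited twice before succeeding.
calls = {"count": 0}
delays = []


def flaky_call():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise RateLimitError("busy")
    return "ok"


result = call_with_backoff(flaky_call, sleep=delays.append)
print(result, len(delays))
```

Usage with the SDK would look like `call_with_backoff(lambda: client.embeddings.create(...), retry_on=(RateLimitError,))`, with the import taken from `charm` instead of the stand-in class.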

Toolkit — Downstream Tasks

The toolkit (pip install c3-charm[toolkit]) provides PyTorch models, dataset utilities, and training infrastructure for fine-tuning on top of CHARM embeddings.

Retrieval — `charm_toolkit.retrieval`

Find similar time series by embedding similarity.

from charm_toolkit.retrieval import (
    l2_normalize,
    cosine_similarity_matrix,
    knn_search,
    retrieval_metrics,
)

# Embed your data
response = client.embeddings.create(
    descriptions=descriptions,
    ts_array=windows_list,
    return_tensors="np",
)
embeddings = response.embeds  # (N, D)

# Similarity search
sim = cosine_similarity_matrix(embeddings, embeddings)

# kNN search
indices, scores = knn_search(query_emb, corpus_emb, k=5)

# Evaluation metrics
metrics = retrieval_metrics(
    query_emb=query_emb,
    corpus_emb=corpus_emb,
    query_labels=query_labels,
    corpus_labels=corpus_labels,
    k_values=[1, 3, 5, 10],
    exclude_self=True,
    query_ids=query_dataset_names,
    corpus_ids=corpus_dataset_names,
)
# Returns: precision@k, ndcg@k, hit_rate@k
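
For intuition about what the retrieval helpers compute, here is a dependency-free sketch of cosine top-k search and precision@k on toy 2-D vectors. It is not the toolkit's implementation, and the function names are deliberately different from the `charm_toolkit.retrieval` ones.

```python
import math


def unit(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def top_k_by_cosine(query, corpus, k):
    """Return [(corpus_index, cosine_similarity)] for the k most similar vectors."""
    q = unit(query)
    scored = []
    for idx, vec in enumerate(corpus):
        c = unit(vec)
        scored.append((sum(a * b for a, b in zip(q, c)), idx))
    # Highest similarity first; ties broken by lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(idx, score) for score, idx in scored[:k]]


def precision_at_k(retrieved_labels, query_label, k):
    """Share of the top-k retrieved items whose label matches the query label."""
    return sum(1 for lab in retrieved_labels[:k] if lab == query_label) / k


# Toy embeddings and labels (assumed data, D=2 for readability).
corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
labels = ["a", "b", "a", "b"]
hits = top_k_by_cosine([1.0, 0.1], corpus, k=2)
hit_indices = [idx for idx, _ in hits]
p_at_2 = precision_at_k([labels[i] for i in hit_indices], "a", k=2)
print(hit_indices, p_at_2)
```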

Anomaly Detection — `charm_toolkit.anomaly_detection`

Detect anomalies via kNN distance scoring on windowed CHARM embeddings.

from charm_toolkit.anomaly_detection import (
    sliding_window_embeddings,
    knn_anomaly_scores,
    window_scores_to_pointwise,
)

# 1. Embed sliding windows
train_emb = sliding_window_embeddings(
    client, train_data, descriptions,
    window_size=128, stride=1, batch_size=64,
)
test_emb = sliding_window_embeddings(
    client, test_data, descriptions,
    window_size=128, stride=1, batch_size=64,
)

# 2. Score test windows by distance to train
window_scores = knn_anomaly_scores(
    test_emb=test_emb,
    reference_emb=train_emb,
    k=5,
    distance="cosine",    # "cosine", "l2", "l1"
    aggregation="mean",   # "mean", "max"
)

# 3. Aggregate to per-timestep scores
pointwise_scores = window_scores_to_pointwise(
    window_scores=window_scores,
    window_size=128,
    stride=1,
    total_length=len(test_data),
    method="mean",  # "mean", "max", "last", "center"
)
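
Step 2 above scores each test window by its distance to the nearest training windows. A minimal pure-Python sketch of that idea, assuming cosine distance and non-zero embeddings, is shown here; it illustrates the technique and is not the body of `knn_anomaly_scores`.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn_distance_scores(test_emb, reference_emb, k=5, aggregation="mean"):
    """Score each test vector by its distance to the k nearest reference vectors."""
    scores = []
    for vec in test_emb:
        nearest = sorted(cosine_distance(vec, ref) for ref in reference_emb)[:k]
        if aggregation == "max":
            scores.append(max(nearest))
        else:
            scores.append(sum(nearest) / len(nearest))
    return scores


# Toy data: the reference set points along the x-axis; the second test
# vector points along the y-axis and should look anomalous.
reference = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0]]
test = [[1.0, 0.05], [0.0, 1.0]]
scores = knn_distance_scores(test, reference, k=2)
print(scores)
```

A higher score means the window sits farther from anything seen in the reference data.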

Pointwise aggregation methods:

Each timestep is covered by multiple overlapping windows. The method parameter controls how to assign a single score per timestep:

Method	Behavior	Use case
`"mean"`	Average of all windows covering the point	Smooth, best for offline evaluation
`"max"`	Max score among covering windows	Conservative, catches isolated spikes
`"last"`	Score of the most recently completed window	Online/streaming — score only updates when a window finishes processing
`"center"`	Score of the window centered on each point	Minimal time-shift, tightest temporal alignment
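
The "mean" and "max" rows can be sketched in a few lines, assuming window i covers timesteps i * stride through i * stride + window_size - 1. The "last" and "center" methods depend on alignment conventions that this README does not spell out, so they are left out of the sketch. The function name is hypothetical.

```python
def windows_to_pointwise(window_scores, window_size, stride, total_length, method="mean"):
    """Spread per-window scores onto timesteps and reduce overlaps with mean or max."""
    covering = [[] for _ in range(total_length)]
    for i, score in enumerate(window_scores):
        start = i * stride
        for t in range(start, min(start + window_size, total_length)):
            covering[t].append(score)
    reducers = {"mean": lambda s: sum(s) / len(s), "max": max}
    reduce = reducers[method]
    # Timesteps covered by no window get None instead of a score.
    return [reduce(s) if s else None for s in covering]


# Three windows of length 2 with stride 1 over a series of length 4.
window_scores = [0.0, 1.0, 0.0]
mean_scores = windows_to_pointwise(window_scores, 2, 1, 4, method="mean")
max_scores = windows_to_pointwise(window_scores, 2, 1, 4, method="max")
print(mean_scores)  # [0.0, 0.5, 0.5, 0.0]
print(max_scores)   # [0.0, 1.0, 1.0, 0.0]
```

The toy output shows the trade-off in the table: the mean spreads a single high window score over its neighbours, while the max keeps the spike at full height.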

ReconstructionModel — anomaly detection via learned head

from charm_toolkit import (
    ReconstructionModel, create_reconstruction_datasets,
    collator, TrainerClass,
)
from torch.utils.data import DataLoader
import torch.nn as nn

train_ds, val_ds, test_ds = create_reconstruction_datasets(
    raw_data,           # (T, C) numpy array or torch tensor
    descriptions=channel_names,
    window_size=256,
    stride=1,
    train_ratio=0.7,
    val_ratio=0.15,
    sequential=True,
    scale=True,
)

model = ReconstructionModel(
    embedding_client=client,
    reconstructor="linear",  # "linear", "mlp", or custom nn.Module
    hidden_dim=128,
    dropout=0.1,
)

trainer = TrainerClass(
    model=model,
    train_loader=DataLoader(train_ds, batch_size=512, collate_fn=collator),
    val_loader=DataLoader(val_ds, batch_size=512, collate_fn=collator),
    epochs=1000,
    patience=5,
    lr=1e-3,
    criterion=nn.HuberLoss(),
)
trainer.fit()

ForecastingModel — embedding-based forecasting

from charm_toolkit import ForecastingModel, create_forecasting_datasets, collator, TrainerClass
from torch.utils.data import DataLoader

train_ds, val_ds, test_ds = create_forecasting_datasets(
    raw_data,
    descriptions=channel_names,
    train_horizon=96,
    test_horizon=96,
    train_ratio=0.7,
    val_ratio=0.15,
    sequential=True,
    scale=True,
)

model = ForecastingModel(
    embedding_client=client,
    horizon=96,
    input_size=96,
    head="linear",
    hidden_dim=128,
    mode="last",         # "last", "avg", "none"
    per_channel=True,
    num_channels=len(channel_names),
)

trainer = TrainerClass(
    model=model,
    train_loader=DataLoader(train_ds, batch_size=512, collate_fn=collator),
    val_loader=DataLoader(val_ds, batch_size=512, collate_fn=collator),
    epochs=1000,
    patience=10,
    lr=1e-2,
)
trainer.fit()

ClassificationModel — time series classification

from charm_toolkit import ClassificationModel, create_classification_datasets, collator, TrainerClass
from torch.utils.data import DataLoader
import torch.nn as nn

train_ds, val_ds, test_ds = create_classification_datasets(
    raw_data,          # (N, T, C)
    labels=labels,     # list of N integer labels
    descriptions=channel_names,
    train_ratio=0.7,
    val_ratio=0.15,
)

model = ClassificationModel(
    embedding_client=client,
    num_classes=num_classes,
    hidden_dim=128,
    pooling_over_t="mean",
    pooling_over_channels="mean",
    classifier_type="mlp",
)

trainer = TrainerClass(
    model=model,
    train_loader=DataLoader(train_ds, batch_size=32, collate_fn=collator),
    val_loader=DataLoader(val_ds, batch_size=32, collate_fn=collator),
    epochs=100,
    patience=10,
    lr=1e-3,
    criterion=nn.CrossEntropyLoss(),
)
trainer.fit()

Precomputing embeddings (critical for training)

Toolkit models call the API on every forward pass, so each epoch re-embeds every window. When training over hundreds of windows per epoch, precompute the embeddings once:

from torch.utils.data import DataLoader
from charm_toolkit import precompute_dataset_embeddings, PrecomputedEmbeddingsDataset, collator

# Compute once, save to disk as memmap
train_shape = precompute_dataset_embeddings(
    client=client, dataset=train_ds,
    output_path="./outputs/train_embeddings.pt", memory_batch_size=8192
)
val_shape = precompute_dataset_embeddings(
    client=client, dataset=val_ds,
    output_path="./outputs/val_embeddings.pt", memory_batch_size=8192
)

# Wrap datasets — model skips API calls when "embeds" key present
train_ds = PrecomputedEmbeddingsDataset(train_ds, "./outputs/train_embeddings.pt", train_shape)
val_ds = PrecomputedEmbeddingsDataset(val_ds, "./outputs/val_embeddings.pt", val_shape)

# Training now uses cached embeddings — orders of magnitude faster
train_loader = DataLoader(train_ds, batch_size=512, shuffle=True, collate_fn=collator)
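
The caching idea can be illustrated without the toolkit: embed every sample once, then serve the stored vector under an "embeds" key on each access. The class below is a simplified stand-in, not `PrecomputedEmbeddingsDataset`; the "ts" key and the fake embedding function are assumptions made for the demo.

```python
class CachedEmbeddings:
    """Wrap a list of sample dicts and attach an embedding computed once per sample."""

    def __init__(self, samples, embed_fn):
        self.samples = samples
        # Single pass over the data; a real pipeline would batch these calls.
        self.embeds = [embed_fn(sample["ts"]) for sample in samples]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        # Return a copy of the sample with the cached embedding added.
        return {**self.samples[index], "embeds": self.embeds[index]}


api_calls = {"count": 0}


def fake_embed(ts):
    """Stand-in for an API call: counts invocations and returns a tiny 'embedding'."""
    api_calls["count"] += 1
    return [sum(ts) / len(ts)]


samples = [{"ts": [1.0, 2.0, 3.0]}, {"ts": [4.0, 5.0, 6.0]}]
cached = CachedEmbeddings(samples, fake_embed)

# Two simulated epochs: every sample is read twice, but embedded only once.
for _ in range(2):
    for i in range(len(cached)):
        item = cached[i]

print(api_calls["count"], cached[1]["embeds"])
```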

Trainer API

from charm_toolkit import TrainerClass

trainer = TrainerClass(
    model=model,
    train_loader=train_loader,
    val_loader=val_loader,
    test_loader=test_loader,     # optional
    lr=1e-3,
    weight_decay=1e-4,
    epochs=1000,
    patience=5,
    min_delta=1e-4,
    max_grad_norm=5.0,
    criterion=None,              # defaults to MSELoss
)
trainer.fit()
test_loss = trainer.evaluate(test_loader)
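
The `patience` and `min_delta` arguments follow the usual early-stopping pattern: training stops once the validation loss has failed to improve by at least `min_delta` for `patience` consecutive epochs. The sketch below shows that conventional rule on a list of losses. Whether `TrainerClass` applies exactly this rule is an assumption.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=1e-4):
    """Return the 0-based epoch at which training would stop, or None if it never does."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Meaningful improvement: record it and reset the counter.
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return None


# The drop from 0.8 to 0.79995 is smaller than min_delta, so it does not count.
losses = [1.0, 0.8, 0.79995, 0.81, 0.8, 0.9]
stop_at = early_stopping_epoch(losses, patience=3, min_delta=1e-4)
print(stop_at)  # 4
```

With `epochs=1000` and a small `patience`, this rule is what usually ends training in practice, so the epoch count acts as an upper bound.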

Dataset factory functions

All return (train_dataset, val_dataset, test_dataset):

Function	Input shape	Key args
`create_reconstruction_datasets(raw_data, ...)`	(T, C)	`window_size`, `stride`, `train_ratio`, `val_ratio`
`create_forecasting_datasets(raw_data, ...)`	(T, C)	`train_horizon`, `test_horizon`, `stride`, `train_ratio`, `val_ratio`
`create_classification_datasets(raw_data, labels, ...)`	(N, T, C)	`train_ratio`, `val_ratio`

Reconstruction and forecasting expect a single long time series (T, C) split temporally. Classification expects pre-windowed (N, T, C).
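
A temporal split followed by sliding windows can be sketched as below, to show how `train_ratio`, `val_ratio`, `window_size`, and `stride` determine the number of samples. The boundary arithmetic is an assumption; the factory functions may round or pad differently.

```python
def temporal_split(total_length, train_ratio=0.7, val_ratio=0.15):
    """Split [0, total_length) into contiguous train/val/test index ranges."""
    train_end = round(total_length * train_ratio)
    val_end = train_end + round(total_length * val_ratio)
    return (0, train_end), (train_end, val_end), (val_end, total_length)


def window_starts(segment, window_size, stride=1):
    """Start indices of all full windows that fit inside one segment."""
    start, end = segment
    return list(range(start, end - window_size + 1, stride))


train, val, test = temporal_split(1000)
print(train, val, test)  # (0, 700) (700, 850) (850, 1000)

# Windows never cross a split boundary, so short segments yield few windows.
n_train = len(window_starts(train, window_size=96))
n_val = len(window_starts(val, window_size=96))
n_val_large = len(window_starts(val, window_size=256))
print(n_train, n_val, n_val_large)  # 605 55 0
```

The last number is the practical point: with 1,000 timesteps and `window_size=256`, a 15% validation segment is shorter than one window, so a longer series or a smaller window would be needed.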

collator

All DataLoaders using toolkit datasets require collator as the collate_fn:

from charm_toolkit import collator
# or equivalently:
from charm_toolkit.Datasets import collator

Embeddings as features

CHARM embeddings work as drop-in feature vectors for any sklearn model:

import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.linear_model import LogisticRegression
from charm_toolkit.retrieval import cosine_similarity_matrix

response = client.embeddings.create(
    descriptions=descriptions,
    ts_array=windows_list,
    return_tensors="np",
)
X = response.embeds  # (N, D)

# Anomaly detection with isolation forest
clf = IsolationForest(contamination=0.05)
anomaly_labels = clf.fit_predict(X)

# Similarity search
sim = cosine_similarity_matrix(X, X)

# As features for any classifier (X_train, y_train: a labeled subset of X and its labels)
clf = LogisticRegression().fit(X_train, y_train)

Local Deployment

Deploy models locally from GitHub releases — no remote server needed:

with CharmClient(tag="experiment-2026-03-15_10-30-00") as client:
    response = client.embeddings.create(...)
# Server shuts down automatically

When tag is provided:

1. Checks for GPU availability (falls back to CPU)
2. Clones the repo at the specified tag (shallow clone)
3. Downloads model weights from the GitHub release
4. Launches the serving stack locally
5. Polls the health endpoint until ready

Files cached at ~/.charm/models/<tag>/ for fast subsequent runs.
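
The final startup step, polling until the server is ready, follows a common pattern that can be sketched generically. The health check is passed in as a callable because the endpoint path is not documented here; the function name and defaults are hypothetical and do not describe the client's internals.

```python
import time


def wait_until_ready(is_healthy, timeout=60.0, interval=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Call is_healthy() every `interval` seconds until it returns True or time runs out."""
    deadline = clock() + timeout
    while True:
        if is_healthy():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Demo with a fake server that becomes healthy on the third check.
checks = {"count": 0}


def fake_health_check():
    checks["count"] += 1
    return checks["count"] >= 3


ready = wait_until_ready(fake_health_check, timeout=5.0, sleep=lambda seconds: None)
print(ready, checks["count"])
```

Using a monotonic clock for the deadline avoids surprises if the system clock is adjusted while the server is starting.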

CharmClient(
    tag="experiment-tag",           # required for local mode
    repo_url="https://...",         # default: c3-e/research
    cache_dir="/path/to/cache",     # default: ~/.charm/models
    port=8080,                      # 0 = auto-select
)

Decision guide

When to use CHARM

- Multivariate time series (multiple channels measured over time)
- Each window has at least ~32 timesteps (model patch size)
- You want a strong starting point without feature engineering

When to use classical methods instead

- Tabular data without a time dimension: use LightGBM or XGBoost
- Very short series (< 10 timesteps)
- Single-channel (univariate) series: CHARM still works but may not outperform ARIMA/ETS

Zero-shot vs fine-tuned

Approach	When	Effort
`prediction.create(target_len=H)`	Quick forecast baseline, no labeled data	None — one API call
Embeddings + sklearn	Moderate data, combine with other features	Minutes
Embeddings + kNN (retrieval/AD)	Unlabeled anomaly detection or search	Minutes
Toolkit model (Reconstruction/Forecasting/Classification)	Have labeled data, want best performance	Train a small head (~minutes on CPU)

Testing

pip install pytest
python -m pytest tests/
python -m pytest tests/test_utils.py -v

Documentation

The full API reference and usage guide is this README — it renders on the PyPI page.

License

Apache License 2.0 — see LICENSE.

Release history

Version	Release date
0.1.3 (this version)	May 12, 2026
0.1.2	May 5, 2026
0.1.1	May 1, 2026
0.1.0	Mar 4, 2026

Download files

Source Distribution

c3_charm-0.1.3.tar.gz (3.3 MB)

Uploaded May 12, 2026 Source

Built Distribution

c3_charm-0.1.3-py3-none-any.whl (3.3 MB)

Uploaded May 12, 2026 Python 3

File details

Details for the file c3_charm-0.1.3.tar.gz.

File metadata

Download URL: c3_charm-0.1.3.tar.gz
Upload date: May 12, 2026
Size: 3.3 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for c3_charm-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`fcd4749d463fe13a75ed1dae00a9f2a463dcda5453fa9e46863af20ac163e16f`
MD5	`2020a8ad4ea047d5fb857187b3459f59`
BLAKE2b-256	`b9f8f9158fab0b968aac30e235c4af1653141fda9c2caa992c1f9f6d3fb86ca4`

Provenance

The following attestation bundles were made for c3_charm-0.1.3.tar.gz:

Publisher: charm-publish.yml on c3-e/research

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: c3_charm-0.1.3.tar.gz
- Subject digest: fcd4749d463fe13a75ed1dae00a9f2a463dcda5453fa9e46863af20ac163e16f
- Sigstore transparency entry: 1520876273
- Sigstore integration time: May 12, 2026
Source repository:
- Permalink: c3-e/research@3a4d8e1e50810062d9f7a17544ea1a7b119b9656
- Branch / Tag: refs/heads/develop
- Owner: https://github.com/c3-e
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: charm-publish.yml@3a4d8e1e50810062d9f7a17544ea1a7b119b9656
- Trigger Event: push

File details

Details for the file c3_charm-0.1.3-py3-none-any.whl.

File metadata

Download URL: c3_charm-0.1.3-py3-none-any.whl
Upload date: May 12, 2026
Size: 3.3 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for c3_charm-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0963051ea73434350e6cb51aed5af55b37c253bcd8e7891436728fb6ad50cf2e`
MD5	`b732e83e5829ee4e8445cd3f872f44ef`
BLAKE2b-256	`7ba6b9f9d4c44edf893a2d337e4b1a8eaca0dd664d4ab51579dd2569216a8350`

Provenance

The following attestation bundles were made for c3_charm-0.1.3-py3-none-any.whl:

Publisher: charm-publish.yml on c3-e/research

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: c3_charm-0.1.3-py3-none-any.whl
- Subject digest: 0963051ea73434350e6cb51aed5af55b37c253bcd8e7891436728fb6ad50cf2e
- Sigstore transparency entry: 1520876275
- Sigstore integration time: May 12, 2026
Source repository:
- Permalink: c3-e/research@3a4d8e1e50810062d9f7a17544ea1a7b119b9656
- Branch / Tag: refs/heads/develop
- Owner: https://github.com/c3-e
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: charm-publish.yml@3a4d8e1e50810062d9f7a17544ea1a7b119b9656
- Trigger Event: push
