Python SDK for the CHARM time-series foundation model — embeddings, forecasting, and a downstream-task toolkit.
c3-charm
A Python SDK for interacting with the CHARM time-series API. It provides a simple interface for embeddings (multivariate time series → vectors) and forecast/backcast (predict future or reconstruct past steps).
What is CHARM?
CHARM (CHannel Aware Representation Model) is a foundation model designed for multivariate time series data. It generates embeddings that capture the semantic content of time series segments, which makes them suitable for downstream applications such as:
- Anomaly detection: Identify unusual patterns in time series data
- Clustering: Group similar time series together
- Classification: Categorize time series into predefined classes
- Forecasting: Improve time series predictions
- Similarity search: Find similar patterns across large datasets
Data shapes and API reference
Use this section to shape your inputs and interpret outputs. The API has two endpoints; both expect the same time series format.
Shared input format (embeddings and forecast/backcast)
Every request uses:
- descriptions: List of channel names per time series.
  - Type: list[list[str]].
  - Shape: (N, C), i.e. N samples, each with C channel names.
  - Example: [["engine", "temperature"], ["fan", "speed"]] for N=2, C=2.
- ts_array: List of time series values (one per sample).
  - Type: list[list[list[float]]].
  - Shape: (N, T, C), i.e. N samples, each of T timesteps × C channels.
  - All samples in a single request must have the same T and the same C.
  - Example: one sample with 10 timesteps and 2 channels → a list of 10 rows, each row a list of 2 floats.
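Sensor data often arrives as one list per channel rather than one row per timestep. The sketch below shows one way to turn such column-oriented data into a single sample in the (T, C) row layout described above. The helper name `columns_to_sample` is hypothetical and not part of the SDK.

```python
from typing import Dict, List, Sequence, Tuple


def columns_to_sample(
    columns: Dict[str, Sequence[float]],
) -> Tuple[List[str], List[List[float]]]:
    """Convert {channel_name: values} into (channel_names, rows of shape T x C).

    Hypothetical helper for illustration; it is not provided by c3-charm.
    """
    names = list(columns)
    lengths = {len(columns[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("all channels must have the same number of timesteps")
    # zip(*) transposes C columns of length T into T rows of length C.
    rows = [[float(v) for v in row] for row in zip(*(columns[n] for n in names))]
    return names, rows


names, rows = columns_to_sample(
    {"engine": [0.1, 0.3, 0.5], "temperature": [0.2, 0.4, 0.6]}
)
descriptions = [names]  # shape (N=1, C=2)
ts_array = [rows]       # shape (N=1, T=3, C=2)
```

Wrapping the result in one more list gives the batch dimension N that both endpoints expect.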
Conventions:
- N = batch size (number of time series in the call).
- T = timesteps per series (same for all). Must be ≥ 1 and < 1500 (SDK enforces T < 1500).
- C = channels per series (same for all). Must be < 1500.
- N × C × T ≤ 500,000 per request (client may split into multiple requests via batching).
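As a rough illustration, the conventions above can be checked locally before a request is sent. This is a minimal sketch, not the SDK's own validation: the function name `validate_request` and the constant names are hypothetical, and the lower bound C ≥ 1 is an assumption (the text above only states C < 1500).

```python
from typing import List, Tuple

MAX_T = 1500            # T must be < 1500
MAX_C = 1500            # C must be < 1500
MAX_ELEMENTS = 500_000  # N x C x T budget per request


def validate_request(
    descriptions: List[List[str]],
    ts_array: List[List[List[float]]],
) -> Tuple[int, int, int]:
    """Return (N, T, C) if the inputs follow the documented conventions."""
    n = len(ts_array)
    if n == 0 or len(descriptions) != n:
        raise ValueError("descriptions and ts_array must have the same non-zero length N")
    t = len(ts_array[0])
    c = len(descriptions[0])
    if not 1 <= t < MAX_T:
        raise ValueError(f"T must satisfy 1 <= T < {MAX_T}, got {t}")
    if not 1 <= c < MAX_C:  # C >= 1 is an assumption
        raise ValueError(f"C must satisfy 1 <= C < {MAX_C}, got {c}")
    for names, series in zip(descriptions, ts_array):
        if len(names) != c or len(series) != t:
            raise ValueError("all samples must share the same T and C")
        if any(len(row) != c for row in series):
            raise ValueError("every timestep must contain exactly C values")
    if n * c * t > MAX_ELEMENTS:
        raise ValueError(f"N x C x T = {n * c * t} exceeds {MAX_ELEMENTS}")
    return n, t, c


shape = validate_request([["a", "b"]], [[[1.0, 2.0], [3.0, 4.0]]])
```

Failing early on the client side gives a clearer error than a rejected request, at the cost of one pass over the nested lists.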
1. Embeddings — client.embeddings.create / client.embeddings.async_create
Endpoint: POST {base_url}/predict
| Field | Description |
|---|---|
| Input | descriptions (N×C), ts_array (N×T×C). See shared format above. |
| Output | response.embeds: one vector per time series. Shape (N, D) where D = embedding dimension (model-dependent). |
| Return type | EmbeddingsResponse: .embeds, .model, .usage, .raw. |
Use return_tensors="list", "np", or "torch" to get lists, a NumPy array, or a PyTorch tensor.
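Since the output is one vector per series, a common next step is to compare vectors, for example for similarity search. The sketch below uses plain Python lists and made-up numbers standing in for `response.embeds`. The helpers `cosine_similarity` and `most_similar` are hypothetical and not part of the SDK.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: Sequence[float], embeds: List[Sequence[float]]) -> int:
    """Index of the embedding with the highest cosine similarity to query."""
    return max(range(len(embeds)), key=lambda i: cosine_similarity(query, embeds[i]))


# Made-up (N=3, D=3) vectors standing in for response.embeds.
embeds = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
best = most_similar([0.8, 0.2, 0.0], embeds)
```

For large N, a vectorized NumPy or PyTorch version (via return_tensors="np" or "torch") would likely be the better choice; the list version is shown only to keep the example dependency-free.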
2. Forecast — client.prediction.create / client.prediction.async_create
Endpoint: POST {base_url}/forecast
Input: Same descriptions and ts_array as above, plus:
- target_len (int, required, non-zero):
  - Positive → forecast that many steps ahead (e.g. 10 = next 10 steps).
  - Negative → backcast that many steps in the past (e.g. -8 = last 8 steps).
| Field | Description |
|---|---|
| Output | response.denormalized_predictions: predictions in original scale. Shape (N, abs(target_len), C, Q) where Q = number of quantiles (e.g. 21). |
| Also | response.predictions (normalized), response.data (input echo). Same batch dimension N. |
| Return type | ForecastResponse: .denormalized_predictions, .predictions, .data, .target_len, .mode ("forecast" or "backcast"), .raw. |
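The relationship between target_len, the reported mode, and the output shape can be summarized in a few lines. This is only a sketch of the rules stated above; `describe_prediction` is a hypothetical helper, and Q = 21 is the example value from the table, not a guaranteed constant.

```python
from typing import Tuple


def describe_prediction(
    n: int, c: int, target_len: int, num_quantiles: int
) -> Tuple[str, Tuple[int, int, int, int]]:
    """Return (mode, expected shape) for a given target_len."""
    if target_len == 0:
        raise ValueError("target_len must be non-zero")
    mode = "forecast" if target_len > 0 else "backcast"
    # The step axis always has abs(target_len) entries, regardless of sign.
    return mode, (n, abs(target_len), c, num_quantiles)


forecast_info = describe_prediction(n=1, c=2, target_len=10, num_quantiles=21)
backcast_info = describe_prediction(n=1, c=2, target_len=-8, num_quantiles=21)
```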
Use return_tensors="list", "np", or "torch" for all tensor fields.
Quick reference
| Functionality | Method | Input | Output shape (main) |
|---|---|---|---|
| Embeddings | embeddings.create / async_create | descriptions (N×C), ts_array (N×T×C) | (N, D) |
| Forecast | prediction.create / async_create | Same + target_len > 0 | (N, target_len, C, Q) |
| Backcast | prediction.create / async_create | Same + target_len < 0 | (N, abs(target_len), C, Q) |
Installation
Install from PyPI:
pip install c3-charm
To include the downstream-task toolkit (models, trainers, datasets):
pip install "c3-charm[toolkit]"  # quotes keep shells such as zsh from expanding the brackets
Or install from source with Poetry:
git clone https://github.com/c3ai/c3-charm.git
cd c3-charm
poetry install # core SDK only
poetry install --with toolkit # include toolkit dependencies
Dependencies
Core (installed by default):
- requests: synchronous HTTP client
- httpx[http2]: asynchronous HTTP/2 client
- python-dotenv: .env file loading
- tqdm: progress bars
Toolkit (optional, pip install c3-charm[toolkit]):
- torch, tensordict: tensor operations
- numpy, pandas: data manipulation
- matplotlib, seaborn, scienceplots: visualization
- scikit-learn: ML utilities
- lightgbm, optuna: gradient boosting & hyperparameter tuning
- gin-config: experiment configuration
Quick Start
from charm import CharmClient
from dotenv import load_dotenv
import os
# Load environment variables from .env file
load_dotenv()
# Get API key and base URL from environment variables
api_key = os.getenv("CHARM_API_KEY", "your-api-key")
base_url = os.getenv("CHARM_BASE_URL", "http://your-server-url:8080")
# Create a client
client = CharmClient(
    base_url=base_url,
    api_key=api_key,
    timeout=30,  # Increased timeout for potentially large requests
    max_retries=3,  # Automatically retry failed requests
    http2=True,  # Enable HTTP/2 for async requests (default)
)
# Generate embeddings for time series data (synchronous with progress bar)
response = client.embeddings.create(
    descriptions=[["engine", "temperature"], ["fan", "speed"]],
    ts_array=[
        # First time series (10 timesteps, 2 channels)
        [
            [0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0],
            [1.1, 1.2], [1.3, 1.4], [1.5, 1.6], [1.7, 1.8], [1.9, 2.0]
        ],
        # Second time series (10 timesteps, 2 channels)
        [
            [2.1, 2.2], [2.3, 2.4], [2.5, 2.6], [2.7, 2.8], [2.9, 3.0],
            [3.1, 3.2], [3.3, 3.4], [3.5, 3.6], [3.7, 3.8], [3.9, 4.0]
        ]
    ],
    batch_size=32,  # Process in batches of 32 (for large datasets)
    return_tensors="np",  # Options: "list", "np", "torch"
    progress=True  # Show progress bar (default: True)
)
# Access the embeddings
embeddings = response.embeds
print(f"Model: {response.model}")
print(f"Embeddings shape: {embeddings.shape}")
# Asynchronous processing (much faster for large datasets)
import asyncio
async def generate_embeddings_async():
    response = await client.embeddings.async_create(
        descriptions=[["engine", "temperature"], ["fan", "speed"]],
        ts_array=[
            # Same time series data as above
            [
                [0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0],
                [1.1, 1.2], [1.3, 1.4], [1.5, 1.6], [1.7, 1.8], [1.9, 2.0]
            ],
            [
                [2.1, 2.2], [2.3, 2.4], [2.5, 2.6], [2.7, 2.8], [2.9, 3.0],
                [3.1, 3.2], [3.3, 3.4], [3.5, 3.6], [3.7, 3.8], [3.9, 4.0]
            ]
        ],
        max_B_per_request=32,  # Process 32 time series per API call
        concurrency_per_call=8,  # Run up to 8 concurrent API calls
        return_tensors="np",  # Options: "list", "np", "torch"
        progress=True  # Show progress bar (default: True)
    )
    return response
# Run the async function
response_async = asyncio.run(generate_embeddings_async())
Time Series Forecasting
The CHARM SDK also supports time series forecasting through the /forecast endpoint:
# Forecasting (predict future values)
response = client.prediction.create(
    descriptions=[["sensor_A", "sensor_B"]],
    ts_array=[[
        [1.0, 2.0],
        [1.1, 2.1],
        [1.2, 2.2],
        [1.3, 2.3],
        [1.4, 2.4],
        [1.5, 2.5],
        [1.6, 2.6],
        [1.7, 2.7],
        [1.8, 2.8],
        [1.9, 2.9],
    ]],
    target_len=10,  # Forecast 10 steps ahead
    return_tensors="np"
)
# Access the denormalized predictions
forecast = response.denormalized_predictions
print(f"Forecast shape: {forecast.shape}") # e.g., (1, 10, 2, Q) where Q is number of quantiles
print(f"Mode: {response.mode}") # "forecast"
# Backcasting (reconstruct past values)
response = client.prediction.create(
    descriptions=[["sensor_A", "sensor_B"]],
    ts_array=[[
        [1.0, 2.0],
        [1.1, 2.1],
        [1.2, 2.2],
        [1.3, 2.3],
        [1.4, 2.4],
        [1.5, 2.5],
        [1.6, 2.6],
        [1.7, 2.7],
        [1.8, 2.8],
        [1.9, 2.9],
    ]],
    target_len=-8,  # Reconstruct last 8 steps
    return_tensors="np"
)
reconstructed = response.denormalized_predictions
print(f"Mode: {response.mode}") # "backcast"
Note: target_len is required and must be non-zero:
- Positive values: Forecast future timesteps (e.g., target_len=10)
- Negative values: Reconstruct past timesteps (e.g., target_len=-8)
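Because predictions carry a trailing quantile axis Q, downstream code usually needs to pick one quantile to get a point estimate. The sketch below reduces a nested-list prediction of shape (N, steps, C, Q) to (N, steps, C). It assumes, without confirmation from the API, that quantiles are ordered along the last axis so that index Q // 2 is the middle one; check the usage guide before relying on that. The helper `select_quantile` and the numbers are made up for illustration.

```python
from typing import List


def select_quantile(
    predictions: List[List[List[List[float]]]], quantile_index: int
) -> List[List[List[float]]]:
    """Pick one entry of the last (quantile) axis: (N, S, C, Q) -> (N, S, C)."""
    return [
        [[channel[quantile_index] for channel in step] for step in series]
        for series in predictions
    ]


# Made-up predictions with N=1, steps=2, C=2, Q=3.
preds = [[
    [[0.9, 1.0, 1.1], [1.9, 2.0, 2.1]],
    [[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]],
]]
num_quantiles = len(preds[0][0][0])
# ASSUMPTION: the middle index corresponds to the central quantile.
point_estimate = select_quantile(preds, num_quantiles // 2)
```

With return_tensors="np", the same selection is a single slice on the last axis.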
Using a .env file
You can create a .env file in your project directory with the following content:
CHARM_API_KEY=your-api-key
CHARM_BASE_URL=http://your-server-url:8080
This allows you to keep your credentials separate from your code and avoid hardcoding sensitive information.
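One possible refinement is to fail fast when a variable is missing instead of falling back to a placeholder value. The sketch below reads the two variables named above from a mapping (by default `os.environ`, after `load_dotenv()` has populated it). The function `load_charm_settings` is hypothetical, and stripping a trailing slash from the base URL is an assumption made to avoid double slashes when endpoint paths are appended.

```python
import os
from typing import Dict, Mapping, Optional

REQUIRED_VARS = ("CHARM_API_KEY", "CHARM_BASE_URL")


def load_charm_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return api_key and base_url, raising if either variable is unset or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "api_key": env["CHARM_API_KEY"],
        # ASSUMPTION: a trailing slash is not wanted in base_url.
        "base_url": env["CHARM_BASE_URL"].rstrip("/"),
    }


settings = load_charm_settings(
    {"CHARM_API_KEY": "your-api-key", "CHARM_BASE_URL": "http://your-server-url:8080/"}
)
```

The returned dictionary can be unpacked into the client constructor shown in Quick Start, e.g. `CharmClient(**settings)`.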
Features
- OpenAI-style SDK for CHARM time-series embeddings
- API key authentication
- Automatic retries with exponential backoff
- Configurable timeouts
- Client-side batching for large datasets
- Flexible return types (Python lists, NumPy arrays, or PyTorch tensors)
- Both synchronous and asynchronous methods in a single client:
  - client.embeddings.create(): synchronous method with progress tracking
  - await client.embeddings.async_create(): asynchronous method with concurrent batch processing
  - client.prediction.create(): synchronous prediction method
  - await client.prediction.async_create(): asynchronous prediction method
- Progress tracking with tqdm for both sync and async methods
- HTTP/2 support for asynchronous requests
- Comprehensive error handling with specific exception types
- Binary protocol for efficient data transfer (handles raw fp16 bytes from server)
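To make "automatic retries with exponential backoff" concrete, here is a generic sketch of the technique. It is not the SDK's implementation: the base delay, the cap, the retried exception type, and the names `backoff_delays` and `call_with_retries` are all assumptions chosen for illustration.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int, base: float = 0.5, cap: float = 30.0) -> List[float]:
    """Delays that double after each failed attempt, limited to cap seconds."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_retries(
    func: Callable[[], T],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on ConnectionError up to max_retries times."""
    for delay in backoff_delays(max_retries):
        try:
            return func()
        except ConnectionError:
            sleep(delay)  # wait longer after each consecutive failure
    return func()  # final attempt; any error now propagates to the caller


# Demonstration with a function that fails twice and a recorded fake sleep.
attempts = {"count": 0}
slept: List[float] = []


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = call_with_retries(flaky, max_retries=3, sleep=slept.append)
```

Production retry loops often add random jitter to the delays so that many clients do not retry in lockstep.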
Performance Considerations
- Synchronous method (client.embeddings.create): Suitable for smaller datasets or when simplicity is preferred. Batches are processed sequentially, which can be slow for large datasets. Progress is tracked with tqdm. Avoid sending very large batches (>100 samples) in a single request to prevent timeouts.
- Asynchronous method (client.embeddings.async_create): Recommended for large datasets. It is faster because batches are processed concurrently, with:
  - Parallel batch processing
  - Bounded concurrency to avoid overwhelming the server
  - Progress tracking for long-running operations
  - HTTP/2 support for efficient connections
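Bounded concurrency is typically built on a semaphore that caps the number of in-flight requests. The sketch below illustrates the pattern with `asyncio.Semaphore`; it is not the SDK's code. `fake_send` is a hypothetical stand-in for one API call, and the `concurrency` argument only plays a role similar to concurrency_per_call.

```python
import asyncio
from typing import List, Tuple


async def fake_send(batch: List[int]) -> int:
    """Hypothetical stand-in for one API request."""
    await asyncio.sleep(0.01)
    return sum(batch)


async def run_bounded(batches: List[List[int]], concurrency: int) -> Tuple[List[int], int]:
    """Process all batches with at most `concurrency` running at once."""
    semaphore = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0

    async def worker(batch: List[int]) -> int:
        nonlocal in_flight, peak
        async with semaphore:  # blocks while `concurrency` workers are active
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_send(batch)
            finally:
                in_flight -= 1

    # gather returns results in the same order as the input batches.
    results = await asyncio.gather(*(worker(b) for b in batches))
    return list(results), peak


results, peak = asyncio.run(run_bounded([[1, 2], [3, 4], [5, 6], [7, 8]], concurrency=2))
```

The `peak` counter exists only to show that the limit holds; a real client would drop it.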
Payload limitations
The SDK and API enforce:
- Timesteps per series: T ≥ 1 and T < 1500 (enforced by SDK).
- Channels per series: C < 1500 (see usage guide).
- Per-request size: N × C × T ≤ 500,000 (client-side batching can split larger jobs).
- Batch consistency: All time series in a single request must have the same T and the same C.
See the Data shapes and API reference section above for input/output shapes.
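The per-request budget translates directly into a maximum batch size for a given T and C. The following sketch shows the arithmetic and a simple splitter. Both helper names are hypothetical; the SDK's own batching (batch_size, max_B_per_request) may work differently.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def max_series_per_request(timesteps: int, channels: int, max_elements: int = 500_000) -> int:
    """Largest N such that N x C x T stays within the per-request budget."""
    if timesteps < 1 or channels < 1:
        raise ValueError("timesteps and channels must be positive")
    return max_elements // (timesteps * channels)


def split_batches(items: Sequence[T], batch_size: int) -> List[List[T]]:
    """Split items into consecutive chunks of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("a single series already exceeds the per-request budget")
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


# Example: T=1000 timesteps and C=10 channels allow 500,000 // 10,000 = 50 series.
limit = max_series_per_request(timesteps=1000, channels=10)
batches = split_batches(list(range(120)), limit)
batch_sizes = [len(b) for b in batches]
```

Here 120 series are split into requests of 50, 50, and 20 series.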
Requirements
- Python 3.10+
- See Installation for dependency details
Testing
The CHARM SDK uses pytest for testing. To run the tests:
# Install pytest if not already installed
pip install pytest
# Run all tests
python -m pytest tests/
# Run specific test file
python -m pytest tests/test_utils.py
# Run with verbose output
python -m pytest -v tests/
Documentation
For detailed documentation, see the examples directory, the usage guide, the quickstart guide, and the docstrings in the code.
Example Applications
The CHARM SDK can be used for various time series applications:
- Anomaly Detection: Identify unusual patterns in sensor data, network traffic, or financial transactions
- Time Series Clustering: Group similar time series patterns for market segmentation or behavior analysis
- Classification: Categorize time series data for predictive maintenance or activity recognition
- Similarity Search: Find similar patterns across large datasets for pattern discovery
- Forecasting: Predict future values or reconstruct past values in time series data
Check out the notebooks in the docs/notebooks directory for detailed examples of these applications.
License
This project is licensed under the Apache License 2.0 — see the LICENSE file for details.
Project details
Details for the file c3_charm-0.1.0.tar.gz.
File metadata
- Download URL: c3_charm-0.1.0.tar.gz
- Upload date:
- Size: 43.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.3 CPython/3.13.12 Darwin/24.6.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d6b478328f8a11fa91169883994f4c848ac8adf0717428a1616ba17aa8344d35 |
| MD5 | 8a3e6db6b6bf7913a2bf7eff69fcd001 |
| BLAKE2b-256 | 303c6f1f7cc348fc191c62e4c17231403dd878898ba47992d24ec1b8a64f9eea |
File details
Details for the file c3_charm-0.1.0-py3-none-any.whl.
File metadata
- Download URL: c3_charm-0.1.0-py3-none-any.whl
- Upload date:
- Size: 45.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.3 CPython/3.13.12 Darwin/24.6.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8a78333f195a2978a832df4e0816b6863c2abbc9c39280dff5064cfa4c5b53ed |
| MD5 | 6a684a7db1aa2c9a4fb6c415da1b9053 |
| BLAKE2b-256 | 520fe4f6a261c8fcd5f06d8685fa865ebc9a3319b2732b60aa5d8d3cd540f358 |