TinyML continuous training agent — dataset in, model out, deploy, monitor, auto-retrain

CoralFlow

The AI-Driven TinyML Pipeline.

Train, validate, and deploy models to edge devices from the CLI. CoralFlow includes drift detection, automatic retraining, and HTTP-based model updates, and its LLM agent orchestrates the whole workflow for you.

TensorFlow · TensorFlow Lite · Vertex AI · Arize Phoenix · OpenInference · DeepSeek · Gemini

Version: 0.4.0 · License: MIT

Usage guide

The recommended path is the interactive agent. It walks you through dataset discovery, training, optional cloud/observability setup, and retraining — in plain language.

Requirements: Python 3.10+, Linux or macOS (WSL2 works). First install pulls TensorFlow and may take a few minutes.

Install

python3 -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install coralflow
coralflow                   # launch the interactive agent

Optional: copy the settings template before the first run (cp .env.example .env in your project directory). The agent can also save LLM and cloud settings to .env interactively.

From source (contributors):

git clone https://github.com/Aifar/coralflow.git
cd coralflow
cp .env.example .env
./scripts/dev pip install -e ".[dev]"
coralflow

scripts/dev creates .venv automatically (uv venv if available, else python3 -m venv).

1. Launch the agent

Run coralflow with no subcommand (or coralflow agent). You enter a REPL:

coralflow>

On first launch the agent scans local datasets and trained models, then prints a short summary. Type /help for slash commands or just describe what you want in natural language.

2. Configure LLM (required)

The agent must have a working LLM before it can orchestrate training. On first run you will be prompted to:

Choose a provider — DeepSeek, Gemini, or other OpenAI-compatible API
Set endpoint — defaults are shown; press Enter or edit
Set API key — saved to .env as CORALFLOW_LLM_API_KEY
Set model — e.g. deepseek-chat, gemini-2.0-flash, gpt-4o

You can also preconfigure in .env:

CORALFLOW_LLM_API_KEY=sk-...
# CORALFLOW_LLM_ENDPOINT=https://api.deepseek.com/v1
# CORALFLOW_LLM_MODEL=deepseek-chat

Or pass flags once: coralflow --api-key sk-... --endpoint ... --model ...
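If a script needs to confirm this configuration before launching the agent, the check can be sketched as below. This is a minimal illustration, not CoralFlow's own loader: `parse_env` and `llm_settings` are hypothetical helper names, and the parser only handles plain KEY=VALUE lines.

```python
from pathlib import Path

# The three variable names come from the .env example above.
LLM_KEYS = ("CORALFLOW_LLM_API_KEY", "CORALFLOW_LLM_ENDPOINT", "CORALFLOW_LLM_MODEL")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def llm_settings(env_path=".env"):
    """Return the three LLM settings, using None for any that are unset."""
    path = Path(env_path)
    parsed = parse_env(path.read_text(encoding="utf-8")) if path.exists() else {}
    return {key: parsed.get(key) for key in LLM_KEYS}
```

Calling `llm_settings()` in the project directory and checking whether `CORALFLOW_LLM_API_KEY` is None tells you whether the agent will need to prompt for a key on first run.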

3. Describe the model you want to train

Tell the agent your task in natural language. For example:

coralflow> I want a text classifier that tells whether support messages are urgent or not

The agent will:

Scan for data — built-in datasets (urgent, expense), CSV files on disk, and paths you mention
Recommend public datasets from the web if nothing local fits your task
Ask you to pick a built-in dataset, download one (coralflow init download urgent -o ./data), or point to your own CSV (text + label columns)

Example follow-ups you might see:

coralflow> Use the built-in urgent dataset
coralflow> Train on ./data/my_labels.csv — text column is "message", label is "category"

Built-in datasets:

| Name | Modality | Samples | Classes | Description |
| --- | --- | --- | --- | --- |
| urgent | text | 400 | 4 | Message urgency classification |
| expense | text | 500 | 5 | Personal spending intent classification |

Download a built-in dataset without the agent:

coralflow init download urgent -o ./data

4. Validate data and choose local or cloud training

Before training, the agent validates dataset quality (required columns, class balance, sample count) and assesses your machine (CPU, RAM, disk).
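To get a feel for the kind of data checks involved, here is a rough standalone sketch for a CSV with text and label columns. It is not the agent's actual validator: `check_dataset` is a hypothetical helper, and the `min_samples` and `max_imbalance` thresholds are illustrative values chosen for the example.

```python
import csv
from collections import Counter


def check_dataset(csv_path, text_col="text", label_col="label",
                  min_samples=50, max_imbalance=10.0):
    """Return a list of problems found in the CSV; an empty list means OK.

    min_samples and max_imbalance are illustrative, not CoralFlow's thresholds.
    """
    with open(csv_path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        columns = reader.fieldnames or []
        missing = [c for c in (text_col, label_col) if c not in columns]
        if missing:
            return [f"missing columns: {', '.join(missing)}"]
        # Keep only rows whose text cell is non-empty.
        rows = [r for r in reader if (r[text_col] or "").strip()]

    problems = []
    counts = Counter(r[label_col] for r in rows)
    if len(rows) < min_samples:
        problems.append(f"only {len(rows)} non-empty samples (want >= {min_samples})")
    if len(counts) < 2:
        problems.append("need at least two classes")
    elif max(counts.values()) / min(counts.values()) > max_imbalance:
        problems.append(f"class imbalance exceeds {max_imbalance}:1 ({dict(counts)})")
    return problems
```

For a file with differently named columns, pass them explicitly, for example `check_dataset("./data/my_labels.csv", text_col="message", label_col="category")`.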

The agent then asks you to choose:

| Option | When to use | Extra setup |
| --- | --- | --- |
| 1: Local training | Laptop/edge-friendly text models; no GCP bill | None |
| 2: Cloud training | Large data, AutoML, or Gemini fine-tuning on Vertex AI | Google Cloud (below) |

Reply with 1 or 2. Local training uses TensorFlow on your machine; cloud training uses Vertex AI (train --cloud).

Google Cloud (only for cloud training) — when you pick option 2, or at startup when prompted, configure:

GCP_PROJECT=your-gcp-project-id
GCP_LOCATION=us-central1
GCP_STAGING_BUCKET=gs://your-staging-bucket
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

Or run gcloud auth application-default login instead of a service-account file.

Used by: train --cloud, coralflow models list, coralflow cost, Vertex predict/deploy.
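Before starting a cloud run, it can help to confirm these variables are in place. The sketch below is one possible pre-flight check, not part of CoralFlow. It assumes `GCP_PROJECT` and `GCP_STAGING_BUCKET` are mandatory, treats `GCP_LOCATION` as optional because it has a documented default, and treats `GOOGLE_APPLICATION_CREDENTIALS` as optional because application-default credentials are an alternative. `cloud_config_problems` is a hypothetical name.

```python
import os

# Assumption: these two have no default and must be set for cloud training.
REQUIRED = ("GCP_PROJECT", "GCP_STAGING_BUCKET")


def cloud_config_problems(env=None):
    """Return a list of configuration problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]

    bucket = env.get("GCP_STAGING_BUCKET", "")
    if bucket and not bucket.startswith("gs://"):
        problems.append("GCP_STAGING_BUCKET should start with gs://")

    # Only checked when set, since ADC login can replace the key file.
    creds = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if creds and not os.path.isfile(creds):
        problems.append(f"GOOGLE_APPLICATION_CREDENTIALS points to a missing file: {creds}")
    return problems
```

Passing a plain dict instead of relying on `os.environ` makes the check easy to exercise in tests.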

After training completes, the agent can validate the model (TFLite size/latency) and deploy to edge gateways if EDGE_DEVICES is set in .env.

5. Optional: Arize Phoenix logging

If you want prediction traces and monitoring in Arize Phoenix, configure Phoenix when the agent asks (after LLM setup), or add to .env:

Local Phoenix (run the two commands in a shell, then add the two variables to .env):

pip install arize-phoenix
phoenix serve
PHOENIX_COLLECTOR_ENDPOINT=http://localhost:6006/v1/traces
PHOENIX_PROJECT_NAME=edge-train

Phoenix Cloud:

PHOENIX_API_KEY=your-phoenix-api-key
PHOENIX_COLLECTOR_ENDPOINT=https://app.phoenix.arize.com/v1/traces
PHOENIX_PROJECT_NAME=edge-train

Once configured, each predict call emits OpenInference OTEL spans. The agent's check_monitoring tool verifies that Phoenix is reachable before monitor/predict workflows.
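If you want a quick reachability check of your own, a TCP-level probe of the collector endpoint is one simple approach. The sketch below is an assumption about what "reachable" could mean, not a description of how check_monitoring works; `collector_reachable` is a hypothetical helper and it does not verify that the endpoint actually accepts traces.

```python
import socket
from urllib.parse import urlparse


def collector_reachable(endpoint, timeout=2.0):
    """Return True if a TCP connection to the endpoint's host and port succeeds."""
    parsed = urlparse(endpoint)
    if not parsed.hostname:
        return False
    # Fall back to the scheme's standard port when the URL has none.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the local setup above, `collector_reachable("http://localhost:6006/v1/traces")` should return True once `phoenix serve` is running.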

Skip this step if you only need local train/predict without a trace dashboard.

6. Monitor the model and retrain

After you have a trained model:

Run predictions — the agent calls predict, or use CLI:

coralflow predict --model ./model_output --text "need this done today"
coralflow predict --model ./model_output --csv holdout.csv

Results append to prediction_log.jsonl (override with EDGE_PREDICTION_LOG_PATH).

Check monitoring — ask the agent "Is Phoenix receiving traces?" or run:
```
coralflow monitor --status
```
Label mistakes — the agent uses label_predictions to list unlabeled rows; you correct wrong predictions (add ground_truth to the log).
Retrain when accuracy drops — ask "Should we retrain?" or run:
```
coralflow monitor --retrain --dataset ./data/urgent.csv
```
The agent flow: list unlabeled predictions → you label errors → check_retrain merges labels and retrains if accuracy is below threshold.
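The retrain decision described above can be approximated with a few lines of Python over the prediction log. This is a sketch under stated assumptions, not the check_retrain implementation: only the `ground_truth` field is named in this guide, so the `prediction` field name, the 0.85 threshold, and the `min_labeled` guard are all hypothetical.

```python
import json


def labeled_accuracy(log_path, truth_key="ground_truth", pred_key="prediction"):
    """Return (accuracy, labeled_count) over rows that carry a ground-truth label.

    pred_key is a hypothetical field name; inspect your log for the real one.
    """
    correct = labeled = 0
    with open(log_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            row = json.loads(line)
            if row.get(truth_key) is None:
                continue  # still unlabeled, ignore for accuracy
            labeled += 1
            correct += row.get(pred_key) == row[truth_key]
    return (correct / labeled if labeled else None), labeled


def should_retrain(log_path, threshold=0.85, min_labeled=20):
    """True when enough rows are labeled and accuracy falls below threshold."""
    accuracy, labeled = labeled_accuracy(log_path)
    return accuracy is not None and labeled >= min_labeled and accuracy < threshold
```

Requiring a minimum number of labeled rows avoids triggering a retrain on the strength of one or two corrected predictions.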

CLI reference

You can run every step without the agent:

| Step | Command | Description |
| --- | --- | --- |
| Setup | `init list` / `init download` | Built-in datasets or your own CSV |
| Train | `train -d <csv> [-o dir]` | Local TF Keras (default); `--cloud` for Vertex AI |
| Validate | `validate --model <path>` | TFLite conversion, size & latency checks |
| Predict | `predict --model <path> --text "..."` | Local inference; logs to `prediction_log.jsonl` |
| Deploy | `deploy --model <path> [--device <id>]` | Push TFLite to gateways in `EDGE_DEVICES` |
| Edge pipeline | `coralflow-edge-pipeline ./data/urgent.csv` | Train → validate → deploy in one command |
| Monitor | `monitor` | Phoenix OTEL tracing & retrain triggers |
| Cost | `cost <dataset>` | Estimate Vertex AI training cost |
| Agent | `coralflow` / `agent` | LLM REPL with tools for the full workflow |
| Demo | `demo retrain-loop` | Scripted drift → retrain → accuracy improvement |
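If you prefer to script the train, validate, and deploy steps yourself rather than use the agent, one option is to call the CLI from Python. The sketch below only uses flags listed in the table. It assumes that the directory passed to `-o` is the path the later `--model` flags expect and that each command exits non-zero on failure; `pipeline_commands` and `run_pipeline` are hypothetical helper names.

```python
import subprocess


def pipeline_commands(csv_path, model_dir="./model_output", device=None):
    """Build the argv lists for train, validate, and deploy, in order."""
    deploy = ["coralflow", "deploy", "--model", model_dir]
    if device:
        deploy += ["--device", device]
    return [
        ["coralflow", "train", "-d", csv_path, "-o", model_dir],
        ["coralflow", "validate", "--model", model_dir],
        deploy,
    ]


def run_pipeline(csv_path, **kwargs):
    """Run each step in turn, stopping at the first failing command."""
    for argv in pipeline_commands(csv_path, **kwargs):
        print("$", " ".join(argv))
        subprocess.run(argv, check=True)  # raises CalledProcessError on failure
```

Keeping command construction separate from execution lets you print or review the exact commands before anything runs.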

Slash commands in the agent (e.g. /datasets, /help) run locally first; long train / validate jobs stream live output in the terminal.

Configuration

Copy the template and uncomment what you need:

cp .env.example .env

See .env.example for full commented examples.

| Variable | Description |
| --- | --- |
| `CORALFLOW_LLM_API_KEY` | LLM API key (required for agent) |
| `CORALFLOW_LLM_ENDPOINT` | OpenAI-compatible API base URL |
| `CORALFLOW_LLM_MODEL` | Model name (default: `gpt-4o`) |
| `GCP_PROJECT` | Google Cloud project (cloud training) |
| `GCP_LOCATION` | GCP region (default: `us-central1`) |
| `GCP_STAGING_BUCKET` | GCS bucket for cloud datasets |
| `GOOGLE_APPLICATION_CREDENTIALS` | Path to GCP service account JSON (optional with ADC) |
| `PHOENIX_API_KEY` | Phoenix Cloud API key |
| `PHOENIX_COLLECTOR_ENDPOINT` | OTEL trace endpoint (local or cloud) |
| `PHOENIX_PROJECT_NAME` | Project name in Phoenix (default: `edge-train`) |
| `EDGE_DEVICES` | Edge gateways: JSON array or `id@host:port,...` |
| `EDGE_DEFAULT_DEVICE` | Default gateway id for `coralflow deploy` |
| `EDGE_PREDICTION_LOG_PATH` | Prediction log path (default: `./prediction_log.jsonl`) |
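The two `EDGE_DEVICES` formats can be illustrated with a small parser. This is a sketch of how such a value might be read, not CoralFlow's own code: `parse_edge_devices` is a hypothetical name, and the JSON form is assumed to be an array of objects with `id`, `host`, and `port` keys, which this guide does not specify.

```python
import json


def parse_edge_devices(value):
    """Parse EDGE_DEVICES into a list of {"id", "host", "port"} dicts."""
    value = (value or "").strip()
    if not value:
        return []
    if value.startswith("["):
        # Assumed JSON shape: [{"id": ..., "host": ..., "port": ...}, ...]
        return [
            {"id": str(d["id"]), "host": d["host"], "port": int(d["port"])}
            for d in json.loads(value)
        ]
    devices = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        device_id, sep, address = item.partition("@")
        host, colon, port = address.rpartition(":")
        if not (sep and colon and device_id and host and port.isdigit()):
            raise ValueError(f"expected id@host:port, got {item!r}")
        devices.append({"id": device_id, "host": host, "port": int(port)})
    return devices
```

For example, `parse_edge_devices("gw1@192.168.1.10:8080")` yields a single entry with id `gw1`, host `192.168.1.10`, and port 8080; the gateway names and addresses here are made up for illustration.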

Development

For contributors, the dev extra installs pytest, black, and the Flask gateway development dependencies:

make install    # ./scripts/dev pip install -e ".[dev]"
make test
make format-check

Before pushing, run make lint (format-check + test).

Install the latest published release:

pip install -U coralflow

License

MIT


Version 0.4.0 was released on Jun 2, 2026.

