A tool for agentic recursive model improvement
Tropiflo
Automatically evolve your ML code to maximize a KPI — locally, securely, and reproducibly.
Is Tropiflo for you?
Tropiflo is for you if:
✓ You already have working ML code — not starting from scratch
✓ You know your metric (KPI) — accuracy, RMSE, AUC, whatever you optimize for
✓ You want the system to rewrite parts of your code — to improve that metric
✓ You do NOT want AutoML SaaS, data upload, or black boxes — everything runs locally
If that's you, keep reading.
How Tropiflo Thinks
Here's what actually happens when you run Tropiflo:
- You mark a code block you want to evolve (e.g., your feature engineering)
- You define a KPI by printing it (e.g., print(f"KPI: {accuracy}"))
- Tropiflo runs your baseline and records the KPI
- Tropiflo proposes a hypothesis about how to improve the code
- Tropiflo modifies ONLY the marked block with the new approach
- Tropiflo executes your full project to test the hypothesis
- Tropiflo scores the new KPI and keeps the change if it's better
- Repeat — the system keeps evolving toward higher KPIs
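The loop above can be sketched in a few lines of Python. This is a conceptual sketch only: `run_and_get_kpi`, `propose_hypothesis`, and `apply_to_block` are hypothetical stand-ins, not Tropiflo APIs.

```python
def evolve(code, iterations, run_and_get_kpi, propose_hypothesis, apply_to_block):
    """Conceptual greedy loop: keep a change only if the KPI improves."""
    best_code = code
    best_kpi = run_and_get_kpi(best_code)           # baseline run
    for _ in range(iterations):
        hypothesis = propose_hypothesis(best_code)  # e.g. "try feature scaling"
        candidate = apply_to_block(best_code, hypothesis)
        kpi = run_and_get_kpi(candidate)            # full project executes
        if kpi > best_kpi:                          # keep only improvements
            best_code, best_kpi = candidate, kpi
    return best_code, best_kpi
```

The key property is that a worse hypothesis never replaces the current best, which is why the baseline is always preserved.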
What Tropiflo is NOT
- Not AutoML — It doesn't just tune hyperparameters
- Not parameter search — It's code evolution, not grid search
- Not a black box — You see every change it makes to your code
- Not a data platform — Your data never leaves your machine
Quickstart: See it work in 2 minutes
The fastest way to understand Tropiflo is to watch it improve a simple problem.
Step 1: Install
pip install tropiflo
Step 2: Mark Your Code
Create train.py and mark the block you want to evolve:
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
# Load your data
X = pd.read_csv("data/features.csv")
y = pd.read_csv("data/labels.csv")
# CO_DATASCIENTIST_BLOCK_START
# This is the block Tropiflo will evolve
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)
preds = model.predict(X)
# CO_DATASCIENTIST_BLOCK_END
# Print your KPI
accuracy = accuracy_score(y, preds)
print(f"KPI: {accuracy:.4f}")
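Tropiflo discovers your metric from that printed `KPI:` line. One plausible way such a line could be pulled out of captured stdout looks like this (an illustrative parser, not Tropiflo's actual implementation):

```python
import re

def extract_kpi(stdout):
    """Return the last 'KPI: <number>' value found in captured stdout, or None."""
    matches = re.findall(
        r"^KPI:\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\s*$",
        stdout,
        re.MULTILINE,
    )
    return float(matches[-1]) if matches else None
```

Whatever the exact parsing, the practical rule is the same: print one clean `KPI: <number>` line and avoid printing other lines that start with `KPI:`.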
Step 3: Create config.yaml
Minimal configuration:
mode: local
entry_command: "python train.py"
With more options:
mode: local
entry_command: "python train.py"
# Run multiple experiments in parallel
parallel: 3
# Mount external data directory
data_volume: "/path/to/your/data"
# AI evolution (get API key from tropiflo.io)
api_key: "sk_your_token_here"
Step 4: Run
tropiflo run --config config.yaml
Track runs live in a local dashboard:
# Launch workflow + Streamlit tracking UI
tropiflo run --config config.yaml --dashboard
# Optional: choose a different dashboard port
tropiflo run --config config.yaml --dashboard --dashboard-port 8502
# Launch dashboard later (without starting a new workflow)
tropiflo dashboard
The dashboard opens at http://127.0.0.1:8501 by default and reads local artifacts from results/runs/.
What you'll see:
- Baseline run with initial KPI
- Evolution hypotheses being tested
- Progress toward better KPIs
- Results saved to results/runs/{memorable_name}/
Results: Traceable, Reproducible, Diffable
Every run is fully traceable and reproducible.
your_project/
└── results/
└── runs/
└── happy_panda_20260207_143025/ ← Memorable run name
├── timeline/ ← Chronological history
│ ├── 0001_kpi_0.8530_baseline/
│ ├── 0002_kpi_0.8812_hypothesis_ensemble/
│ └── 0003_kpi_0.9103_hypothesis_feature_eng/
├── by_performance/ ← Auto-sorted by KPI
└── best → timeline/0003... ← Symlink to best version
Key features:
- timeline/ shows every hypothesis tested, in order
- by_performance/ automatically sorts runs by KPI for easy comparison
- best symlink always points to your best-performing version
- Every checkpoint contains the full modified code + metadata
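Because the KPI is encoded in each checkpoint folder name, you can also find the best run with a few lines of Python. This sketch assumes the `NNNN_kpi_<value>_<label>` naming shown above:

```python
import re
from pathlib import Path

def best_checkpoint(timeline_dir):
    """Return the checkpoint folder with the highest KPI, parsed from names
    like '0003_kpi_0.9103_hypothesis_feature_eng'."""
    pattern = re.compile(r"^\d+_kpi_([\d.]+)_")
    scored = [
        (float(m.group(1)), path)
        for path in Path(timeline_dir).iterdir()
        if (m := pattern.match(path.name))
    ]
    return max(scored, key=lambda t: t[0])[1] if scored else None
```

In practice the `best` symlink already does this for you; the helper is just handy for custom analysis scripts.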
Important Reassurances
Your code outside the block is never modified
Tropiflo only touches code between CO_DATASCIENTIST_BLOCK_START and CO_DATASCIENTIST_BLOCK_END. Everything else stays exactly as you wrote it.
If KPI doesn't improve, baseline is preserved
Tropiflo only keeps changes that improve your KPI. If a hypothesis performs worse, it's discarded and the previous best version is kept.
You can Ctrl+C at any time safely
Press Ctrl+C anytime to stop. Docker images and containers are cleaned up automatically. No manual cleanup needed.
All artifacts are local unless you opt in
Your data, code, and results stay on your machine. Nothing is uploaded unless you explicitly configure a cloud backend.
Configuration
Minimal Config (80% of users)
mode: local
entry_command: "python train.py"
Common Options
mode: local
entry_command: "python train.py"
# Parallelization
parallel: 3
# Data mounting (if data is outside your project)
data_volume: "/home/user/datasets"
# API key for AI-powered evolution
api_key: "sk_your_token_here"
Resource Control (Advanced)
mode: local
entry_command: "python train.py"
parallel: 4
# GPU configuration
enable_gpu: true # Force GPU (auto-detected by default)
gpus_per_task: 1 # GPUs per container
# CPU and memory limits
cpus_per_task: 4.0 # CPU cores per container
memory_per_task: "8g" # Memory per container
Cloud Backends (Optional)
Google Cloud Run
mode: gcloud
entry_command: "python train.py"
project_id: "your-gcp-project"
region: "us-central1"
data_volume: "gs://your-bucket"
See full GCloud setup guide below.
AWS ECS Fargate
mode: aws
entry_command: "python train.py"
aws:
  cluster: "my-cluster"
  task_definition: "my-task"
  region: "us-east-1"
See full AWS setup guide below.
Databricks
mode: databricks
entry_command: "python train.py"
databricks:
  volume_uri: "dbfs:/Volumes/workspace/default/volume"
  timeout: "30m"
See full Databricks setup guide below.
Using Your Data
After the dummy example works, here's how to use YOUR data:
Method 1: Hardcoded Paths (Simplest)
Just put the full path in your code:
import pandas as pd
X = pd.read_csv("/full/path/to/your/data.csv")
# ... rest of your code
Method 2: Docker Volume Mounting (Recommended)
For data that lives outside your project:
Update config.yaml:
mode: local
entry_command: "python train.py"
data_volume: "/home/user/my_datasets"
Update your code:
import os
import pandas as pd
# Tropiflo automatically sets INPUT_URI to /data inside Docker
DATA_DIR = os.environ.get("INPUT_URI", "/data")
X = pd.read_csv(os.path.join(DATA_DIR, "train.csv"))
y = pd.read_csv(os.path.join(DATA_DIR, "labels.csv"))
# CO_DATASCIENTIST_BLOCK_START
# Your model code here
# CO_DATASCIENTIST_BLOCK_END
print(f"KPI: {score}")
What happens: Tropiflo mounts /home/user/my_datasets to /data inside the Docker container, so your code can access files like train.csv.
Block Placement Rules
Block markers MUST be at top level (no indentation):
# ✅ CORRECT - No indentation before the comment
# CO_DATASCIENTIST_BLOCK_START
def my_model():
return LinearRegression()
# CO_DATASCIENTIST_BLOCK_END
# ❌ WRONG - Inside a function (has tabs/spaces before comment)
def train():
# CO_DATASCIENTIST_BLOCK_START ← This will NOT be detected!
model = train_model()
# CO_DATASCIENTIST_BLOCK_END
Rule: Block markers must start at column 0 (no tabs or spaces before #).
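The column-0 rule means detection can be as simple as a `startswith` check on each raw line. A minimal illustrative scanner (not Tropiflo's actual code) that enforces it:

```python
def find_block(lines):
    """Return (start, end) indices of the evolvable block, or None.
    Markers indented inside a function are deliberately NOT matched,
    mirroring the column-0 placement rule."""
    start = end = None
    for i, line in enumerate(lines):
        if line.startswith("# CO_DATASCIENTIST_BLOCK_START"):
            start = i
        elif line.startswith("# CO_DATASCIENTIST_BLOCK_END"):
            end = i
    if start is None or end is None:
        return None
    return (start, end)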
Multi-File Projects
Tropiflo supports both single-file scripts and multi-file projects:
- Single File: tropiflo run python my_script.py
- Multi-File: Auto-detects run.sh, main.py, or run.py in your project root
- Custom Entry Point: tropiflo run bash custom_script.sh
When you run Tropiflo on a multi-file project:
- Scanning: Scans all
.pyfiles forCO_DATASCIENTIST_BLOCKmarkers - Selection: Each generation, randomly picks ONE file to evolve
- Evolution: The AI generates hypotheses and modifies the selected block
- Testing: Your entire project runs with the new code
- Checkpointing: Best results are saved as complete directories with all files
This means you can have complex multi-file ML pipelines where each file evolves independently but is tested as a complete system.
Deployment
Take your best checkpoint and create a production-ready project:
# Deploy best checkpoint from latest run
tropiflo deploy results/runs/happy_panda_20260207/best/
# Deploy specific version
tropiflo deploy results/runs/happy_panda_20260207/timeline/0003_kpi_0.9103_feature_eng/
# Custom output directory
tropiflo deploy results/runs/happy_panda_20260207/best/ --output-dir my_optimized_v2
What it does:
- Copies your entire original project (including data, configs, assets)
- Integrates the evolved code from the checkpoint
- Excludes Tropiflo artifacts (checkpoints, cache, etc.)
- Creates a deployment_info.json with checkpoint metadata
The result is a complete, standalone project ready to deploy to production.
Analysis Tools
Live Local Tracking Dashboard
Run with a live dashboard to monitor experiments as checkpoints are saved:
tropiflo run --config config.yaml --dashboard
Open the same dashboard anytime (even when no run is active):
# Reads ./results/runs by default
tropiflo dashboard
# Point to another project directory
tropiflo dashboard --working-directory /path/to/project
# Or pass an explicit results root and custom port
tropiflo dashboard --results-root /path/to/project/results/runs --dashboard-port 8502
Dashboard highlights:
- KPI over time (all runs as points + running best line)
- Baseline marker and best-so-far trajectory
- Hypotheses table across the workflow
- Diff viewer vs baseline per file
- Stdout/stderr per checkpoint
If you run multiple workflows, select and compare them from the dashboard sidebar.
Data is loaded from local results/runs/ folders, so old and new runs appear together.
Plot KPI Progression
Visualize how your KPI improves over iterations:
# Basic usage
tropiflo plot-kpi --checkpoints-dir results/runs/happy_panda_20260207/
# With options
tropiflo plot-kpi \
--checkpoints-dir results/runs/happy_panda_20260207/ \
--max-iteration 350 \
--title "AUC Training Progress" \
--kpi-label "AUC" \
--output my_kpi_plot.png
Generate PDF Code Diffs
Create professional PDF reports comparing two versions:
# Compare two Python files
tropiflo diff-pdf baseline.py improved.py
# With custom title
tropiflo diff-pdf \
baseline.py \
optimized.py \
--output "optimization_report.pdf" \
--title "XOR Problem Optimization Results"
Air-Gapped / Offline Deployment
Need to run Tropiflo in an environment without internet access?
Quick Setup (One-Time, Requires Internet)
# Run this once while connected to internet
tropiflo setup-airgap
# That's it! Now you can disconnect and work offline
What It Does
- Pulls base Python Docker image (one-time download)
- Builds complete image with all your dependencies pre-installed
- Updates your
config.yamlto use the pre-built image - Everything runs locally - no internet required after setup
After Setup
# Disconnect from internet (or work in isolated environment)
tropiflo run --config config.yaml # Works offline!
Perfect for:
- Air-gapped production environments
- Isolated VPC deployments
- High-security environments
- Offline development
Private/Self-Hosted Backend
If you run the backend on your own host (VPC, on-prem), point the CLI at it:
In config.yaml:
backend_url: "https://your-private-backend.example.com"
backend_url_dev: "http://localhost:8000" # Optional, for dev mode
Or with environment variables:
export CO_DATASCIENTIST_CO_DATASCIENTIST_BACKEND_URL="https://your-private-backend.example.com"
export CO_DATASCIENTIST_CO_DATASCIENTIST_BACKEND_URL_DEV="http://localhost:8000"
export CO_DATASCIENTIST_DEV_MODE=true # To force dev URL
If neither the YAML keys nor the environment variables are set, the client defaults to https://co-datascientist.io.
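The resolution order can be pictured as a small Python helper. This is an illustrative sketch of the precedence described above (explicit YAML value, then environment variable, then public default); the real client's logic may differ:

```python
import os

DEFAULT_BACKEND = "https://co-datascientist.io"

def resolve_backend_url(yaml_config):
    """Illustrative precedence: YAML key wins, then env var, then default."""
    if yaml_config.get("backend_url"):
        return yaml_config["backend_url"]
    return os.environ.get("CO_DATASCIENTIST_CO_DATASCIENTIST_BACKEND_URL") or DEFAULT_BACKEND
```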
Resource Allocation (GPU, CPU, Memory)
Control how much hardware each Docker container gets.
GPU Configuration
Auto-detection (default):
# No configuration needed - GPUs auto-detected!
# If available: containers get GPU access
# If not available: containers run on CPU automatically
Manual control:
enable_gpu: false # Force CPU-only (even if GPU available)
enable_gpu: true # Force GPU (fails if not available)
gpus_per_task: 1 # Each container gets 1 GPU
CPU & Memory Limits
cpus_per_task: 4.0 # Each container gets 4 CPU cores
memory_per_task: "8g" # Each container gets 8GB RAM
Common Scenarios
Single GPU Workstation:
entry_command: "python train.py"
parallel: 2
gpus_per_task: 1 # Each gets 1 GPU (total: 2 GPUs)
cpus_per_task: 4.0 # Each gets 4 cores (total: 8 cores)
memory_per_task: "8g" # Each gets 8GB (total: 16GB)
Multi-GPU Server:
entry_command: "python train.py"
parallel: 8
gpus_per_task: 1 # Each gets 1 GPU (total: 8 GPUs)
cpus_per_task: 2.0 # Each gets 2 cores (total: 16 cores)
memory_per_task: "4g" # Each gets 4GB (total: 32GB)
CPU-Only Machine:
entry_command: "python train.py"
parallel: 4
enable_gpu: false # Force CPU mode
cpus_per_task: 2.0 # Each gets 2 cores (total: 8 cores)
memory_per_task: "2g" # Each gets 2GB (total: 8GB)
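In every scenario above, peak host usage is simply parallel × the per-task limit, so it's worth checking totals against your hardware before launching. A tiny helper (illustrative, not part of the CLI) makes the arithmetic explicit:

```python
def total_resources(parallel, cpus_per_task, memory_gb_per_task, gpus_per_task=0):
    """Peak host usage when all containers run simultaneously:
    each total is parallel multiplied by the per-task limit."""
    return {
        "cpus": parallel * cpus_per_task,
        "memory_gb": parallel * memory_gb_per_task,
        "gpus": parallel * gpus_per_task,
    }
```

For example, the single-GPU workstation config (parallel: 2, 4 cores, 8 GB, 1 GPU each) needs 8 cores, 16 GB of RAM, and 2 GPUs at peak.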
Before vs After Example
Before (KPI ≈ 0.50):

from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score
import numpy as np

# XOR data
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', RandomForestClassifier(n_estimators=10, random_state=0))
])
pipeline.fit(X, y)
preds = pipeline.predict(X)
accuracy = accuracy_score(y, preds)
print(f'KPI: {accuracy:.4f}')

After (KPI = 1.00):

import numpy as np
from sklearn.base import TransformerMixin, BaseEstimator
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score

class ChebyshevPolyExpansion(BaseEstimator, TransformerMixin):
    def __init__(self, degree=3):
        self.degree = degree

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        X = np.asarray(X)
        X_scaled = 2 * X - 1
        n_samples, n_features = X_scaled.shape
        features = []
        for f in range(n_features):
            x = X_scaled[:, f]
            T = np.empty((self.degree + 1, n_samples))
            T[0] = 1
            if self.degree >= 1:
                T[1] = x
            for d in range(2, self.degree + 1):
                T[d] = 2 * x * T[d - 1] - T[d - 2]
            features.append(T.T)
        return np.hstack(features)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

pipeline = Pipeline([
    ('cheb', ChebyshevPolyExpansion(degree=3)),
    ('scaler', StandardScaler()),
    ('clf', RandomForestClassifier(n_estimators=10, random_state=0))
])
pipeline.fit(X, y)
preds = pipeline.predict(X)
accuracy = accuracy_score(y, preds)
print(f'KPI: {accuracy:.4f}')
Cloud Integrations
Google Cloud Run Jobs Integration
Execute your code at scale on Google Cloud infrastructure.
Prerequisites (One-Time, 5 Minutes)
- Install & authenticate gcloud CLI:
# Install gcloud CLI (if not installed)
# See: https://cloud.google.com/sdk/docs/install
# Authenticate
gcloud auth login
gcloud auth application-default login
# Set your project
gcloud config set project YOUR_PROJECT_ID
- Enable required APIs:
gcloud services enable artifactregistry.googleapis.com
gcloud services enable run.googleapis.com
- Create Artifact Registry repository:
gcloud artifacts repositories create co-datascientist-repo \
--repository-format=docker \
--location=us-central1 \
--description="Docker images for Co-DataScientist"
Configuration
Minimal config.yaml for GCloud:
mode: gcloud
entry_command: "python train.py"
project_id: "your-gcp-project-id"
With options:
mode: gcloud
entry_command: "python train.py"
project_id: "your-gcp-project-id"
# Optional
region: "us-central1"
repo: "co-datascientist-repo"
parallel: 2
data_volume: "gs://your-bucket"
api_key: "sk_your_token"
What Happens
When you run tropiflo run --config config.yaml:
- Builds your Docker image locally
- Pushes to GCP Artifact Registry
- Creates & executes Cloud Run Job
- Retrieves results and KPIs
- Cleans up resources automatically
Cost efficient: Cleans up jobs and images automatically (configurable with cleanup_job and cleanup_remote_image)
Using Data from GCS
mode: gcloud
project_id: "my-project"
entry_command: "python train.py"
data_volume: "gs://my-data-bucket"
Your code accesses data at /data:
import os
DATA_DIR = os.environ.get("INPUT_URI", "/data")
df = pd.read_csv(os.path.join(DATA_DIR, "train.csv"))
Note: Your Cloud Run service account needs storage.objectViewer permission on the bucket.
AWS ECS Fargate Integration
Execute and optimize your Python code at scale using AWS ECS Fargate.
Setup
- Prerequisites:
  - AWS account with ECS Fargate enabled
  - Authenticated AWS CLI: aws configure
  - An ECS cluster and task definition configured for your needs
- Create config.yaml:
mode: aws
entry_command: "python train.py"
aws:
  script_path: "/path/to/your/script.py"
  cluster: "my-cluster"
  task_definition: "my-job-taskdef"
  launch_type: "FARGATE"
  region: "us-east-1"
  network_configuration:
    subnets: ["subnet-abc123", "subnet-def456"]
    security_groups: ["sg-123456"]
    assign_public_ip: "ENABLED"
  timeout: 1800 # seconds
- Run:
tropiflo run --config config.yaml
Your code will be executed in AWS ECS Fargate containers, with results and KPIs retrieved automatically. Perfect for serverless compute scaling!
Databricks Integration
Execute your code on Databricks infrastructure.
Setup
- Download the Databricks CLI package:
curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sudo sh
- Get a Databricks token and test the CLI (generate a personal access token from your Databricks workspace settings)
- Create config.yaml:
mode: databricks
entry_command: "python train.py"
databricks:
  cli: "databricks"
  volume_uri: "dbfs:/Volumes/workspace/default/volume"
  code_path: "dbfs:/Volumes/workspace/default/volume/train.py"
  timeout: "30m"
  job:
    name: "run-<script-stem>-<timestamp>"
    tasks:
      - task_key: "t"
        spark_python_task:
          python_file: "<remote_path>"
        environment_key: "default"
    environments:
      - environment_key: "default"
        spec:
          client: "1"
          dependencies:
            - "scikit-learn>=1.0.0"
            - "numpy>=1.20.0"
- Run:
tropiflo run --config config.yaml
Your optimized model results will save to the Databricks volume at the configured path.
Important Notes
- Avoid input() or interactive prompts — Tropiflo needs to run your code automatically
- Mark the parts you want to evolve — Use CO_DATASCIENTIST_BLOCK_START and CO_DATASCIENTIST_BLOCK_END
- Add comments with context — Tropiflo understands your domain! Explain your problem, constraints, and ideas in comments near your code
Naming Note
"Co-DataScientist" is the internal engine behind Tropiflo.
You only interact with the Tropiflo CLI. If you see references to "Co-DataScientist" in code, logs, or config keys, that's the underlying system. They're the same product.
Need Help?
We'd love to chat: oz.kilim@tropiflo.io
Disclaimer: Tropiflo executes your scripts on your own machine. Make sure you trust the code you feed it!
Made by the Tropiflo team