A tool for agentic recursive model improvement
Introducing the Co-DataScientist
Beat the competition.
Why is everyone talking about the Co-DataScientist
- Idea Explosion — Launches a swarm of models, feature recipes & hyper-parameters you never knew existed.
- Full-Map Exploration — Charts the entire optimization galaxy so you can stop guessing and start winning.
- Hands-Free Mode — Hit run and the search party works through the night.
- KPI Fanatic — Every evolutionary step is focused on improving your target metric.
- Data Stays Home — Your training and testing data never leaves your server; everything runs locally.
Quickstart — 60-Second Setup
1. Install
pip install co-datascientist
2. Prepare Your Project
Co-DataScientist works with both single files and multi-file projects. Just mark the code you want to evolve with special blocks.
Single File Example (e.g. xor.py)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import numpy as np
# XOR toy-set
X = np.array([[0,0],[0,1],[1,0],[1,1]])
y = np.array([0,1,1,0])
# CO_DATASCIENTIST_BLOCK_START
pipe = Pipeline([
("scale", StandardScaler()),
("clf", LogisticRegression(random_state=0))
])
pipe.fit(X, y)
acc = accuracy_score(y, pipe.predict(X))
# CO_DATASCIENTIST_BLOCK_END
print(f"KPI: {acc:.4f}") # Tag your metric!
# This is the classic XOR problem — it's not linearly separable!
# A linear model like LogisticRegression can't solve it perfectly,
# because no straight line can separate the classes in 2D.
# This makes it a great test for feature engineering or non-linear models.
Test that it runs:
python xor.py # Should print "KPI: 0.5000"
Multi-File Project Example
For larger projects, organize your code across multiple files and use a run script:
Project structure:
my_ml_project/
├── main.py # Orchestration
├── data_loader.py # Data processing
├── model.py # Model training
└── run.sh # How to run everything
main.py:
from data_loader import load_data
from model import train_model
# CO_DATASCIENTIST_BLOCK_START
def run_pipeline():
    data = load_data()
    score = train_model(data)
    return score
# CO_DATASCIENTIST_BLOCK_END
if __name__ == "__main__":
    result = run_pipeline()
    print(f"KPI: {result}")
data_loader.py:
import numpy as np
# CO_DATASCIENTIST_BLOCK_START
def load_data():
    # Your data loading logic
    return np.array([1, 2, 3, 4, 5])
# CO_DATASCIENTIST_BLOCK_END
model.py:
import numpy as np
# CO_DATASCIENTIST_BLOCK_START
def train_model(data):
    # Your model training logic
    return np.mean(data) * 10
# CO_DATASCIENTIST_BLOCK_END
run.sh:
#!/bin/bash
python main.py
Test your project:
bash run.sh # Should print "KPI: 30.0"
Note: Co-DataScientist auto-detects run.sh, main.py, or run.py. For custom entry points, use the simple syntax: co-datascientist run bash your_script.sh
3. Set your API Token (one time only!)
Before running any commands, you need to set your Co-DataScientist API token. You only need to do this once per machine.
co-datascientist set-token --token <YOUR_TOKEN>
4. Run Co-DataScientist
Simple syntax - just wrap your normal command:
# Single file
co-datascientist run python xor.py
# Multi-file project (auto-detects run.sh or main.py)
cd my_ml_project
co-datascientist run python main.py
With options (put them BEFORE the command):
# Run 3 versions in parallel
co-datascientist run --parallel 3 python xor.py
# Skip interactive Q&A with cached answers
co-datascientist run --use-cached-qa python main.py
# Complex commands with arguments
co-datascientist run --parallel 5 python train.py --epochs 100 --lr 0.001
# Use -- separator if needed for clarity
co-datascientist run --parallel 3 -- bash train.sh --gpu --batch-size 32
With config file for local or cloud execution:
# Local execution with volume mounting
co-datascientist run --config config.yaml
# Cloud execution (GCloud, AWS, Databricks)
co-datascientist run --config gcloud_config.yaml
Pro tips:
- Use --parallel N to evolve multiple code versions simultaneously!
- Auto-detects run.sh, main.py, or run.py in your project root
- Your data never leaves your machine - everything runs locally
Watch your KPI improve over generations. You'll find optimized checkpoints in the co_datascientist_checkpoints/ directory.
Each checkpoint is saved as a separate directory containing all evolved files:
co_datascientist_checkpoints/
└── best_0_baseline/
├── main.py
├── data_loader.py
├── model.py
└── metadata.json # Contains KPI and other info
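If you want to script over checkpoints yourself, the metadata.json files can be ranked programmatically. A minimal sketch, assuming each metadata.json exposes the score under a "kpi" key (check your own files for the actual field name):

```python
import json
from pathlib import Path

def best_checkpoint(root="co_datascientist_checkpoints"):
    """Return (directory, kpi) for the highest-KPI checkpoint, or None.

    Assumes each checkpoint directory holds a metadata.json whose "kpi"
    field is the score -- this field name is an assumption, not documented.
    """
    best = None
    for meta_path in Path(root).glob("*/metadata.json"):
        kpi = json.loads(meta_path.read_text()).get("kpi")
        if kpi is not None and (best is None or kpi > best[1]):
            best = (meta_path.parent, kpi)
    return best
```

For example, `best_checkpoint()[0]` would give you the directory to pass to `co-datascientist deploy`.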
5. Deploy Your Best Checkpoint
Once you find a checkpoint you like, deploy it as a complete, ready-to-use project:
# Deploy the best checkpoint
co-datascientist deploy co_datascientist_checkpoints/best_6_explore/
# Or with a custom name
co-datascientist deploy best_6_explore/ --output-dir my_optimized_pipeline
This creates a complete copy of your project with the evolved code integrated, ready to use:
my_ml_project_deployed_best_6_explore_20251002_123456/
├── main.py # Evolved! ✨
├── data_loader.py # Evolved! ✨
├── model.py # Evolved! ✨
├── run.sh # Original
├── data/ # Original
├── configs/ # Original
└── deployment_info.json # Checkpoint metadata
The deployed project is completely standalone - just copy it to production and run it!
Yes, it's that simple
Try it on your toughest problem and see how your KPI improves.
Co-DataScientist helps you get better results—no matter how big your challenge.
Important Notes About Your Input Script
KPI Tagging
Co-DataScientist scans your stdout for the pattern KPI: <number> — that’s the metric it maximizes. Use anything: accuracy, F1, revenue per click, unicorns-per-second… you name it!
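To verify locally that your script emits a metric the scanner can find, a pattern like this works. It is illustrative only; the exact regex Co-DataScientist uses internally is not documented here:

```python
import re

# Illustrative pattern for "KPI: <number>" (plain, signed, or scientific notation).
KPI_RE = re.compile(r"KPI:\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")

def extract_kpi(stdout: str):
    """Return the last 'KPI: <number>' value found in captured stdout, or None."""
    matches = KPI_RE.findall(stdout)
    return float(matches[-1]) if matches else None
```

For instance, `extract_kpi("epoch 3 done\nKPI: 0.8750\n")` returns 0.875, while output with no tagged metric returns None.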
📁 Data Access Patterns
Option 1: Hardcoded Paths (Simple)
For quick local runs, hardcode data paths directly in your script:
data = np.loadtxt("/full/path/to/my_data.csv")
Option 2: Volume Mounting (Recommended for Docker)
For Docker-based execution, use environment variables with volume mounting:
config.yaml:
mode: "local"
data_volume: "/path/to/your/data" # Host directory containing data
Your code:
import os
INPUT_URI = os.environ.get("INPUT_URI", "/default/path")
data = pd.read_csv(os.path.join(INPUT_URI, "train.csv"))
When you run with --config config.yaml, Co-DataScientist automatically:
- Mounts your data directory into the Docker container
- Sets the INPUT_URI environment variable
- Your code accesses data without hardcoding paths
Note: Avoid input() or interactive prompts - Co-DataScientist needs to run your code automatically.
🧬 Blocks to evolve
As you saw in the XOR example, Co-DataScientist uses # CO_DATASCIENTIST_BLOCK_START and # CO_DATASCIENTIST_BLOCK_END tags to identify the parts of the system you want it to improve. Make sure to tag the parts of your system you care about improving! This helps Co-DataScientist stay focused on its job.
📂 Project Structure
Co-DataScientist supports both single-file scripts and multi-file projects:
- Single File: co-datascientist run python my_script.py
- Multi-File: Auto-detects run.sh, main.py, or run.py in your project root
- Custom Entry Point: Just wrap your command: co-datascientist run bash custom_script.sh
The system automatically detects which files contain CO_DATASCIENTIST_BLOCK markers and evolves them intelligently.
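The detection step can be approximated in a few lines, assuming a file qualifies whenever both markers appear in it; the tool's real scan logic may be stricter:

```python
from pathlib import Path

START = "# CO_DATASCIENTIST_BLOCK_START"
END = "# CO_DATASCIENTIST_BLOCK_END"

def files_with_blocks(project_root="."):
    """List .py files under project_root containing a complete marker pair.

    A rough approximation of the scanning step -- the actual detection
    logic inside Co-DataScientist may differ.
    """
    hits = []
    for path in sorted(Path(project_root).rglob("*.py")):
        text = path.read_text(errors="ignore")
        if START in text and END in text:
            hits.append(path)
    return hits
```

Running this over the multi-file example above would flag main.py, data_loader.py, and model.py, but not run.sh.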
Add Domain-Specific Notes for Best Results
After your code, add comments with any extra context, known issues, or ideas you have about your problem. This helps Co-DataScientist understand your goals and constraints! The Co-Datascientist UNDERSTANDS your problem. It's not just doing a blind search!
🎯 How Multi-File Evolution Works
When you run Co-DataScientist on a multi-file project:
- Scanning: It scans all .py files in your project for CO_DATASCIENTIST_BLOCK markers
- Selection: Each generation, it randomly picks ONE file to evolve
- Evolution: The AI generates hypotheses and modifies the selected block
- Stitching: Modified code is integrated back into your full project
- Testing: Your entire project runs with the new code using your run.sh or custom command
- Checkpointing: Best results are saved as complete directories with all files
This means:
- ✅ You can have complex multi-file ML pipelines
- ✅ Each file evolves independently but is tested as a complete system
- ✅ Your project structure is preserved
- ✅ Dependencies between files are maintained
Example Evolution Flow:
Generation 1: Evolve model.py → Test full project → KPI: 30.0
Generation 2: Evolve data_loader.py → Test full project → KPI: 45.0
Generation 3: Evolve main.py → Test full project → KPI: 60.0
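The flow above can be sketched as a plain Python loop. Here `mutate` and `evaluate` are hypothetical stand-ins for the AI rewrite step and for running your project end to end; the real tool does considerably more:

```python
import random

def evolve(files, mutate, evaluate, generations=5, seed=0):
    """Toy sketch of the select -> mutate -> test -> checkpoint loop.

    files: mapping of filename -> marked-block source.
    mutate: stand-in for the AI step (rewrites one block).
    evaluate: stand-in for running the full project and reading its KPI.
    """
    rng = random.Random(seed)
    project = dict(files)
    best = (None, float("-inf"))
    for gen in range(generations):
        target = rng.choice(sorted(project))         # Selection: ONE file per generation
        candidate = dict(project)
        candidate[target] = mutate(project[target])  # Evolution: rewrite the marked block
        kpi = evaluate(candidate)                    # Stitching + Testing: run the whole project
        if kpi > best[1]:
            best = (candidate, kpi)                  # Checkpointing: keep the best so far
            project = candidate
    return best
```

Note that `evaluate` always sees the complete project, which is why each file evolves independently yet is tested as a whole system.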
Other helpful stuff
Skip Q&A on Repeat Runs
For faster iterations, use cached answers from your previous run:
co-datascientist run --use-cached-qa python xor.py
This skips the interactive questions and uses your previous answers, jumping straight to the optimization process.
Deploy Checkpoints to Production
The deploy command makes it easy to take your best checkpoint and create a production-ready project:
# Basic usage - auto-detects original project location
co-datascientist deploy co_datascientist_checkpoints/best_6_explore/
# Specify original project path manually
co-datascientist deploy best_6_explore/ --original-path /path/to/my_project
# Use custom output directory name
co-datascientist deploy best_6_explore/ --output-dir my_optimized_v2
What it does:
- Copies your entire original project (including data, configs, assets)
- Integrates the evolved code from the checkpoint
- Excludes Co-DataScientist artifacts (checkpoints, cache, etc.)
- Creates a deployment_info.json with checkpoint metadata
The result is a complete, standalone project ready to deploy to production!
📝 Before vs After
Before (KPI ≈ 0.50):
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score
import numpy as np
# XOR data
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])
pipeline = Pipeline([
('scaler', StandardScaler()),
('clf', RandomForestClassifier(n_estimators=10, random_state=0))
])
pipeline.fit(X, y)
preds = pipeline.predict(X)
accuracy = accuracy_score(y, preds)
print(f'Accuracy: {accuracy:.2f}')
print(f'KPI: {accuracy:.4f}')
After (KPI = 1.00):
import numpy as np
from sklearn.base import TransformerMixin, BaseEstimator
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score
from tqdm import tqdm
class ChebyshevPolyExpansion(BaseEstimator, TransformerMixin):
    def __init__(self, degree=3):
        self.degree = degree

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        X = np.asarray(X)
        X_scaled = 2 * X - 1
        n_samples, n_features = X_scaled.shape
        features = []
        for f in tqdm(range(n_features), desc='Chebyshev features'):
            x = X_scaled[:, f]
            T = np.empty((self.degree + 1, n_samples))
            T[0] = 1
            if self.degree >= 1:
                T[1] = x
            for d in range(2, self.degree + 1):
                T[d] = 2 * x * T[d - 1] - T[d - 2]
            features.append(T.T)
        return np.hstack(features)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])
pipeline = Pipeline([
('cheb', ChebyshevPolyExpansion(degree=3)),
('scaler', StandardScaler()),
('clf', RandomForestClassifier(n_estimators=10, random_state=0))
])
pipeline.fit(X, y)
preds = pipeline.predict(X)
accuracy = accuracy_score(y, preds)
print(f'Accuracy: {accuracy:.2f}')
print(f'KPI: {accuracy:.4f}')
We now support Databricks
Databricks setup
- Download the databricks CLI package
curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sudo sh
- Get a Databricks token and verify that the CLI works
- Prepare a config file with your compute/environment requirements in databricks_config.yaml (example below)
# Enable Databricks integration
databricks: true

# Databricks configuration for XOR demo
databricks:
  cli: "databricks"  # databricks CLI command (optional, defaults to "databricks")
  volume_uri: "dbfs:/Volumes/workspace/default/volume"  # DBFS volume URI for file uploads
  code_path: "dbfs:/Volumes/workspace/default/volume/xor.py"  # Specific code path (optional, overrides volume_uri + temp filename)
  timeout: "30m"  # Job timeout duration
  job:
    name: "run-<script-stem>-<timestamp>"  # Job name template (supports <script-stem> and <timestamp>)
    tasks:
      - task_key: "t"
        spark_python_task:
          python_file: "<remote_path>"  # Will be automatically replaced with actual remote path
        environment_key: "default"
    environments:
      - environment_key: "default"
        spec:
          client: "1"
          dependencies:
            - "scikit-learn>=1.0.0"
            - "numpy>=1.20.0"
Then run Co-DataScientist with:
co-datascientist run --config databricks_config.yaml
Your new optimized model checkpoints will be saved in dbfs:/Volumes/workspace/default/volume/co-datascientist-checkpoints
🐳 Local Docker Execution with Volume Mounting
Run your code in Docker containers locally with automatic data volume mounting. Perfect for reproducible environments and large datasets.
Setup
- Create a config file (e.g., config.yaml):
mode: "local"
data_volume: "/absolute/path/to/your/data" # Host directory with your data files
parallel: 1 # Number of parallel executions
- Update your code to use environment variables:
import os
import pandas as pd
# Co-DataScientist automatically sets INPUT_URI to /data in the container
INPUT_URI = os.environ.get("INPUT_URI")
df = pd.read_csv(os.path.join(INPUT_URI, "train.csv"))
- Add a Dockerfile to your project:
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "your_script.py"]
- Run Co-DataScientist:
co-datascientist run --working-directory . --config config.yaml
What Happens
Co-DataScientist will:
- Build a Docker image from your project
- Mount your data_volume directory to /data inside the container
- Set the INPUT_URI=/data environment variable automatically
- Execute your code in the container with access to your data
- Extract KPIs and manage the evolution process
Benefits
- Reproducible: Same environment every time
- Isolated: Dependencies don't conflict with your system
- Scalable: Easy to move to cloud later with minimal changes
- Clean: No need to copy large datasets into Docker images
📖 See the complete demo: /demos/docker_demo/
☁️ Google Cloud Run Jobs Integration
Execute and optimize your Python code at scale using Google Cloud Run Jobs.
Setup
- Prerequisites:
  - Google Cloud project with Cloud Run enabled
  - Authenticated gcloud CLI: gcloud auth login
  - A Cloud Run Job template (e.g., test-job-clean)
- Create a config file (e.g., gcloud_config.yaml):
mode: "gcloud"
project_id: "my-gcp-project"
region: "us-central1"
repo: "co-datascientist-repo"
job_name: "co-datascientist-job"
cleanup_job: false
cleanup_remote_image: true
# Optional: Mount GCS bucket for data access
data_volume: "gs://my-data-bucket" # or just "my-data-bucket"
For GCS volume mounting:
- Specify your GCS bucket in data_volume (with or without the gs:// prefix)
- Your bucket will be mounted to /data inside the Cloud Run container
- The INPUT_URI=/data environment variable is automatically set
- Your code accesses data the same way as local Docker execution
Your code:
import os
INPUT_URI = os.environ.get("INPUT_URI")
df = pd.read_csv(os.path.join(INPUT_URI, "data.csv"))
- Run Co-DataScientist:
co-datascientist run --config gcloud_config.yaml
Your code will be executed in Google Cloud Run Jobs, with results and KPIs retrieved automatically. Perfect for scaling compute-intensive optimizations!
Important Notes for GCS Volume Mounting:
- Requires Cloud Run 2nd generation execution environment
- Your Cloud Run service account needs storage.objectViewer permission on the bucket
- GCS buckets are mounted read-only via Cloud Storage FUSE
- Files appear as regular files in the container at /data
📖 See the complete demo: /demos/gcloud/
☁️ AWS ECS Fargate Integration
Execute and optimize your Python code at scale using AWS ECS Fargate.
Setup
- Prerequisites:
  - AWS account with ECS Fargate enabled
  - Authenticated AWS CLI: aws configure
  - An ECS cluster and task definition configured for your needs
- Create a config file (e.g., aws_config.yaml):
aws:
  enabled: true
  script_path: "/path/to/your/script.py"
  cluster: "my-cluster"
  task_definition: "my-job-taskdef"
  launch_type: "FARGATE"
  region: "us-east-1"
  network_configuration:
    subnets: ["subnet-abc123", "subnet-def456"]
    security_groups: ["sg-123456"]
    assign_public_ip: "ENABLED"
  timeout: 1800  # seconds
- Run Co-DataScientist:
co-datascientist run --config aws_config.yaml
Your code will be executed in AWS ECS Fargate containers, with results and KPIs retrieved automatically. Perfect for serverless compute scaling!
Analysis and Visualization Tools
Co-DataScientist includes built-in visualization tools to help you analyze your optimization results and compare different versions of your code.
Plot KPI Progression
Visualize how your KPI improves over iterations from checkpoint JSON files:
# Basic usage - plot KPI progression from checkpoints directory
co-datascientist plot-kpi --checkpoints-dir /path/to/co_datascientist_checkpoints
# Advanced usage with custom options
co-datascientist plot-kpi \
--checkpoints-dir /path/to/checkpoints \
--max-iteration 350 \
--title "AUC Training Progress" \
--kpi-label "AUC" \
--output my_kpi_plot.png
Options:
- --checkpoints-dir, -c: Directory containing checkpoint JSON files (required)
- --max-iteration, -m: Maximum iteration to include in plot
- --title, -t: Custom title for the plot
- --output, -o: Output file path (auto-generated if not specified)
- --kpi-label, -k: Label for the KPI metric (default: "RMSE")
Generate PDF Code Diffs
Create beautiful PDF reports comparing two versions of your Python code:
# Basic usage - compare two Python files
co-datascientist diff-pdf baseline.py improved.py
# Advanced usage with custom options
co-datascientist diff-pdf \
baseline.py \
optimized.py \
--output "optimization_report.pdf" \
--title "XOR Problem Optimization Results"
Options:
- file1: Path to the baseline/original file (required)
- file2: Path to the modified/new file (required)
- --output, -o: Output PDF file path (auto-generated if not specified)
- --title, -t: Custom title for the diff report
Example workflow:
# 1. Run optimization
co-datascientist run --parallel 3 python xor.py
# 2. Plot the KPI progression
co-datascientist plot-kpi --checkpoints-dir co_datascientist_checkpoints --title "XOR Optimization"
# 3. Compare best result with baseline
co-datascientist diff-pdf xor.py co_datascientist_checkpoints/best_iteration_50.py --title "XOR Improvements"
These tools help you understand your optimization journey and create professional reports showing the improvements Co-DataScientist achieved.
Need help?
We’d love to chat: oz.kilim@tropiflo.io
All set? Run your pipelines and track the results.
⚠️ Disclaimer: Co-DataScientist executes your scripts on your own machine. Make sure you trust the code you feed it!
Made by the Tropiflo team