LLM-powered AutoML — model selection, tuning, and training via any LLM

ModelGPT

LLM-powered AutoML. Describe your dataset, pick a task — ModelGPT asks a local LLM to design, tune, and fit the best model for you.

from modelgpt import ModelGPT

mg = ModelGPT()
model = mg.fit(X_train, y_train, task="regression")
predictions = model.predict(X_test)

How it works

Your data  ──▶  Dataset summary  ──▶  LLM prompt
                                          │
                                          ▼
                                 LLM generates Python
                                 (ensemble + CV tuning)
                                          │
                                          ▼
                                   exec() in sandbox
                                          │
                             ┌────────────┴────────────┐
                             ▼                         ▼
                       Fitted model           Error? retry (→ LLM
                       returned               sees error, self-corrects)
                                                       │
                                              Max retries hit?
                                                       ▼
                                              Safe fallback model

ModelGPT feeds the LLM a structured dataset summary (shape, dtypes, missing values, target statistics, class distribution). The LLM returns raw Python that uses cross-validated hyperparameter search and an ensemble method. That code runs in a controlled namespace with X and y already bound. If it fails, the error is sent back to the LLM for self-correction — up to max_retries times.
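The generate → execute → retry loop above can be sketched in a few lines. This is a simplified illustration, not ModelGPT's actual internals: `ask_llm` is a hypothetical stand-in for the real Ollama call, and the prompt wording is invented.

```python
def ask_llm(prompt):
    # Placeholder: the real implementation queries a local LLM via Ollama.
    # Here we return trivial "generated code" for illustration.
    return "model = sum(y) / len(y)"

def fit_with_retries(X, y, max_retries=3):
    prompt = f"Dataset summary: {len(X)} rows. Write Python that sets `model`."
    for attempt in range(max_retries):
        code = ask_llm(prompt)
        namespace = {"X": X, "y": y}   # controlled namespace, X and y pre-bound
        try:
            exec(code, namespace)      # run the generated code
            return namespace["model"]  # fitted model pulled back out
        except Exception as err:
            # Feed the error back so the next attempt can self-correct
            prompt += f"\nPrevious attempt failed with: {err!r}"
    return None  # caller falls back to a safe default model

model = fit_with_retries([[1], [2], [3]], [1.0, 2.0, 3.0])
```

The key idea is that `X` and `y` already exist in the namespace the code runs in, so the LLM only has to emit training logic, not data loading.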


Installation

# 1. Clone
git clone https://github.com/yourname/modelGPT.git
cd modelGPT

# 2. Install Python dependencies
pip install pandas scikit-learn xgboost lightgbm catboost ollama

# 3. Install and start Ollama  (https://ollama.com)
ollama pull qwen3-vl:235b-cloud       # or any model you prefer
ollama serve

Project structure

modelGPT/
├── modelgpt/
│   └── modelgpt.py          # Core ModelGPT class
└── example/
    └── example_usage.py     # Quickstart demo (diabetes dataset)

API reference

ModelGPT(llm_model, max_retries, verbose)

Parameter     Type   Default                Description
llm_model     str    "qwen3-vl:235b-cloud"  Ollama model tag to use
max_retries   int    3                      Retries allowed when generated code fails to execute
verbose       bool   True                   Print generated code and progress to stdout

.fit(X, y, task, metric)

Parameter   Type           Default        Description
X           pd.DataFrame   (required)     Feature matrix
y           pd.Series      (required)     Target vector
task        str            "regression"   "regression" or "classification"
metric      str | None     None           Metric hint passed to the LLM (e.g. "RMSE", "AUC")

Returns a fitted sklearn-compatible model with a .predict() method.
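"Sklearn-compatible" here means duck typing: any object exposing `.predict()` (and usually `.fit()`) works downstream. A minimal hypothetical example of that contract, using a mean predictor rather than anything ModelGPT actually returns:

```python
class MeanModel:
    """Illustrative model satisfying the minimal fit/predict contract."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)  # trailing underscore: sklearn convention
        return self                   # fit returns self, also per convention

    def predict(self, X):
        return [self.mean_ for _ in X]

m = MeanModel().fit([[0], [1]], [2.0, 4.0])
preds = m.predict([[5], [6]])
```

Because only the interface matters, the object ModelGPT hands back can slot into any pipeline that expects a fitted estimator.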


Choosing a model

ModelGPT works with any model available in Ollama. Larger, code-focused models generally generate better training code:

mg = ModelGPT(llm_model="qwen2.5-coder:32b")

Tips for better results

Tell the LLM which metric matters. The metric argument is forwarded directly into the prompt:

model = mg.fit(X_train, y_train, task="regression", metric="RMSE")
model = mg.fit(X_train, y_train, task="classification", metric="AUC")
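Conceptually, forwarding the metric just means interpolating it into the prompt text. A rough sketch, with hypothetical wording that is not the library's actual prompt template:

```python
def build_prompt(task, metric=None):
    # Base instruction always present; metric hint appended only when given.
    prompt = f"Write Python to train the best {task} model on X and y."
    if metric:
        prompt += f" Optimize for {metric} using cross-validated search."
    return prompt

p = build_prompt("regression", metric="RMSE")
```

Passing a metric narrows the LLM's search: "best model" is ambiguous, "best RMSE" is not.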

Increase retries for hard tasks. If the LLM frequently generates broken code, raise max_retries:

mg = ModelGPT(max_retries=5)

Inspect what was generated. With verbose=True (the default) the full generated code is printed. You can copy it, tweak it, and re-run it manually.

Fallback is always safe. If every retry fails, ModelGPT automatically falls back to a well-tuned GradientBoostingRegressor / GradientBoostingClassifier, so .fit() never crashes your pipeline.
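The never-crash guarantee boils down to a try/except around the generated path with a known-good default behind it. A minimal sketch of the pattern (names are illustrative, not ModelGPT's internals):

```python
def fit_or_fallback(generated_fit, fallback_fit):
    try:
        return generated_fit()   # attempt the LLM-generated training code
    except Exception:
        return fallback_fit()    # guaranteed-safe default, e.g. gradient boosting

def broken_fit():
    raise RuntimeError("generated code failed")

model = fit_or_fallback(broken_fit, lambda: "safe-fallback-model")
```

Catching broadly here is deliberate: generated code can fail in arbitrary ways, and the whole point is that the caller always gets a usable model back.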


Requirements

  • Python ≥ 3.10
  • Ollama running locally (ollama serve)
  • pandas, scikit-learn, ollama
  • Optional but recommended: xgboost, lightgbm, catboost

License

MIT
