Skip to main content

An agentic framework for building ML models from natural language

Project description

plexe ✨

PyPI version Discord

backed-by-yc

Build machine learning models using natural language.

Quickstart | Features | Installation | Documentation


plexe lets you create machine learning models by describing them in plain language. Simply explain what you want, provide a dataset, and the AI-powered system builds a fully functional model through an automated agentic approach. Also available as a managed cloud service.


Watch the demo on YouTube: Building an ML model with Plexe

1. Quickstart

Installation

pip install plexe
export OPENAI_API_KEY=<your-key>
export ANTHROPIC_API_KEY=<your-key>

Using plexe

Provide a tabular dataset (Parquet, CSV, ORC, or Avro) and a natural language intent:

python -m plexe.main \
    --train-dataset-uri data.parquet \
    --intent "predict whether a passenger was transported" \
    --max-iterations 5
from plexe.main import main
from pathlib import Path

best_solution, metrics, report = main(
    intent="predict whether a passenger was transported",
    data_refs=["train.parquet"],
    max_iterations=5,
    work_dir=Path("./workdir"),
)
print(f"Performance: {best_solution.performance:.4f}")

2. Features

2.1. 🤖 Multi-Agent Architecture

The system uses 14 specialized AI agents across a 6-phase workflow to:

  • Analyze your data and identify the ML task
  • Select the right evaluation metric
  • Search for the best model through hypothesis-driven iteration
  • Evaluate model performance and robustness
  • Package the model for deployment

2.2. 🎯 Automated Model Building

Build complete models with a single call. Plexe supports XGBoost, CatBoost, LightGBM, Keras, and PyTorch for tabular data:

best_solution, metrics, report = main(
    intent="predict house prices based on property features",
    data_refs=["housing.parquet"],
    max_iterations=10,                    # Search iterations
    allowed_model_types=["xgboost"],      # Or let plexe choose
    enable_final_evaluation=True,         # Evaluate on held-out test set
)

Run python -m plexe.main --help for all CLI options.

The output is a self-contained model package at work_dir/model/ (also archived as model.tar.gz). The package has no dependency on plexe — build the model with plexe, deploy it anywhere:

model/
├── artifacts/          # Trained model + feature pipeline (pickle)
├── src/                # Inference predictor, pipeline code, training template
├── schemas/            # Input/output JSON schemas
├── config/             # Hyperparameters
├── evaluation/         # Metrics and detailed analysis reports
├── model.yaml          # Model metadata
└── README.md           # Usage instructions with example code

2.3. 🐳 Batteries-Included Docker Images

Run plexe with everything pre-configured — PySpark, Java, and all dependencies included. A Makefile is provided for common workflows:

make build          # Build the Docker image
make test-quick     # Fast sanity check (~1 iteration)
make run-titanic    # Run on Spaceship Titanic dataset

Or run directly:

docker run --rm \
    -e OPENAI_API_KEY=$OPENAI_API_KEY \
    -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
    -v $(pwd)/data:/data -v $(pwd)/workdir:/workdir \
    plexe:py3.12 python -m plexe.main \
        --train-dataset-uri /data/dataset.parquet \
        --intent "predict customer churn" \
        --work-dir /workdir \
        --spark-mode local

A config.yaml in the project root is automatically mounted. A Databricks Connect image is also available: docker build --target databricks .

2.4. ⚙️ YAML Configuration

Customize LLM routing, search parameters, Spark settings, and more via a config file:

# config.yaml
max_search_iterations: 5
allowed_model_types: [xgboost, catboost]
spark_driver_memory: "4g"
hypothesiser_llm: "openai/gpt-5-mini"
feature_processor_llm: "anthropic/claude-sonnet-4-5-20250929"
CONFIG_FILE=config.yaml python -m plexe.main ...

See config.yaml.template for all available options.

2.5. 🌐 Multi-Provider LLM Support

Plexe uses LLMs via LiteLLM, so you can use any supported provider:

# Route different agents to different providers
hypothesiser_llm: "openai/gpt-5-mini"
feature_processor_llm: "anthropic/claude-sonnet-4-5-20250929"
model_definer_llm: "ollama/llama3"

[!NOTE] Plexe should work with most LiteLLM providers, but we actively test only with openai/* and anthropic/* models. If you encounter issues with other providers, please let us know.

2.6. 📊 Experiment Dashboard

Visualize experiment results, search trees, and evaluation reports with the built-in Streamlit dashboard:

python -m plexe.viz --work-dir ./workdir

2.7. 🔌 Extensibility

Connect plexe to custom storage, tracking, and deployment infrastructure via the WorkflowIntegration interface:

main(intent="...", data_refs=[...], integration=MyCustomIntegration())

See plexe/integrations/base.py for the full interface.

3. Installation

3.1. Installation Options

pip install plexe                    # Core (XGBoost, Keras, scikit-learn)

You can add optional dependencies either by framework or by task grouping:

  • Framework extras: catboost, lightgbm, pytorch
  • Task extras: tabular (CatBoost + LightGBM), vision (PyTorch)
  • Platform extras: pyspark, aws

Examples:

pip install "plexe[tabular,pyspark]"   # tabular stack + local PySpark
pip install "plexe[pytorch,aws]"       # explicit framework + S3 support

Requires Python >= 3.10, < 3.13.

3.2. API Keys

export OPENAI_API_KEY=<your-key>
export ANTHROPIC_API_KEY=<your-key>

See LiteLLM providers for all supported providers.

4. Documentation

For full documentation, visit docs.plexe.ai.

5. Contributing

See CONTRIBUTING.md for guidelines. Join our Discord to connect with the team.

6. License

Apache-2.0 License

7. Citation

If you use Plexe in your research, please cite it as follows:

@software{plexe2025,
  author = {De Bernardi, Marcello AND Dubey, Vaibhav},
  title = {Plexe: Build machine learning models using natural language.},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/plexe-ai/plexe}},
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

plexe-1.3.5.tar.gz (179.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

plexe-1.3.5-py3-none-any.whl (230.4 kB view details)

Uploaded Python 3

File details

Details for the file plexe-1.3.5.tar.gz.

File metadata

  • Download URL: plexe-1.3.5.tar.gz
  • Upload date:
  • Size: 179.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.2 CPython/3.12.12 Linux/6.14.0-1017-azure

File hashes

Hashes for plexe-1.3.5.tar.gz
Algorithm Hash digest
SHA256 4ee804a12c699ce6e5a753464af2738bb04706981b807f72ec057a5fcc90f3ef
MD5 a655d44f9a671fb87cc4ab7aa1a5c106
BLAKE2b-256 2dbf55408703a971aba5536c88689ea1b143b7920add8c54f79db0587750872f

See more details on using hashes here.

File details

Details for the file plexe-1.3.5-py3-none-any.whl.

File metadata

  • Download URL: plexe-1.3.5-py3-none-any.whl
  • Upload date:
  • Size: 230.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.2 CPython/3.12.12 Linux/6.14.0-1017-azure

File hashes

Hashes for plexe-1.3.5-py3-none-any.whl
Algorithm Hash digest
SHA256 2bcd7ddc78a7165965ecd9c682d57a503a7fc35d984df5a60b46c4e2b2030927
MD5 98b9e42c1deb5868dc986d856bc18a2b
BLAKE2b-256 2dbae996fbb958ab6b130cf9d75ad747996ad201daeeda24c4e9b1871b767c97

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page