An agentic framework for building ML models from natural language

plexe ✨

Build machine learning models using natural language.

Quickstart | Features | Installation | Documentation


plexe lets you create machine learning models by describing them in plain language. Explain what you want and provide a dataset, and the AI-powered system builds a working model through an automated, agent-driven workflow. plexe is also available as a managed cloud service.


Watch the demo on YouTube: Building an ML model with Plexe

1. Quickstart

Installation

pip install plexe
export OPENAI_API_KEY=<your-key>
export ANTHROPIC_API_KEY=<your-key>

Using plexe

Provide a tabular dataset (Parquet, CSV, ORC, or Avro) and a natural language intent:

python -m plexe.main \
    --train-dataset-uri data.parquet \
    --intent "predict whether a passenger was transported" \
    --max-iterations 5

Or call it from Python:

from plexe.main import main
from pathlib import Path

best_solution, metrics, report = main(
    intent="predict whether a passenger was transported",
    data_refs=["train.parquet"],
    max_iterations=5,
    work_dir=Path("./workdir"),
)
print(f"Performance: {best_solution.performance:.4f}")

2. Features

2.1. 🤖 Multi-Agent Architecture

The system uses 14 specialized AI agents across a 6-phase workflow to:

  • Analyze your data and identify the ML task
  • Select the right evaluation metric
  • Search for the best model through hypothesis-driven iteration
  • Evaluate model performance and robustness
  • Package the model for deployment

2.2. 🎯 Automated Model Building

Build complete models with a single call. Plexe supports XGBoost, CatBoost, and Keras for tabular data:

from plexe.main import main

best_solution, metrics, report = main(
    intent="predict house prices based on property features",
    data_refs=["housing.parquet"],
    max_iterations=10,                    # Search iterations
    allowed_model_types=["xgboost"],      # Or let plexe choose
    enable_final_evaluation=True,         # Evaluate on held-out test set
)

Run python -m plexe.main --help for all CLI options.

The output is a self-contained model package at work_dir/model/ (also archived as model.tar.gz). The package has no dependency on plexe — build the model with plexe, deploy it anywhere:

model/
├── artifacts/          # Trained model + feature pipeline (pickle)
├── src/                # Inference predictor, pipeline code, training template
├── schemas/            # Input/output JSON schemas
├── config/             # Hyperparameters
├── evaluation/         # Metrics and detailed analysis reports
├── model.yaml          # Model metadata
└── README.md           # Usage instructions with example code
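As a quick sanity check after a run, the layout above can be verified from Python. This is a minimal sketch using only the standard library: `missing_entries` and `load_schemas` are hypothetical helper names (not part of plexe), and the file names inside schemas/ are not documented here, so the loader simply reads every `*.json` file it finds.

```python
import json
from pathlib import Path

# Top-level entries copied from the package layout listed above.
EXPECTED_ENTRIES = [
    "artifacts", "src", "schemas", "config", "evaluation", "model.yaml", "README.md",
]


def missing_entries(package_dir: Path) -> list[str]:
    """Return the expected top-level entries that are absent from the package."""
    return [name for name in EXPECTED_ENTRIES if not (package_dir / name).exists()]


def load_schemas(package_dir: Path) -> dict[str, dict]:
    """Load every *.json file under schemas/, keyed by file name without extension."""
    schemas = {}
    for path in sorted((package_dir / "schemas").glob("*.json")):
        schemas[path.stem] = json.loads(path.read_text(encoding="utf-8"))
    return schemas
```

For example, `missing_entries(Path("./workdir/model"))` returns an empty list when the package is complete.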

2.3. 🐳 Batteries-Included Docker Images

Run plexe with everything pre-configured — PySpark, Java, and all dependencies included. A Makefile is provided for common workflows:

make build          # Build the Docker image
make test-quick     # Fast sanity check (~1 iteration)
make run-titanic    # Run on Spaceship Titanic dataset

Or run directly:

docker run --rm \
    -e OPENAI_API_KEY="$OPENAI_API_KEY" \
    -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
    -v "$(pwd)/data:/data" -v "$(pwd)/workdir:/workdir" \
    plexe:py3.12 python -m plexe.main \
        --train-dataset-uri /data/dataset.parquet \
        --intent "predict customer churn" \
        --work-dir /workdir \
        --spark-mode local

A config.yaml in the project root is automatically mounted. A Databricks Connect image is also available: docker build --target databricks .

2.4. ⚙️ YAML Configuration

Customize LLM routing, search parameters, Spark settings, and more via a config file:

# config.yaml
max_search_iterations: 5
allowed_model_types: [xgboost, catboost]
spark_driver_memory: "4g"
hypothesiser_llm: "openai/gpt-5-mini"
feature_processor_llm: "anthropic/claude-sonnet-4-5-20250929"

Then point plexe at the file with the CONFIG_FILE environment variable:

CONFIG_FILE=config.yaml python -m plexe.main ...

See config.yaml.template for all available options.
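When launching plexe from a script, the same CONFIG_FILE mechanism can be applied per run rather than exported globally. The sketch below assumes only the CLI flags shown in this README; `build_plexe_command` and `run_with_config` are hypothetical helper names, not part of plexe.

```python
import os
import subprocess
import sys


def build_plexe_command(
    train_uri: str, intent: str, work_dir: str, max_iterations: int = 5
) -> list[str]:
    """Assemble the CLI invocation using only flags shown in this README."""
    return [
        sys.executable, "-m", "plexe.main",
        "--train-dataset-uri", train_uri,
        "--intent", intent,
        "--max-iterations", str(max_iterations),
        "--work-dir", work_dir,
    ]


def run_with_config(command: list[str], config_file: str) -> int:
    """Run the command with CONFIG_FILE set for the child process only."""
    env = {**os.environ, "CONFIG_FILE": config_file}
    return subprocess.run(command, env=env, check=False).returncode
```

Passing the arguments as a list avoids shell quoting problems when the intent contains spaces or quotes.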

2.5. 🌐 Multi-Provider LLM Support

Plexe uses LLMs via LiteLLM, so you can use any supported provider:

# Route different agents to different providers
hypothesiser_llm: "openai/gpt-5-mini"
feature_processor_llm: "anthropic/claude-sonnet-4-5-20250929"
model_definer_llm: "ollama/llama3"

Note: Plexe should work with most LiteLLM providers, but we actively test only with openai/* and anthropic/* models. If you encounter issues with other providers, please let us know.
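Model identifiers in the config follow the "provider/model" form, so it is easy to see which providers a given routing depends on. A minimal sketch, assuming the config has already been loaded into a plain dict; `agents_by_provider` is a hypothetical helper, not a plexe function.

```python
from collections import defaultdict


def agents_by_provider(llm_config: dict[str, str]) -> dict[str, list[str]]:
    """Group config keys by the provider prefix of their model identifier."""
    grouped = defaultdict(list)
    for agent_key, model_id in llm_config.items():
        provider, sep, _model = model_id.partition("/")
        # Identifiers without a "provider/" prefix are grouped separately.
        grouped[provider if sep else "(no prefix)"].append(agent_key)
    return dict(grouped)


routing = {
    "hypothesiser_llm": "openai/gpt-5-mini",
    "feature_processor_llm": "anthropic/claude-sonnet-4-5-20250929",
    "model_definer_llm": "ollama/llama3",
}
print(agents_by_provider(routing))
```

With the routing above, only the openai and anthropic groups fall inside the set of providers that is actively tested.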

2.6. 📊 Experiment Dashboard

Visualize experiment results, search trees, and evaluation reports with the built-in Streamlit dashboard:

python -m plexe.viz --work-dir ./workdir

2.7. 🔌 Extensibility

Connect plexe to custom storage, tracking, and deployment infrastructure via the WorkflowIntegration interface:

main(intent="...", data_refs=[...], integration=MyCustomIntegration())

See plexe/integrations/base.py for the full interface.

3. Installation

3.1. Installation Options

pip install plexe                    # Core (XGBoost, CatBoost, Keras, scikit-learn)
pip install "plexe[pyspark]"         # + Local PySpark execution
pip install "plexe[aws]"             # + S3 storage support (boto3)

Requires Python >= 3.10, < 3.13.
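The version constraint can be checked before installing. This is a small illustrative sketch; `is_supported` is a hypothetical helper that encodes the range stated above.

```python
import sys

MIN_VERSION = (3, 10)  # inclusive lower bound
MAX_VERSION = (3, 13)  # exclusive upper bound


def is_supported(version: tuple[int, ...]) -> bool:
    """Return True if the (major, minor) version falls in the supported range."""
    return MIN_VERSION <= tuple(version[:2]) < MAX_VERSION


if __name__ == "__main__":
    current = sys.version_info[:2]
    status = "supported" if is_supported(current) else "not supported"
    print(f"Python {current[0]}.{current[1]} is {status}")
```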

3.2. API Keys

export OPENAI_API_KEY=<your-key>
export ANTHROPIC_API_KEY=<your-key>

See LiteLLM providers for all supported providers.
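A preflight check can catch a missing key before a long run starts. The sketch below assumes both keys exported above are needed. If your config routes every agent to a single provider, trim `REQUIRED_KEYS` accordingly. `missing_api_keys` is a hypothetical helper, not part of plexe.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Assumption: both keys from the snippet above are required for the default setup.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def missing_api_keys(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required keys that are unset or empty in the environment."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

Calling `missing_api_keys()` with no arguments inspects the current process environment.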

4. Documentation

For full documentation, visit docs.plexe.ai.

5. Contributing

See CONTRIBUTING.md for guidelines. Join our Discord to connect with the team.

6. License

Apache-2.0 License

7. Citation

If you use Plexe in your research, please cite it as follows:

@software{plexe2025,
  author = {De Bernardi, Marcello AND Dubey, Vaibhav},
  title = {Plexe: Build machine learning models using natural language.},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/plexe-ai/plexe}},
}
