
A framework for machine learning model serving

Project description

The easiest way to build Machine Learning APIs

Multi-framework / High-performance / Easy to learn / Production ready


What does BentoML do?

  • Package models trained with any ML framework and reproduce them for model serving in production
  • Package once and deploy anywhere for real-time API serving or offline batch serving
  • High-Performance API model server with adaptive micro-batching support
  • Central storage hub with Web UI and APIs for managing and accessing packaged models
  • Modular and flexible design allowing advanced users to easily customize

BentoML is a framework for serving, managing, and deploying machine learning models. It aims to bridge the gap between Data Science and DevOps and enable data science teams to continuously deliver prediction services to production.

👉 Join the community: BentoML Slack Channel and BentoML Discussions.


How BentoML works

BentoML provides abstractions for creating a prediction service that is bundled with one or more trained models. Users define inference APIs with serving logic in Python code and specify the expected input/output data formats:

import pandas as pd

from bentoml import env, artifacts, api, BentoService
from bentoml.adapters import DataframeInput
from bentoml.frameworks.sklearn import SklearnModelArtifact

from my_library import preprocess

@env(infer_pip_packages=True)
@artifacts([SklearnModelArtifact('my_model')])
class MyPredictionService(BentoService):
    """
    A minimum prediction service exposing a Scikit-learn model
    """

    @api(input=DataframeInput(orient="records"), batch=True)
    def predict(self, df: pd.DataFrame):
        """
        An inference API named `predict` with Dataframe input adapter, which codifies
        how HTTP requests or CSV files are converted to a pandas Dataframe object as the
        inference API function input
        """
        model_input = preprocess(df)
        return self.artifacts.my_model.predict(model_input)

At the end of your model training pipeline, import your BentoML prediction service class, pack it with your trained model, and persist the entire prediction service with a save call:

from my_prediction_service import MyPredictionService
svc = MyPredictionService()
svc.pack('my_model', my_sklearn_model)
svc.save()  # default saves to ~/bentoml/repository/MyPredictionService/{version}/

This saves all the code, files, serialized models, and configs required to reproduce this prediction service for inference. BentoML automatically finds all the pip package dependencies and local Python code dependencies, and makes sure they are packaged and versioned with your code and model in one place.
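
Once saved, the bundle can be loaded back into Python for testing or programmatic inference. A minimal sketch, assuming saved_path is the path returned by svc.save() (the default directory shown above) and df is a pandas DataFrame of raw input rows:

from bentoml import load

# Load the saved bundle back into a BentoService instance
svc = load(saved_path)

# Call the inference API defined earlier, just as the API server would
predictions = svc.predict(df)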

With the saved prediction service, a user can easily start a local API server hosting it:

bentoml serve MyPredictionService:latest
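
The server exposes each inference API as an HTTP endpoint named after the API function, on port 5000 by default. A hypothetical request against the predict endpoint above (the column names here are made up; use the features your model expects):

curl -i \
  --header "Content-Type: application/json" \
  --request POST \
  --data '[{"feature_1": 1.0, "feature_2": 2.0}]' \
  http://localhost:5000/predict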

And create a docker container image for this API model server with just one command:

bentoml containerize MyPredictionService:latest -t my_prediction_service

docker run -p 5000:5000 my_prediction_service

BentoML will make sure the container has all the required dependencies installed. In addition to the model inference API, this containerized BentoML model server also comes with instrumentation, metrics/health-check endpoints, prediction logging, and tracing, making it ready for your DevOps team to deploy in production.
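
For example, the health check and Prometheus endpoints can be probed directly once the server is running:

curl http://localhost:5000/healthz   # liveness check, e.g. for Kubernetes probes
curl http://localhost:5000/metrics   # Prometheus metrics for monitoring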

If you are on a small team without DevOps support, BentoML also provides a one-click deployment option that deploys the model server API to cloud platforms with minimal setup.
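
For instance, deploying the saved service to AWS Lambda is a single CLI command. A sketch, assuming AWS credentials are configured locally and my-first-deployment is a deployment name of your choosing:

bentoml lambda deploy my-first-deployment -b MyPredictionService:latest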

Read the Quickstart Guide to learn more about the basic functionalities of BentoML. You can also try it out here on Google Colab.

Documentation

BentoML documentation: https://docs.bentoml.org/

Key Features

Online serving with API model server:

  • Containerized model server for production deployment with Docker, Kubernetes, OpenShift, AWS ECS, Azure, GCP GKE, etc
  • Adaptive micro-batching for optimal online serving performance (see the example after this list)
  • Discover and package all dependencies automatically, including PyPI, conda packages and local python modules
  • Support multiple ML frameworks including PyTorch, TensorFlow, Scikit-Learn, XGBoost, and many more
  • Serve compositions of multiple models
  • Serve multiple endpoints in one model server
  • Serve any Python code along with trained models
  • Automatically generate HTTP API spec in Swagger/OpenAPI format
  • Prediction logging and feedback logging endpoint
  • Health check endpoint and Prometheus /metrics endpoint for monitoring
  • Load and replay historical prediction request logs (roadmap)
  • Model serving via gRPC endpoint (roadmap)
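
Adaptive micro-batching, referenced above, is turned on with a flag on the production (gunicorn) API server. A sketch based on the 0.8/0.9 CLI; flag names may differ in other releases:

bentoml serve-gunicorn MyPredictionService:latest --enable-microbatch --workers 2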

Advanced workflow for model serving and deployment:

  • Central repository for managing all your team's packaged models via Web UI and API
  • Launch inference runs from CLI or Python, enabling CI/CD testing, programmatic access, and offline batch inference jobs (see the example after this list)
  • One-click deployment to cloud platforms including AWS Lambda, AWS SageMaker, and Azure Functions
  • Distributed batch job or streaming job with Apache Spark (improved Spark support is on roadmap)
  • Advanced model deployment workflows for Kubernetes, including auto-scaling, scale-to-zero, A/B testing, canary deployment, and multi-armed-bandit (roadmap)
  • Deep integration with ML experimentation platforms including MLFlow, Kubeflow (roadmap)
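
As an example of launching an inference run from the CLI, bentoml run invokes a saved service's API directly, which is useful for CI/CD checks and offline batch scoring. The input payload below is hypothetical:

bentoml run MyPredictionService:latest predict \
  --input '[{"feature_1": 1.0, "feature_2": 2.0}]'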

ML Frameworks

BentoML supports models trained with PyTorch, TensorFlow, Scikit-Learn, XGBoost, and many more; see the documentation for the full list of supported frameworks.

Deployment Options

Be sure to check out the deployment overview doc to understand which deployment option is best suited for your use case.

Why BentoML

Moving trained machine learning models into production serving applications is hard. It is a sequential process across data science, engineering, and DevOps teams: after the data science team trains a model, they hand it over to the engineering team, which refines and optimizes the code and creates an API, before DevOps can deploy it.

Most importantly, data science teams want to repeat this process continuously, monitor the models deployed in production, and ship new models quickly. It often takes months for an engineering team to build a model serving and deployment solution that allows data science teams to ship new models in a repeatable and reliable way.

BentoML is a framework designed to solve this problem. It provides high-level APIs for data science teams to create prediction services, abstracting away the DevOps infrastructure needs and performance optimizations in the process. This allows DevOps teams to work seamlessly with data scientists, deploying and operating models packaged in the BentoML format in production.

Check out the Frequently Asked Questions page for how BentoML compares to TensorFlow Serving, Clipper, AWS SageMaker, MLFlow, etc.

Contributing

Have questions or feedback? Post a new GitHub issue or discuss in our BentoML Slack channel.

Want to help build BentoML? Check out our contributing guide and the development guide.

Releases

BentoML is under active development and evolving rapidly. It is currently a beta release, and we may change APIs in future releases.

Read more about the latest features and changes in BentoML from the releases page.

Usage Tracking

BentoML by default collects anonymous usage data using Amplitude. It only tracks the BentoML library's own actions and parameters; no user or model data is collected. Here is the code that does it.

This helps the BentoML team understand how the community is using this tool and what to build next. You can easily opt out of usage tracking by running one of the following:

# From terminal:
bentoml config set usage_tracking=false
# From python:
import bentoml
bentoml.config().set('core', 'usage_tracking', 'False')

License

Apache License 2.0


