bentoml

BentoML: The easiest way to serve AI apps and models

Maintainers

aar0npham frostming parano ssheng

Project description

Unified Model Serving Framework

🍱 Build model inference APIs and multi-model serving systems with any open-source or custom AI models. 👉 Join our forum!

What is BentoML?

BentoML is a Python library for building online serving systems optimized for AI apps and model inference.

🍱 Easily build APIs for Any AI/ML Model. Turn any model inference script into a REST API server with just a few lines of code and standard Python type hints.
🐳 Docker Containers made simple. No more dependency hell! Manage your environments, dependencies and model versions with a simple config file. BentoML automatically generates Docker images, ensures reproducibility, and simplifies how you deploy to different environments.
🧭 Maximize CPU/GPU utilization. Build high performance inference APIs leveraging built-in serving optimization features like dynamic batching, model parallelism, multi-stage pipeline and multi-model inference-graph orchestration.
👩‍💻 Fully customizable. Easily implement your own APIs or task queues, with custom business logic, model inference and multi-model composition. Supports any ML framework, modality, and inference runtime.
🚀 Ready for Production. Develop, run and debug locally. Seamlessly deploy to production with Docker containers or BentoCloud.

Getting started

Install BentoML:

# Requires Python≥3.9
pip install -U bentoml

Define APIs in a service.py file.

import bentoml

@bentoml.service(
    image=bentoml.images.Image(python_version="3.11").python_packages("torch", "transformers"),
)
class Summarization:
    def __init__(self) -> None:
        import torch
        from transformers import pipeline

        device = "cuda" if torch.cuda.is_available() else "cpu"
        self.pipeline = pipeline('summarization', device=device)

    @bentoml.api(batchable=True)
    def summarize(self, texts: list[str]) -> list[str]:
        results = self.pipeline(texts)
        return [item['summary_text'] for item in results]
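
The batchable=True flag marks summarize as accepting a list of inputs and returning one output per input, which is the contract that dynamic batching relies on. As a rough illustration of that idea only (this is not BentoML's implementation, and run_batched and fake_summarize are hypothetical names), the following sketch merges several pending requests into one handler call and splits the results back per request:

```python
from typing import Callable


def run_batched(
    requests: list[list[str]],
    handler: Callable[[list[str]], list[str]],
) -> list[list[str]]:
    """Merge several requests into one handler call, then split the results back."""
    # Flatten every request's inputs into one combined batch.
    merged = [text for request in requests for text in request]
    outputs = handler(merged)
    if len(outputs) != len(merged):
        raise ValueError("a batchable handler must return one output per input")

    # Slice the combined output back into per-request results.
    results: list[list[str]] = []
    start = 0
    for request in requests:
        results.append(outputs[start:start + len(request)])
        start += len(request)
    return results


def fake_summarize(texts: list[str]) -> list[str]:
    # Hypothetical stand-in for the model: keep the first three words of each text.
    return [" ".join(text.split()[:3]) for text in texts]


# Two pending requests: one with a single text, one with two texts.
pending = [["a b c d e"], ["one two three four", "x y"]]
print(run_batched(pending, fake_summarize))
```

A real serving system would also decide when to flush a batch (for example on a size or latency limit); that scheduling is left out here to keep the sketch short.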

💻 Run locally

Install the PyTorch and Transformers packages in your Python virtual environment.

pip install torch transformers  # additional dependencies for local run

Run the service code locally (serving at http://localhost:3000 by default):

bentoml serve

You should expect to see the following output.

[INFO] [cli] Starting production HTTP BentoServer from "service:Summarization" listening on http://localhost:3000 (Press CTRL+C to quit)
[INFO] [entry_service:Summarization:1] Service Summarization initialized

Now you can run inference from your browser at http://localhost:3000 or with a Python script:

import bentoml

with bentoml.SyncHTTPClient('http://localhost:3000') as client:
    summarized_text: str = client.summarize([bentoml.__doc__])[0]
    print(f"Result: {summarized_text}")
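
Because the service is exposed as a REST API, it can also be called without the BentoML client. The sketch below builds such a request with the Python standard library. It assumes that the endpoint path matches the API method name (/summarize) and that the JSON body is keyed by the parameter name (texts); check the page served at http://localhost:3000 for the exact schema. build_summarize_request is a hypothetical helper name.

```python
import json
import urllib.request


def build_summarize_request(base_url: str, texts: list[str]) -> urllib.request.Request:
    """Build a POST request for the summarize endpoint without sending it."""
    # Assumption: route is the method name and the JSON key is the parameter name.
    body = json.dumps({"texts": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/summarize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_summarize_request(
    "http://localhost:3000",
    ["BentoML is a Python library for building online serving systems."],
)
print(request.get_method(), request.full_url)

# To send it while `bentoml serve` is running, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```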

🐳 Deploy using Docker

Run bentoml build to package the necessary code, models, and dependency configs into a Bento, the standardized deployable artifact in BentoML:

bentoml build

Ensure Docker is running. Generate a Docker container image for deployment:

bentoml containerize summarization:latest

Run the generated image:

docker run --rm -p 3000:3000 summarization:latest

☁️ Deploy on BentoCloud

BentoCloud provides compute infrastructure for rapid and reliable GenAI adoption. It speeds up your BentoML development process by leveraging cloud compute resources, and simplifies how you deploy, scale, and operate BentoML in production.

# After signup, run the following command to create an API token:
bentoml cloud login

# Deploy from current directory:
bentoml deploy

For detailed explanations, read the Hello World example.

Examples

LLMs: Llama 3.2, Mistral, DeepSeek Distil, and more.
Image Generation: Stable Diffusion 3 Medium, Stable Video Diffusion, Stable Diffusion XL Turbo, ControlNet, and LCM LoRAs.
Embeddings: SentenceTransformers and ColPali
Audio: ChatTTS, XTTS, WhisperX, Bark
Computer Vision: YOLO and ResNet
Advanced examples: Function calling, LangGraph, CrewAI

Check out the full list for more sample code and usage.

Advanced topics

See Documentation for more tutorials and guides.

Community

Get involved and join our Community Forum 💬, where thousands of AI/ML engineers help each other, contribute to the project, and talk about building AI products.

To report a bug or suggest a feature request, use GitHub Issues.

Contributing

There are many ways to contribute to the project:

Report bugs and give a "thumbs up" to issues that are relevant to you.
Investigate issues and review other developers' pull requests.
Contribute code or documentation to the project by submitting a GitHub pull request.
Check out the Contributing Guide and Development Guide to learn more.
Share your feedback and discuss roadmap plans in our forum.

Thanks to all of our amazing contributors!

Usage tracking and feedback

The BentoML framework collects anonymous usage data that helps our community improve the product. Only BentoML's internal API calls are reported. This excludes any sensitive information, such as user code, model data, model names, or stack traces. Here's the code used for usage tracking. You can opt out of usage tracking with the --do-not-track CLI option:

bentoml [command] --do-not-track

Or by setting the environment variable:

export BENTOML_DO_NOT_TRACK=True
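
When BentoML is launched from a Python script rather than a shell, the same variable can be passed through the child process environment. This is a minimal sketch under that assumption; bentoml_env_without_tracking and serve_without_tracking are hypothetical helper names, and the second one needs the bentoml CLI on PATH, so it is defined but not called here.

```python
import os
import subprocess


def bentoml_env_without_tracking() -> dict[str, str]:
    """Copy the current environment and add the opt-out variable."""
    env = dict(os.environ)
    env["BENTOML_DO_NOT_TRACK"] = "True"
    return env


def serve_without_tracking() -> subprocess.CompletedProcess:
    # Runs `bentoml serve` with the opt-out variable set for that process only.
    return subprocess.run(
        ["bentoml", "serve"],
        env=bentoml_env_without_tracking(),
        check=True,
    )


env = bentoml_env_without_tracking()
print(env["BENTOML_DO_NOT_TRACK"])
```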

License

Apache License 2.0

Release history

1.4.38

Apr 2, 2026

1.4.37 (this version)

Mar 25, 2026

1.4.36

Mar 6, 2026

1.4.35

Feb 3, 2026

1.4.34

Jan 26, 2026

1.4.33

Jan 12, 2026

1.4.32

Jan 9, 2026

1.4.31 yanked

Dec 23, 2025

1.4.30

Nov 27, 2025

1.4.29

Nov 17, 2025

1.4.28

Oct 29, 2025

1.4.27

Oct 20, 2025

1.4.26

Oct 10, 2025

1.4.25

Sep 24, 2025

1.4.24

Sep 17, 2025

1.4.23

Sep 5, 2025

1.4.22

Aug 27, 2025

1.4.21

Aug 14, 2025

1.4.20

Aug 13, 2025

1.4.19

Jul 29, 2025

1.4.18

Jul 24, 2025

1.4.17

Jun 30, 2025

1.4.16

Jun 16, 2025

1.4.15

May 23, 2025

1.4.14

May 20, 2025

1.4.13

May 9, 2025

1.4.12

Apr 29, 2025

1.4.11

Apr 22, 2025

1.4.10

Apr 17, 2025

1.4.9 yanked

Apr 17, 2025

Reason this release was yanked:

bad release

1.4.8

Apr 8, 2025

1.4.7

Mar 28, 2025

1.4.6

Mar 25, 2025

1.4.5

Mar 14, 2025

1.4.4

Mar 14, 2025

1.4.3

Mar 6, 2025

1.4.2

Feb 28, 2025

1.4.1

Feb 25, 2025

1.4.0

Feb 20, 2025

1.4.0a2 pre-release

Feb 14, 2025

1.4.0a1 pre-release

Feb 14, 2025

1.3.22

Feb 13, 2025

1.3.21

Feb 4, 2025

1.3.20

Jan 17, 2025

1.3.19

Jan 6, 2025

1.3.18

Dec 27, 2024

1.3.17

Dec 24, 2024

1.3.16

Dec 12, 2024

1.3.15

Dec 2, 2024

1.3.14

Nov 21, 2024

1.3.13

Nov 18, 2024

1.3.12

Nov 14, 2024

1.3.11

Nov 8, 2024

1.3.10

Oct 29, 2024

1.3.9

Oct 16, 2024

1.3.8

Oct 11, 2024

1.3.7

Sep 25, 2024

1.3.6

Sep 25, 2024

1.3.5

Sep 10, 2024

1.3.4.post1

Sep 6, 2024

1.3.3

Aug 23, 2024

1.3.2

Aug 14, 2024

1.3.1

Aug 2, 2024

1.3.0

Jul 19, 2024

1.3.0a3 pre-release

Jul 18, 2024

1.3.0a2 pre-release

Jul 16, 2024

1.3.0a1 pre-release

Jul 12, 2024

1.2.20

Jul 12, 2024

1.2.19

Jun 25, 2024

1.2.18

Jun 14, 2024

1.2.17

Jun 3, 2024

1.2.16

May 16, 2024

1.2.15

May 9, 2024

1.2.14

May 8, 2024

1.2.13

May 6, 2024

1.2.12

Apr 19, 2024

1.2.11

Apr 12, 2024

1.2.10

Apr 8, 2024

1.2.9

Mar 22, 2024

1.2.8

Mar 20, 2024

1.2.7

Mar 14, 2024

1.2.6

Mar 9, 2024

1.2.5

Mar 5, 2024

1.2.4

Feb 20, 2024

1.2.3

Feb 20, 2024

1.2.2

Feb 5, 2024

1.2.1

Feb 3, 2024

1.2.1a1 pre-release

Feb 3, 2024

1.2.0

Feb 2, 2024

1.2.0rc1 pre-release

Jan 31, 2024

1.2.0a7 pre-release

Jan 30, 2024

1.2.0a6 pre-release

Jan 26, 2024

1.2.0a5 pre-release

Jan 20, 2024

1.2.0a4 pre-release

Jan 19, 2024

1.2.0a3 pre-release

Jan 19, 2024

1.2.0a2 pre-release

Jan 18, 2024

1.2.0a1 pre-release

Jan 17, 2024

1.2.0a0 pre-release

Jan 9, 2024

1.1.11

Dec 28, 2023

1.1.10

Nov 20, 2023

1.1.9

Nov 7, 2023

1.1.8

Nov 3, 2023

1.1.7

Oct 12, 2023

1.1.6

Sep 8, 2023

1.1.5

Sep 1, 2023

1.1.4

Aug 29, 2023

1.1.3

Aug 24, 2023

1.1.2

Aug 22, 2023

1.1.1

Aug 1, 2023

1.1.0

Jul 24, 2023

1.0.25

Jul 20, 2023

1.0.24

Jul 19, 2023

1.0.23

Jun 29, 2023

1.0.22

Jun 12, 2023

1.0.21

Jun 6, 2023

1.0.20

May 9, 2023

1.0.19

Apr 26, 2023

1.0.18

Apr 14, 2023

1.0.17

Apr 6, 2023

1.0.16

Mar 14, 2023

1.0.15

Feb 15, 2023

1.0.14

Feb 8, 2023

1.0.13

Jan 19, 2023

1.0.12

Dec 8, 2022

1.0.11

Dec 7, 2022

1.0.10

Nov 9, 2022

1.0.9

Nov 9, 2022

1.0.8

Nov 1, 2022

1.0.7

Oct 3, 2022

1.0.6 yanked

Sep 27, 2022

Reason this release was yanked:

A critical module import issue has been introduced in version 1.0.6. Please see release note https://github.com/bentoml/BentoML/releases/tag/v1.0.7

1.0.5

Aug 30, 2022

1.0.4

Aug 26, 2022

1.0.3

Aug 8, 2022

1.0.2

Jul 29, 2022

1.0.0

Jul 13, 2022

1.0.0rc3 pre-release

Jul 1, 2022

1.0.0rc2 pre-release

Jun 22, 2022

1.0.0rc1 pre-release

Jun 8, 2022

1.0.0rc0 pre-release

May 30, 2022

1.0.0a7 pre-release

Apr 6, 2022

1.0.0a6 pre-release

Mar 7, 2022

1.0.0a5 pre-release

Mar 1, 2022

1.0.0a4 pre-release

Feb 15, 2022

1.0.0a3 pre-release

Jan 28, 2022

1.0.0a2 pre-release

Jan 20, 2022

1.0.0a1 pre-release

Dec 14, 2021

1.0.0.dev1 pre-release yanked

Dec 13, 2021

1.0.0.dev0 pre-release yanked

Dec 13, 2021

0.13.2

Jul 20, 2022

0.13.1

Jul 13, 2021

0.13.0

Jun 16, 2021

0.12.1

Apr 5, 2021

0.12.0

Mar 22, 2021

0.11.0

Jan 14, 2021

0.11.dev0 pre-release

Jan 14, 2021

0.10.1

Dec 10, 2020

0.10.0

Dec 7, 2020

0.9.2

Oct 17, 2020

0.9.1

Oct 1, 2020

0.9.0

Sep 25, 2020

0.9.0rc0 pre-release

Sep 21, 2020

0.8.6

Aug 25, 2020

0.8.5

Aug 11, 2020

0.8.4

Aug 7, 2020

0.8.3

Jul 6, 2020

0.8.2 yanked

Jun 26, 2020

0.8.1

Jun 15, 2020

0.8.0 yanked

Jun 15, 2020

0.7.8

May 27, 2020

0.7.7

May 18, 2020

0.7.6

May 15, 2020

0.7.5

May 7, 2020

0.7.4

Apr 30, 2020

0.7.3

Apr 14, 2020

0.7.2

Apr 3, 2020

0.7.1

Apr 3, 2020

0.7.0

Apr 3, 2020

0.6.3

Mar 5, 2020

0.6.2

Feb 11, 2020

0.6.1

Jan 24, 2020

0.6.0

Jan 23, 2020

0.5.8

Jan 8, 2020

0.5.7

Jan 6, 2020

0.5.6

Dec 22, 2019

0.5.5

Dec 20, 2019

0.5.4

Dec 19, 2019

0.5.3

Nov 28, 2019

0.5.2

Nov 26, 2019

0.5.1

Nov 26, 2019

0.5.0

Nov 21, 2019

0.4.9

Nov 11, 2019

0.4.8

Oct 24, 2019

0.4.7

Oct 17, 2019

0.4.5

Oct 17, 2019

0.4.4

Oct 14, 2019

0.4.3

Oct 9, 2019

0.4.2

Sep 25, 2019

0.4.1

Sep 17, 2019

0.4.0

Sep 17, 2019

0.3.4

Aug 7, 2019

0.3.3

Aug 7, 2019

0.3.1

Jul 25, 2019

0.3.0

Jul 17, 2019

0.2.2

Jul 10, 2019

0.2.1

Jun 24, 2019

0.2.0

May 21, 2019

0.1.2

May 1, 2019

0.1.1

Apr 25, 2019

0.0.9

Apr 18, 2019

0.0.8.post1

Apr 12, 2019

0.0.8

Apr 10, 2019

0.0.7

Apr 4, 2019

0.0.7.dev0 pre-release

Apr 10, 2019

0.0.6a0 pre-release

Apr 2, 2019

0.0.5

Apr 2, 2019

0.0.3

Jan 16, 2019

0.0.2

Jan 16, 2019

0.0.1

Jan 15, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bentoml-1.4.37.tar.gz (987.7 kB)

Uploaded Mar 25, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

bentoml-1.4.37-py3-none-any.whl (1.2 MB)

Uploaded Mar 25, 2026 Python 3

File details

Details for the file bentoml-1.4.37.tar.gz.

File metadata

Download URL: bentoml-1.4.37.tar.gz
Upload date: Mar 25, 2026
Size: 987.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for bentoml-1.4.37.tar.gz
Algorithm	Hash digest
SHA256	`179fb9aa66d9ce51093fc6ef5eaeba082f904b72804b512da8f3f8b9ced96223`
MD5	`263d3f57dd58496facd359e3824e6a45`
BLAKE2b-256	`5077e15a9f48a07339b6b47e7bfc0f7e9a7f6fb19091502ff460fba70e5504fe`

See more details on using hashes here.
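
A downloaded file can be checked against the SHA256 digest listed above with Python's hashlib. This is a minimal sketch: the local path bentoml-1.4.37.tar.gz is an assumed download location, and sha256_of and matches are hypothetical helper names.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest for bentoml-1.4.37.tar.gz, copied from the table above.
EXPECTED_SHA256 = "179fb9aa66d9ce51093fc6ef5eaeba082f904b72804b512da8f3f8b9ced96223"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    # Case-insensitive comparison of the computed and expected hex digests.
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


archive = Path("bentoml-1.4.37.tar.gz")  # assumed download location
if archive.exists():
    print("OK" if matches(archive, EXPECTED_SHA256) else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```

pip can perform the same check at install time when a requirements file pins the package with a --hash option.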

Provenance

The following attestation bundles were made for bentoml-1.4.37.tar.gz:

Publisher: release.yml on bentoml/BentoML

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bentoml-1.4.37.tar.gz
- Subject digest: 179fb9aa66d9ce51093fc6ef5eaeba082f904b72804b512da8f3f8b9ced96223
- Sigstore transparency entry: 1177622088
- Sigstore integration time: Mar 25, 2026
Source repository:
- Permalink: bentoml/BentoML@0772581584e76d3bd8211841fbb7c856fb350043
- Branch / Tag: refs/tags/v1.4.37
- Owner: https://github.com/bentoml
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@0772581584e76d3bd8211841fbb7c856fb350043
- Trigger Event: push

File details

Details for the file bentoml-1.4.37-py3-none-any.whl.

File metadata

Download URL: bentoml-1.4.37-py3-none-any.whl
Upload date: Mar 25, 2026
Size: 1.2 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for bentoml-1.4.37-py3-none-any.whl
Algorithm	Hash digest
SHA256	`34232cc01fe37dede70ddd3987d6ac537a1a53c308da2e111529a36686203e15`
MD5	`cce6f3714fd41d84b3bf8efebf521d12`
BLAKE2b-256	`a46998bddd4b228330f15f6e611dad1566aad7387efb2c9912290a7d90cd7f75`

See more details on using hashes here.

Provenance

The following attestation bundles were made for bentoml-1.4.37-py3-none-any.whl:

Publisher: release.yml on bentoml/BentoML

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bentoml-1.4.37-py3-none-any.whl
- Subject digest: 34232cc01fe37dede70ddd3987d6ac537a1a53c308da2e111529a36686203e15
- Sigstore transparency entry: 1177622126
- Sigstore integration time: Mar 25, 2026
Source repository:
- Permalink: bentoml/BentoML@0772581584e76d3bd8211841fbb7c856fb350043
- Branch / Tag: refs/tags/v1.4.37
- Owner: https://github.com/bentoml
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@0772581584e76d3bd8211841fbb7c856fb350043
- Trigger Event: push
