BentoML for Beginners πŸ€–πŸ“¦ | Deploy Your First Machine Learning Model πŸ”πŸ€– | Sagar Kakkala’s World πŸš€πŸ„πŸ“Š







Why BentoML?

In the last session, we discussed MLflow and how Data Scientists use it as a playground to test different models. Once a final model is ready, we need to use it in our application, which means it needs to be deployed.

Here, we will deploy that ML model as a service using BentoML. With BentoML, we can train a model, save it, create a service for it, and deploy it.

Once your model is ready, you can integrate it with an API, an LLM, or any other application, which makes things much easier.

We will understand BentoML with a demo here.

We will be using the same kind of model that we used in our last MLflow session - watch here

Short overview of the previous session:

We took an example of Dairy Products: we have the last three years of sales data for products like milk, cheese, and curd, and using this data we want to predict next year's sales. To experiment with creating the model, we used MLflow.

We will now use BentoML to deploy the model that has been finalised.

Start with BentoML

To get started with deploying a basic model on BentoML, you can refer to the official documentation to understand how it works - BentoML hello world

You can use Visual Studio Code or Jupyter Notebook, based on your preference.

Make sure you have Python 3 installed.



Open Visual Studio Code or a terminal of your choice and clone the quickstart repository:

git clone https://github.com/bentoml/quickstart.git
cd quickstart



Now follow the commands below to create and activate a virtual environment:

python3 -m venv quickstart
source quickstart/bin/activate

Once the Python virtual environment is activated, let us install BentoML and the required dependencies:

pip install bentoml torch transformers

Now you will be able to see service.py in the cloned repository.


This service.py pulls a model from Hugging Face, and from the code you can see that the job of this model is to summarise the input text that is provided.
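For reference, the quickstart's service.py looks roughly like the sketch below (the exact code in the repo may differ slightly between versions); it wraps a Hugging Face summarization pipeline in a BentoML service class:

from __future__ import annotations

import bentoml
from transformers import pipeline


@bentoml.service(resources={"cpu": "2"}, traffic={"timeout": 30})
class Summarization:
    def __init__(self) -> None:
        # Downloads a default summarization model from Hugging Face on first use
        self.pipeline = pipeline("summarization")

    @bentoml.api
    def summarize(self, text: str) -> str:
        # Run the pipeline and return only the generated summary string
        result = self.pipeline(text)
        return result[0]["summary_text"]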

To deploy the model and start the BentoML service, we will use:

bentoml serve
By default, this command looks for a service.py file. If your service is defined in a different file, you need to pass that file's path as an additional argument to the command.



This starts our application, and you can see it is running on localhost at port 3000.

Once you access localhost in your browser,



 

you will see that the model service "summarize" is deployed. Now click on "Try it out".


You can replace the example text with any paragraph of your choice and ask the model to summarize it.

Click on "Execute" to see the results.


Now you have your answer in the response body.
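Instead of the web UI, you can also call the same endpoint from Python; a minimal sketch, assuming the quickstart's summarize endpoint is running on localhost:3000 and takes a text field (the names follow the service code shown above):

import requests

# Call the summarize endpoint of the locally running quickstart service
response = requests.post(
    "http://localhost:3000/summarize",
    json={"text": "BentoML is an open source framework for serving machine learning models as production APIs."},
)
print(response.text)  # the generated summary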

Since we are in a virtual environment, to deactivate it:

deactivate

And if you want to remove the virtual environment entirely, delete the folder it created:

rm -rf quickstart

Since we are now familiar with how BentoML can be used, let us deploy our own model.

BentoML Demo with Dairy Farm Project

Now let us go through a step-by-step process to deploy our model using BentoML.

Let us assume that our Data Scientist has finalised a model trained on the last three years of data for three products: curd, cheese, and milk.

Step 1: Create and save model

Create a new folder and name it bentoML or anything you prefer.

Open the folder in Visual Studio Code.

Now let us create our first file, model_train.py, and place the following code there:

import pandas as pd
import bentoml
from sklearn.linear_model import LinearRegression

# ----------------------------
# Generate dummy sales data
# ----------------------------
def generate_sales_data():
    records = []
    for year in [2023, 2024, 2025]:
        for month in range(1, 13):
            records.append({
                "year": year,
                "month": month,
                "milk_sales": 200 + (year - 2023) * 40 + month * 5,
                "curd_sales": 150 + (year - 2023) * 30 + month * 4,
                "cheese_sales": 100 + (year - 2023) * 25 + month * 3,
            })
    return pd.DataFrame(records)

df = generate_sales_data()
X = df[["year", "month"]]

# ----------------------------
# Train models
# ----------------------------
models = {
    "milk_sales_model": df["milk_sales"],
    "curd_sales_model": df["curd_sales"],
    "cheese_sales_model": df["cheese_sales"],
}

for model_name, y in models.items():
    model = LinearRegression()
    model.fit(X, y)

    bentoml.sklearn.save_model(
        model_name,
        model,
        signatures={"predict": {"batchable": True}},
        metadata={"trained_years": "2023-2025"}
    )

    print(f"Saved model for {model_name}")




To summarise the code: we generate dummy sales data for the last three years for milk, curd, and cheese (for example, milk sales for January 2023 come out to 200 + 0 + 1 × 5 = 205), and using that data we train a separate linear regression model for each product.

Before we create the models, let us set up and activate a virtual environment, like we did in the demo before:

python3 -m venv bentoml_live
source bentoml_live/bin/activate



Let us install BentoML and all the required dependencies for this project:

pip3 install bentoml scikit-learn pandas streamlit requests



Now, to train and save the models, run:

python3 model_train.py


As you can see, it created three models.

To check the saved models, you can use the command:

bentoml models list



Every time you save a model, BentoML stores a new version with a tag associated to it by default.

You can also delete any model version if it is not required:

bentoml models list
# make sure to replace with the model name and tag you want to delete
bentoml models delete milk_sales_model:yc5b7lhdjsg2irtw 
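Optionally, before deploying, you can sanity-check that a saved model loads and predicts correctly; a minimal sketch, assuming the three models saved above (the latest tag always points to the most recently saved version):

import bentoml

# Load the most recently saved version of the milk model
model = bentoml.sklearn.load_model("milk_sales_model:latest")

# Predict milk sales for March 2026, using the same [year, month] feature order as training
print(model.predict([[2026, 3]]))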



Now that our models are ready, we need to deploy them.


Step 2: Deploy model

To deploy the model we need a BentoML service, for which we will create a new Python file.

Save the code below as model_service.py:
from __future__ import annotations
import bentoml
from pydantic import BaseModel
from typing import Literal

# ----------------------------
# Input schema
# ----------------------------
class SalesForecastInput(BaseModel):
    year: int
    month: str
    product: Literal["milk", "curd", "cheese"]

# ----------------------------
# Month mapping
# ----------------------------
MONTH_MAP = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12
}

# ----------------------------
# BentoML Service
# ----------------------------
@bentoml.service
class DairySalesForecaster:

    def __init__(self) -> None:
        self.models = {
            "milk": bentoml.sklearn.load_model("milk_sales_model"),
            "curd": bentoml.sklearn.load_model("curd_sales_model"),
            "cheese": bentoml.sklearn.load_model("cheese_sales_model"),
        }

    @bentoml.api
    def forecast(self, input_data: SalesForecastInput):
        month_num = MONTH_MAP[input_data.month.lower()]
        features = [[input_data.year, month_num]]

        model = self.models[input_data.product]
        prediction = model.predict(features)[0]

        return {
            "year": input_data.year,
            "month": input_data.month,
            "product": input_data.product,
            "predicted_sales": round(float(prediction), 2)
        }




In this service file, we use the @bentoml.service and @bentoml.api decorators. We load all three models, and the model to use is selected based on the product chosen.

For example, if milk is selected, it uses milk_sales_model. The service takes user inputs (year, month, product) and returns the predicted sales output.

Let's say you want to forecast curd sales for March 2026: it gives you a prediction based on the last three years of data it was trained on.

Now, to start this BentoML service:

bentoml serve model_service.py --reload

As you can see, our service is initialized and is running on port 3000.

Access localhost:3000 in your browser.


click on "Try it out"

and enter some sample inputs to test our model.


Now click "Execute".


As you can see, we got the predicted sales data. The sketch below shows how to make the same call directly from Python; after that, let us build a proper dashboard for it.
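Here is a minimal sketch, assuming the service is still running on localhost:3000; the payload shape mirrors what the Swagger UI (and the dashboard in the next step) sends to the forecast endpoint:

import requests

# Forecast curd sales for March 2026 via the /forecast endpoint
payload = {
    "input_data": {
        "year": 2026,
        "month": "mar",
        "product": "curd",
    }
}

response = requests.post(
    "http://localhost:3000/forecast",
    json=payload,
    headers={"Content-Type": "application/json"},
)
print(response.json())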


Step 3: Dashboard

This step is not mandatory for this demo.

In this step, we will create a UI to enter our inputs. The flow goes something like this:

Dashboard >>> BentoML service API >>> BentoML model

Now let us create another file for the dashboard, dashboard.py:

import streamlit as st
import requests

st.title("Dairy Sales Forecast Dashboard πŸ₯›πŸ§€")

year = st.number_input("Year", min_value=2020, value=2026)
month = st.selectbox(
    "Month",
    ["jan","feb","mar","apr","may","jun","jul","aug","sep","oct","nov","dec"]
)
product = st.selectbox("Product", ["milk", "curd", "cheese"])

if st.button("Predict Sales"):
    payload = {
        "input_data": {
            "year": year,
            "month": month,
            "product": product
        }
    }

    response = requests.post(
        "http://localhost:3000/forecast",
        json=payload,
        headers={"Content-Type": "application/json"}
    )

    if response.status_code == 200:
        result = response.json()
        st.success(f"Predicted Sales: {result['predicted_sales']:.2f}")
    else:
        st.error(response.text)



This simple dashboard asks the user for the input values and has a "Predict Sales" button; when pressed, it calls our BentoML API service, as we saw in Step 2.

Since our BentoML service is already running in the current terminal, open a new terminal and activate the virtual environment there as well before running the dashboard:

source  bentoml_live/bin/activate



Now run the following command:

streamlit run dashboard.py


The dashboard runs on port 8502 here (Streamlit defaults to port 8501 and picks the next free port if that one is busy).


Enter the values you want to forecast and click on "Predict Sales".



Using BentoML, we have now saved a model, served it as a service, and connected it to a dashboard.

This concludes our Blog here.

