A Step-By-Step Guide to Deploying ML Models Using Streamlit

Introduction

Most Machine Learning enthusiasts enroll in courses and curricula to get started with AI and ML. These courses cover the fundamentals and guide learners through building and training state-of-the-art ML models.

But one thing most beginners struggle with is deployment. A machine learning project is of limited use without an interactive app around it. To let users interact with the model easily, or even just to showcase our projects, we need to wrap them in web apps, Android apps, or some kind of API hosted on cloud services.

There are various ways to build these wrappers for our models, but in this article, I’ll focus on how you can use Streamlit as the solution for this problem and why I consider it such a powerful tool.

This article is a step-by-step guide to building an ML project and a web app for it using Streamlit. The project we will build is a California House Price Prediction model. The app will be dynamic: hyperparameters such as the learning rate, the number of neurons, and the number of epochs can be changed and experimented with right from the web app.

If you go forward with building such a web app using some frameworks like Flask or Django, I am almost certain that it’ll take a lot of time to first build that UI, and then there’s another problem of hosting it on a server so that it can be accessible to everyone.

And here arises the main question: ‘Why should Machine Learning enthusiasts have to spend their time learning UI frameworks when they could instead use that valuable time learning to build better models?’

There are going to be a lot of topics covered here about how to use Streamlit for your projects. Feel free to skip to whichever parts you want to know more about.

  • Why Streamlit?
  • Building a basic ML model
  • Adding the Magic Using Streamlit
  • Deploying the Streamlit web app
  • Conclusion

Why Streamlit?

Streamlit makes it very easy and quick for us to build a Machine Learning web app. The other options for building such wrappers around ML models are less convenient.

Flask is a Python framework for developing and deploying web apps. It requires a good knowledge of Python, plus time invested in learning the framework itself. Even then, building a web app is not as easy as it is with Streamlit.

Django is another Python-based framework for web development. One can think of it as a larger, more full-featured counterpart to Flask. It takes a lot of dedicated time to learn, and building a web app with it is not as quick as we might want it to be.

TensorFlow.js is a great way of saving models in a format that is compatible with web platforms, and these models can then be used to build web apps. However, many complex ML model implementations and high-level architectures are not yet supported by TensorFlow.js. Many models that work in Python might not work in JavaScript with the TensorFlow.js library.

As I said earlier, we should not be spending our time learning these frameworks and should instead learn how to build good ML models. And this is where Streamlit comes into the picture. It is one of the simplest and quickest ways I have found to develop web applications. The web apps built using Streamlit have great UI elements and are very easy to use.

To support my claim of Streamlit being the easiest and quickest way of building ML web apps, let me share with you how I came across this framework. I was learning how to build GANs and use them to generate artificial faces, convert black and white sketches to colorful ones, and similar implementations.

The models worked well in the Jupyter notebook but I wanted to share them with others. I started searching for frameworks to build an app and host the model, but I did not want to spend my time learning yet another framework as I wanted to explore other GAN implementations.

I checked out all the alternatives that I spoke about earlier in this article. The generator model used in the Sketch-To-Color generation project is a little complex. It is a U-Net architecture model and relies on skip connections.

Due to its high complexity, I was unable to convert the final model for JavaScript using TensorFlow.js. Learning Flask or Django from scratch was not an option for me, so I started searching for other frameworks or libraries.

This is when I came across Streamlit. In an article by Adrien Treuille, he shows how he built an amazing web app for a TL-GAN in under 13 lines of code. This was all possible only because of Streamlit. 

The documentation on their official website is also very precise and helpful. I tried making a Streamlit web app for my Sketch to Color GANs model and it was amazing. I only had to add 12 lines of code to my existing Python code. This is why I finally went forward with exploring Streamlit and building other projects using it.

Building a Basic ML Model

As stated earlier, we are going to look at the California House Price Prediction problem for this example. First of all, let’s see how we normally build a model for this. It is a regression problem. 

First, we will import the required libraries for our simple ML model. Here we will be using TensorFlow, pandas, and NumPy. 

import tensorflow as tf
import numpy as np
import pandas as pd

Now, we will use the datasets from Scikit-Learn to download the California housing dataset. 

from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()

Next, we need to split the loaded data into train, validation, and test sets. There are many methods available to do this. We will use the train_test_split function available in the Scikit-Learn library. Using it twice will divide the dataset into 3 sets of train, validation, and test. 

from sklearn.model_selection import train_test_split

# First split: hold out the test set
X_train_full, X_test, y_train_full, y_test = train_test_split(
    housing.data, housing.target
)

# Second split: carve a validation set out of the remaining data
X_train, X_valid, y_train, y_valid = train_test_split(
    X_train_full, y_train_full
)
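To make the effect of calling the function twice concrete, the following sketch works out the resulting set sizes with plain arithmetic. It assumes the default held-out fraction of 25%, the 20,640 rows of the California housing dataset, and that the held-out count is rounded up. `split_sizes` is a hypothetical helper written for illustration, not part of scikit-learn.

```python
import math


def split_sizes(n_samples, held_out_fraction=0.25):
    """Return (n_kept, n_held_out) for one split; the held-out count is rounded up."""
    n_held_out = math.ceil(n_samples * held_out_fraction)
    return n_samples - n_held_out, n_held_out


n_total = 20640  # assumed number of rows in the dataset

# First split: test set vs. everything else
n_train_full, n_test = split_sizes(n_total)

# Second split: validation set vs. final training set
n_train, n_valid = split_sizes(n_train_full)

print(n_train, n_valid, n_test)
```

Under these assumptions, roughly 56% of the data ends up in the training set, 19% in the validation set, and 25% in the test set.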

The data available to us includes the latitude and longitude of each entry. To visualize this better, we can plot the entries on a map. To draw the map, we will use the pydeck library, and Streamlit’s st.write() function (introduced in the next section) to display it.

import pydeck as pdk
import streamlit as st

# Wrap the training features in a DataFrame with named columns
map_data = pd.DataFrame(
    X_train,
    columns=[
        'MedInc',
        'HouseAge',
        'AveRooms',
        'AveBedrms',
        'Population',
        'AveOccup',
        'latitude',
        'longitude'
    ]
)

# Center the initial view on the average location of the entries
midpoint = (np.average(map_data["latitude"]), np.average(map_data["longitude"]))

st.write(pdk.Deck(
    map_style="mapbox://styles/mapbox/light-v9",
    initial_view_state={
        "latitude": midpoint[0],
        "longitude": midpoint[1],
        "zoom": 6,
        "pitch": 75,
    },
    layers=[
        pdk.Layer(
            "HexagonLayer",
            data=map_data,
            get_position=["longitude", "latitude"],
            radius=1000,
            elevation_scale=4,
            elevation_range=[0, 10000],
            pickable=True,
            extruded=True,
        ),
    ],
))

The data we have now is not yet in good shape for the model. We need to preprocess it in order to get better results. First of all, we will scale the values, as most Machine Learning models work best when the features are on a similar, small scale. For this, we will use the StandardScaler class from scikit-learn, which rescales each feature to zero mean and unit variance.

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

# Fit on the training set only, then reuse the same scaling for the other sets
X_train = scaler.fit_transform(X_train)
X_valid = scaler.transform(X_valid)
X_test = scaler.transform(X_test)
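As a rough sketch of what the scaler does, the NumPy snippet below standardizes a tiny array by hand. The numbers are invented for illustration. The point is that the mean and standard deviation come from the training set only and are then reused for the other sets.

```python
import numpy as np

# Made-up feature values: two columns on very different scales.
train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
valid = np.array([[2.0, 400.0]])

# Statistics are computed on the training set only (the "fit" step).
mean = train.mean(axis=0)
std = train.std(axis=0)

# The same statistics are applied to every set (the "transform" step).
train_scaled = (train - mean) / std
valid_scaled = (valid - mean) / std

print(train_scaled)
print(valid_scaled)
```

After scaling, both training columns have zero mean and unit standard deviation, even though the raw columns differed by a factor of 100.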

After preparing the data, we will now create a model. This model will be a neural network with a specified number of neurons in the first input layer and a single neuron in the last layer as it is a regression problem. This can be achieved by using the TensorFlow library. 

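import tensorflow as tf

n_neurons = 30  # neurons in the hidden layer; made adjustable in the app later

model = tf.keras.models.Sequential([
    # Hidden layer; input_shape is the number of features per sample
    tf.keras.layers.Dense(n_neurons, activation='relu', input_shape=X_train.shape[1:]),
    # Single output neuron with no activation, since this is regression
    tf.keras.layers.Dense(1)
])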
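A quick way to sanity-check this architecture is to count its trainable parameters by hand. The sketch below assumes the 8 input features of the California housing dataset and 30 neurons in the first layer (the default used later in the app). `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units


n_features = 8   # assumed number of input features
n_neurons = 30   # assumed size of the first layer

hidden = dense_params(n_features, n_neurons)
output = dense_params(n_neurons, 1)

print(hidden, output, hidden + output)
```

If the totals reported by `model.summary()` differ from a hand count like this, the input shape or layer sizes are probably not what you intended.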

In TensorFlow, we have to compile the model after building it. Here we have to mention the loss function that we will be using and also the optimizer that we want. We will be using the mean squared error loss function and the SGD optimizer with a specific learning rate. 

l_rate = 0.001  # learning rate; made adjustable in the app later

model.compile(
    loss='mean_squared_error',
    optimizer=tf.keras.optimizers.SGD(l_rate)
)
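To see what these two choices mean, here is a minimal NumPy sketch of the mean squared error and a single gradient descent step on a one-weight linear model. It is an illustration with invented numbers and a full-batch update, not how Keras implements SGD internally.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return float(np.mean((y_true - y_pred) ** 2))


x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])  # generated by y = 2 * x
w = 0.0                        # initial weight
l_rate = 0.01

loss_before = mse(y, w * x)

# Gradient of the MSE with respect to w, then one step against it.
grad = float(np.mean(2 * (w * x - y) * x))
w = w - l_rate * grad

loss_after = mse(y, w * x)
print(loss_before, loss_after)
```

The learning rate controls the size of each step: too small and training is slow, too large and the loss can grow instead of shrink.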

Everything is in place now. All we have to do is train the model. In TensorFlow, this can be done by simply calling the fit() method. We can save all the logs in a variable, history.

n_epochs = 20  # number of passes over the training data

history = model.fit(
    X_train,
    y_train,
    epochs=n_epochs,
    validation_data=(X_valid, y_valid)
)
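The object returned by `fit()` has a `history` attribute: a dictionary that maps each metric name to a list with one value per epoch. With only a loss and validation data, the keys are `loss` and `val_loss`. The sketch below uses a hand-written dictionary with made-up numbers in place of a real training run to show how to find the epoch with the lowest validation loss.

```python
# Stand-in for history.history; the numbers are invented for illustration.
logs = {
    "loss": [1.20, 0.70, 0.55, 0.48, 0.45],
    "val_loss": [0.95, 0.66, 0.58, 0.60, 0.63],
}

# Index of the epoch with the smallest validation loss (0-based).
best_epoch = min(range(len(logs["val_loss"])), key=lambda i: logs["val_loss"][i])

print(f"best epoch: {best_epoch + 1}, val_loss: {logs['val_loss'][best_epoch]:.2f}")
```

A training loss that keeps falling while the validation loss rises, as in these invented numbers, is a typical sign of overfitting.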

After the training, we can also evaluate our model’s loss on the test set by using the evaluate() method. Since we compiled the model without any extra metrics, it returns only the loss.

evaluation = model.evaluate(X_test, y_test)
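Because the loss is a mean squared error, its square root is easier to interpret. The targets in this dataset are median house values expressed in units of $100,000, so the sketch below converts a loss into an approximate error in dollars. The loss value 0.36 is a made-up example, not a result from this model, and `loss_to_dollars` is a hypothetical helper.

```python
import math


def loss_to_dollars(mse_loss, unit=100_000):
    """Convert an MSE on targets measured in `unit` dollars into an RMSE in dollars."""
    return math.sqrt(mse_loss) * unit


example_loss = 0.36  # hypothetical value for illustration
print(round(loss_to_dollars(example_loss)))
```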

Now, if we want to predict house prices using this model, we can do so by calling the predict() method.

# Use the first three (already scaled) test samples as new inputs
X_new = X_test[:3]
predictions = model.predict(X_new)

This is how you can build a simple house price prediction model using TensorFlow, scikit-learn, and pandas. But the problem, as you can see, is that there is no way for other users to interact with this model: it is just a model inside a Jupyter Notebook. So now let’s add in some magic with Streamlit!

Adding the Magic Using Streamlit

To make a Machine Learning web app, you just need to add a few lines of code for Streamlit function calls and that’s it. You don’t need any HTML, CSS, or JavaScript. Just pure Python!

Yes, you read it correctly. You need not worry about anything else. Just install Streamlit onto your system and you’ll be ready to go. Use the following command in your terminals:

pip install streamlit

You can use the following command to explore their hello world app. It is a good example of how web apps built with Streamlit look:

streamlit hello

After installing Streamlit locally and adding the magical lines to your code, you just need to execute the following command to run the app locally:

streamlit run file_name.py

So the question now is, “What are those magical lines of code?” They are quite simple. I’ll first explain the basic functions used in Streamlit and then I’ll show the code so that you can directly relate it with the example. 

Before anything else, we will import the streamlit library by using the following line of code: 

import streamlit as st

The first important feature is that any text you write between triple double quotes is displayed as it is on the web app. It supports Markdown syntax, so you can add headers, bullet points, tables, and much more. You can also use the st.write() function instead of this notation. It has the same functionality.

Next is the with st.echo(): block. It executes the Python code written inside it and also displays that code on the web app. This way we can build a web app that shows how it was built.

st.empty() reserves an area for some dynamic content later on.

st.spinner() shows a loading element while a piece of code takes time to execute.

st.success() shows a message in green, styled as a success dialogue.

st.sidebar places content in a sidebar, which is on the left by default. It is used as a prefix for other calls rather than being called as a function itself.

st.sidebar.slider() provides a slider in the sidebar to choose values from a range of given numbers. st.sidebar.selectbox() allows you to select a value from the given list and st.sidebar.number_input() is for numeric input from the user.

Streamlit has many more wonderful functions and features packed in with it. Some of the features are as follows:

  • Live changes when you save the file
  • Rerun the app by simply pressing R on the keyboard
  • Clear the cache by simply pressing C on the keyboard
  • Record the web app and save a video file locally to share with everyone

…And much more

The Code

import streamlit as st
import altair as alt
import pydeck as pdk

train_area = st.empty()

"""
# California Housing Prices

This is the California Housing Prices dataset which contains data drawn from the 1990 U.S. Census. The following table provides descriptions, data ranges, and data types for each feature in the data set.

## Let's first take a look at imports
"""

with st.echo():
    import tensorflow as tf
    import numpy as np
    import pandas as pd

"""
## Loading the Dataset

We will use scikit-learn's datasets module to load data which is already cleaned for us and only has the numerical features.
"""

with st.echo():
    from sklearn.datasets import fetch_california_housing
    housing = fetch_california_housing()

"""
This will load the entire data in the `housing` variable as you can see below
"""

st.subheader('Input Features')
housing.data

st.subheader('Output Labels')
housing.target

"""
## Splitting the data into Train, Test, and Dev sets

This is one of the most important things at the beginning of any Machine Learning solution as the result of any model can highly depend on how well you have distributed the data into these sets.

Fortunately for us, we have scikit-learn to the rescue where it has become as easy as 2 lines of code.
"""

with st.echo():
    from sklearn.model_selection import train_test_split

    X_train_full, X_test, y_train_full, y_test = train_test_split(
        housing.data, housing.target
    )
    X_train, X_valid, y_train, y_valid = train_test_split(
        X_train_full, y_train_full
    )

"""
The `train_test_split()` function splits the data into 2 sets where the test set is 25% of the total dataset. We have used the same function again on the train_full to split it into train and validation sets. 25% is a default parameter and you can tweak it as per your needs. Take a look at it from the [Scikit-Learn's Documentation](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html).

## Taking a look at the train data

The columns represent the following data:
"""

st.write(housing.feature_names)

"""
Now let's look at the location of the houses by plotting it on the map using Latitude and Longitude values:
"""

with st.echo():
    map_data = pd.DataFrame(
        X_train,
        columns=[
            'MedInc',
            'HouseAge',
            'AveRooms',
            'AveBedrms',
            'Population',
            'AveOccup',
            'latitude',
            'longitude'
        ]
    )

    midpoint = (np.average(map_data["latitude"]), np.average(map_data["longitude"]))

    st.write(pdk.Deck(
        map_style="mapbox://styles/mapbox/light-v9",
        initial_view_state={
            "latitude": midpoint[0],
            "longitude": midpoint[1],
            "zoom": 6,
            "pitch": 75,
        },
        layers=[
            pdk.Layer(
                "HexagonLayer",
                data=map_data,
                get_position=["longitude", "latitude"],
                radius=1000,
                elevation_scale=4,
                elevation_range=[0, 10000],
                pickable=True,
                extruded=True,
            ),
        ],
    ))

"""
**Feel free to zoom in or drag while pressing ALT key to change the 3D viewing angle of the map, as required.**

## Preprocessing

As pointed out earlier, this dataset is already well preprocessed by scikit-learn for us to use directly without worrying about any NaN values and other stuff.

Although, we are going to scale the values in specific ranges by using `StandardScaler` to help our model work efficiently.
"""

with st.echo():
    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_valid = scaler.transform(X_valid)
    X_test = scaler.transform(X_test)

"""
## Creating a model

We will be creating a simple Sequential Model with the first layer containing 30 neurons and the activation function of RELU.

The next layer will be a single neuron layer with no activation function as we want the model to predict a range of values and not just binary or multiclass results like classification problems.
"""

st.sidebar.title('Hyperparameters')

n_neurons = st.sidebar.slider('Neurons', 1, 128, 30)
l_rate = st.sidebar.selectbox('Learning Rate', (0.0001, 0.001, 0.01), 1)
n_epochs = st.sidebar.number_input('Number of Epochs', 1, 50, 20)

# n_neurons, l_rate, and n_epochs are the inputs taken from the user for training the model.
# Their default values are 30 neurons, a learning rate of 0.001 (index 1 of the options),
# and 20 epochs. So at the beginning the model will have 30 neurons in the first layer,
# the learning rate will be 0.001, and the model will train for 20 epochs.

with st.echo():
    import tensorflow as tf

    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(n_neurons, activation='relu', input_shape=X_train.shape[1:]),
        tf.keras.layers.Dense(1)
    ])

"""
## Compiling the model

Tensorflow keras API provides us with the `model.compile()` function to assign the optimizers, loss function and a few other details for the model.
"""

with st.echo():
    model.compile(
        loss='mean_squared_error',
        optimizer=tf.keras.optimizers.SGD(l_rate)
    )

"""
## Training the model

In order to train the model you simply have to call the `fit()` function on the model with training and validation set and a number of epochs you want the model to train for.

**Try playing with the hyperparameters from the sidebar on the left side and click on the `Train Model` button given below to start the training.**
"""

train = st.button('Train Model')

if train:
    with st.spinner('Training Model…'):
        with st.echo():
            model.summary(print_fn=lambda x: st.write("{}".format(x)))
            history = model.fit(
                X_train,
                y_train,
                epochs=n_epochs,
                validation_data=(X_valid, y_valid)
            )

    st.success('Model Training Complete!')

    """
    ## Model Performance
    """

    with st.echo():
        st.line_chart(pd.DataFrame(history.history))

    """
    ## Evaluating the model on the Test set

    Again another important but easy step to do is to evaluate your model on the test data which it has never seen before. Remember that you should only do this after you are sure enough about the model you've built and you should resist making any hyperparameter tuning after evaluating the model on the test set as it would just make it better for the test set and again there will be a generalization problem when the model will see new data in the production phase.
    """

    with st.echo():
        evaluation = model.evaluate(X_test, y_test)
        evaluation

    """
    > This loss on the test set is a little worse than that on the validation set, which is as expected, as the model has never seen the data from the test set.
    """

    """
    ## Predictions using the Model
    """

    with st.echo():
        X_new = X_test[:3]
        predictions = model.predict(X_new)

    """
    ### Predictions
    """

    predictions

    """
    ### Ground Truth
    """

    y_test[:3]

 

This was it! Only a few lines of extra code and you have already built a great web app that looks beautiful and has dynamic content too. It wasn’t that difficult, was it? Try building different projects and using other functions of Streamlit from their documentation. It’s quite easy and intuitive.

Deploying the Streamlit Web App

Streamlit web apps can be deployed for direct use through various options available on the internet. We can go through them briefly and see how it can be done. 

Before moving on to other platforms that can help us deploy our web apps, let’s see what Streamlit itself has to offer. A recent feature release is Streamlit Sharing (since renamed Streamlit Community Cloud). It allows users to deploy their Streamlit web apps in a single click.

If you have your code uploaded to a GitHub repository, you can simply choose the repository from Streamlit’s dashboard and it’ll automatically host it for you. It works remarkably well and is totally free as of now. I have not come across an easier way to deploy machine learning web apps.

Heroku is another good method to deploy the Streamlit web app. This way you won’t have to pick any cloud servers and then set up virtual instances in them. It is all taken care of by Heroku.

Heroku has one drawback: its free version does not allow the app and its packages to exceed 512MB in size. TensorFlow 2.2.0, which I used for the project, is a little bigger than that, so I had to use other services.

AWS (Amazon Web Services) is also a nice way to deploy your Streamlit apps. It is a little bit complex for a beginner but as you use it, it becomes easier to set up. They provide free EC2 instances for new users. You can launch one with Ubuntu 18.04 or higher and install all the dependencies that are required for the app.

After everything is set up, you can run the app by using the command – streamlit run filename.py. Here, you’ll get a public URL that can be shared with everyone. One major drawback here is that the app is not available online if you shut down the instance. So a free instance will have some limitations. 

If you have the code in a GitHub repository, there is another cool way to host your app. It is not a polished or professional option, because users need to have Streamlit installed on their systems too.

If Streamlit is available on the system and you have the link to the Streamlit app’s Python file, then you can run the web app by simply executing the command: streamlit run url. You can check out my app locally if you have installed Streamlit. Use the following command in your terminal:

streamlit run https://raw.githubusercontent.com/tejasmorkar/housing_price_prediction_aws/master/CaliforniaHousingPrices.py

Conclusion

You have seen how simple yet powerful Streamlit is. I haven’t encountered such a tool before that helped me to this extent and made my development life easier. So, this is why I feel that Streamlit is an impressive framework that can help everyone focus on the important parts of Machine Learning development and help us concentrate more on the major learnings of AI and ML. This makes the learning curve much easier and enables us to build and deploy hobby projects easily.

One thing that makes this framework so simple to work with is its official documentation. Everything written in the docs is precise and plain. I suggest that you go through the docs once and try implementing a new project. It is the best way to get started with any new framework. Find the official Streamlit documentation at the following link: https://docs.streamlit.io/en/stable/

The community is always one of the best resources for learning things and finding solutions to our problems. Streamlit has a wonderful discussion forum where you can post any questions regarding the development process of a Streamlit app, any doubts regarding deployment, feature requests, bug reports, and anything else that might help you build your app successfully. Join the discussion forum at the following link: https://discuss.streamlit.io/
