Deep Learning with Keras: Training Neural Network With Keras [With Code]

Keras is a Python library which provides an API to work with Neural Networks and Deep Learning frameworks. Keras provides functions and modules which are extremely handy when dealing with various Deep Learning applications in Python.

By the end of this tutorial, you’ll have the knowledge of the following:

  1. What is Keras?
  2. Keras APIs
  3. Training a Neural Network in Keras

What is Keras?

To do Deep Learning, the most used library earlier was Tensorflow 1.x which was tough to deal with for beginners. It required a lot of coding to just make a basic 1 layer network. However, with Keras, the complete process of making the structure of a Neural Network and then training & tracking it became extremely easy. 

Keras is a high-level API which can run on Tensorflow, Theano and CNTK backend. It gives us the ability to run experiments using neural networks using high-level and user-friendly API. It is also capable of running on CPUs and GPUs.

Keras APIs

Keras has 10 different API modules meant to handle modelling and training the neural networks. Let’s go over each of them to understand what all Keras has to offer.


The Models API provides the functionality to build complex neural networks by adding/removing layers. The model can either be Sequential, meaning, that the layers will be stacked up sequentially with single input and output. The model can also be functional, meaning, with fully customizable models. This is what is used mostly in the industry. 

The API also has the training module which provides methods to compile the model along with the optimizer and the loss function, a method to fit the model, a method to evaluate and to predict on new data as well. Moreover, it also has methods for training, testing and predicting on a batch data as well. Models API also has the saving and serialization of the models as well.


Layers are the building blocks of any neural network. The Layers API offers a complete set of methods for building the neural net architecture. The Layers API has the Base Layer class which contains methods needed to build custom layers with custom weights and initializers.

It contains the Layer Activations class which consists of the various activation functions such as ReLU, Sigmoid, Tanh, Softmax, etc. The Layer Weight Initializers class offers methods to initialize weights using different ways. 

It also consists of the Core Layers class which consists of classes required to add core layers like Dense layer, Activation Layer, Embedding layer, etc. The Convolution Layer class offers various methods to add different types of Convolution layers. The Pooling Layers class contains the methods required for different types of pooling such as Max Pooling, Average Pooling, Global Max Pooling and Global Average Pooling.


Callbacks are a way to track the training process of the model. With Callback enabled, various actions can be performed before or after the end of an epoch or a batch. With callbacks you can:

  • Monitor the training metrics by logging the metrics in TensorFlow Board
  • Saving the model to disk periodically
  • Early stopping in case the loss is not decreasing significantly after a certain epochs
  • View internal states and statistics of a model during training

Dataset Preprocessing

The data usually is in raw format and lying in arranged directories and needs to be preprocessed before feeding to the model for fitting. The Image Data Preprocessing class has many such dedicated functions. For example, the image data needs to be in the numerical array for which we can use the img_to_array function. Or if the images are present inside a directory and subfolders, we can use the image_dataset_from_directory function. 

Data Preprocessing API also has classes for Time Series data and text data as well.


Optimizers are the backbone of any neural network. Every neural network works on optimizing a loss function to find the best weights for prediction. There are multiple kinds of optimizers which follow slightly different techniques to find optimal weights. All these optimizers are available in the Optimizers API – SGD, RMSProp, Adam, Adadelta, Adagrad, Adamax, Nadal, FTRL.


Loss functions are needed to be specified when compiling a model. This loss function would be optimized by the optimizer which was also passed as a parameter in the compile method. The three main loss classes are: Probabilistic Losses, Regression losses, Hinge losses. 


Metrics are used in every ML model to quantify its performance on test data. Metrics are similar to Loss Functions, except that they are used on test data. There are many Accuracy metrics available such as Accuracy, Binary Accuracy, Categorical Accuracy, etc. It also contains Probabilistic metrics such as Binary Cross Entropy, Categorical Cross Entropy, etc. There are metrics for checking false positives/negatives as well such as AUC, Precision, Recall, etc.

Apart from these classification metrics, it also has regression metrics such as Mean Squared Error, Root Mean Squared Error, Mean Absolute Error, etc. 

Also Read: The What’s What of Keras and TensorFlow

Keras Applications

The Keras Applications class consists of some prebuilt models along with pre-trained weights. These pre-trained models are used for the process of Transfer Learning. These pre-trained models are different depending on the architecture, number of layers, trainable weights, etc. Some of them are Xception, VGG16, Resnet50, MobileNet, etc.

Training a Neural Network with Keras

Let’s consider a simple dataset such as MNIST which is available in the datasets class of Keras. We will make a simple sequential Convolutional Neural Network for classifying the handwritten images of digits 0-9. 

#Loading the dataset

from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

Normalizing the dataset by dividing each pixel by 255. Also, it is needed to transform the images into 4 dimensions before feeding it to a CNN model.

x_train = x_train.astype(‘float32’)
x_test = x_test.astype(‘float32’)
x_train /= 255
x_test /= 255

x_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
x_test = X_test.reshape(X_test.shape[0], 28, 28, 1)

We need to label encode the classes before feeding it to the model. We’ll do that by using Keras’ Utils class.

from keras.utils import to_categorical

y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

Now, we can start creating the model using Sequential API.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropoutmodel = Sequential()model.add(Conv2D(filters=32, kernel_size=(5,5), activation=‘relu’, input_shape=x_train.shape[1:]))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation=‘relu’))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dense(256, activation=‘relu’))
model.add(Dense(10, activation=‘softmax’))

In the above code, we declare a sequential model and then add multiple layers to it. The convolutional layer, followed by a Max Pooling layer and then a dropout layer for regularization. Later we flatten the output using the flatten layer and the last layer is a fully connected dense layer with 10 nodes.  

Next, we need to compile it by passing the loss function, the optimizer and the metric.


Now we’d need to add augmented images from the original images to increase the training set and accuracy of the model. We will do that using the ImageDataGenerator function.

from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(

Now that the model is compiled, and images augmented, we can start the training process. As we have used an Image Data Generator above, we will use the fit_generator method rather than just fit.

epochs = 3
batch_size = 32
history = model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size), epochs=epochs,
validation_data=(x_test, y_test), steps_per_epoch=x_train.shape[0]//batch_size


The output of the training process is as follows:

3Now the model is trained and can be evaluated by running it on unseen test data.

Related Read: Deep Learning & Neural Networks with Keras

Before you go

In this tutorial, we saw how well Keras is structured and makes it easy for a complex neural network to be built. Keras is now wrapped under Tensorflow 2.x which gives it even more features. Try out more such examples and explore the functions and features of Keras.

If you’re interested to learn more about machine learning, check out IIIT-B & upGrad’s PG Diploma in Machine Learning & AI which is designed for working professionals and offers 450+ hours of rigorous training, 30+ case studies & assignments, IIIT-B Alumni status, 5+ practical hands-on capstone projects & job assistance with top firms.

Lead the AI Driven Technological Revolution

Learn More

Leave a comment

Your email address will not be published. Required fields are marked *

Our Popular Machine Learning Course

Accelerate Your Career with upGrad