Handwritten Digit Recognition with CNN Using Python

By Rohit Sharma

Updated on Aug 04, 2025 | 9 min read | 2.17K+ views

Table of Contents

What You Should Know Before You Begin
Project Timeline & Difficulty Level
Tools/Libraries Used in This Project:
How the Model Recognizes Handwritten Digits
From Pixels to Predictions: Building a Handwritten Digit Recognition Model
Final Conclusion: What We Learned from the Handwritten Digit Recognition Project

Recognizing handwritten digits is a long-standing challenge in machine learning and computer vision. Humans identify digits easily, but computers struggle with the wide variety of handwriting styles.

In this project, you’ll solve the handwritten digit recognition problem using the MNIST dataset, which contains 70,000 grayscale images of digits from 0 to 9 (60,000 for training and 10,000 for testing), each 28×28 pixels. You will build a Convolutional Neural Network (CNN) to learn and identify these digits accurately.

What You Should Know Before You Begin

It’s good to have some basic knowledge of the following before starting this Handwritten Digit Recognition project:

Python programming (variables, functions, loops, basic syntax)
Pandas and NumPy (for handling and analyzing data)
Matplotlib or Seaborn (for creating charts and visualizing trends)
TensorFlow or PyTorch: Build, train, and evaluate Convolutional Neural Networks (CNNs) using these deep learning libraries.
Convolutional Neural Networks (CNNs): Know the basic structure of CNNs including layers like Conv2D, Pooling, and Dense layers.
Model optimization techniques: Familiarity with data augmentation, overfitting prevention, and hyperparameter tuning will help boost accuracy.

Project Timeline & Difficulty Level

Time required: Around 2 to 3 hours
Difficulty level: Moderate

This project is ideal if you're confident with Python and want hands-on experience applying deep learning to real-world image data. You’ll learn how to build and optimize a CNN for handwritten digit recognition using Machine Learning.

Tools/Libraries Used in This Project:

We will use the following tools and Python libraries to build and evaluate the Handwritten Digit Recognition system:

Tool / Library	Purpose
Python	The core programming language used to write and execute the model pipeline
NumPy	Efficient array operations and numerical computing
TensorFlow / PyTorch	Building, training, and testing Convolutional Neural Networks (CNNs)
Google Colab / Jupyter	An interactive environment to run code and visualize model results

How the Model Recognizes Handwritten Digits

To identify handwritten digits accurately, you'll apply deep learning techniques alongside image processing steps. Here’s what the project focuses on:

Convolutional Neural Networks (CNNs) – Used to extract spatial patterns from handwritten images and classify digits from 0 to 9.
Image Preprocessing – Prepare the data for the model. MNIST images are already grayscale, so this consists of normalization, reshaping, and one-hot encoding of the labels.
Data Augmentation – Improve model generalization by slightly modifying training images (random rotation and zoom).
Model Evaluation – Assess accuracy, loss curves, and sample predictions to fine-tune performance and reduce errors.
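
To make the first item above more concrete, here is a minimal NumPy sketch of what a single convolutional filter, a ReLU, and a 2×2 max-pooling step compute. The toy 6×6 image, the hand-written kernel, and the helper names conv2d_valid and max_pool_2x2 are illustrative assumptions and not part of the project code; Keras learns its filter weights during training rather than using fixed values like these.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding and stride 1."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for r in range(out_h):
        for c in range(out_w):
            # Element-wise product of the window and the kernel, then sum
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    h, w = feature_map.shape
    h, w = h - h % 2, w - w % 2          # drop a trailing odd row/column
    trimmed = feature_map[:h, :w]
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy 6x6 image: dark left half, bright right half (a vertical edge)
image = np.zeros((6, 6), dtype=np.float32)
image[:, 3:] = 1.0

# Hand-written vertical-edge kernel (hypothetical; real filters are learned)
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=np.float32)

feature_map = np.maximum(conv2d_valid(image, kernel), 0)  # ReLU activation
pooled = max_pool_2x2(feature_map)
print(feature_map.shape, pooled.shape)
print(feature_map[0])
```

The feature map responds only in the columns where the window straddles the edge, which is the sense in which a filter "detects" a pattern. Pooling then halves the spatial size while keeping the strongest responses.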

From Pixels to Predictions: Building a Handwritten Digit Recognition Model

This section walks you through each step of building a CNN-based handwritten digit recognition model from the ground up:

Load and Explore the Dataset
Preprocess the image data
Apply data augmentation
Build a Convolutional Neural Network (CNN)
Train and test the model
Evaluate model performance

Without any further delay, let’s get started!

Step 1: Download the Dataset

The code in this walkthrough loads the MNIST dataset directly through keras.datasets.mnist.load_data() in Step 2, so a manual download is optional. If you would rather work from a local copy, search Kaggle for "Handwritten digit recognition using Machine Learning," download the ZIP file, and extract it.

Once the data is available, move to the next step.

Step 2: Import Libraries and Load the Dataset in Any IDE

If you downloaded the dataset files manually and are working in Google Colab, upload them using the code below (you can skip this when using the built-in Keras loader):

from google.colab import files
uploaded = files.upload()

We start by importing important libraries and loading the dataset.


# Import deep learning and utility libraries
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt

# Load the built-in MNIST dataset from TensorFlow
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Display the shape of datasets to understand the structure
print(f"Training data shape: {x_train.shape}")
print(f"Training labels shape: {y_train.shape}")
print(f"Test data shape: {x_test.shape}")

Output :

Training data shape: (60000, 28, 28)

Training labels shape: (60000,)

Test data shape: (10000, 28, 28)

Step 3: Visualize and Preprocess the Data

Before training our model, it's helpful to visualize a few handwritten digits and then preprocess the data so it's ready for input into a Convolutional Neural Network (CNN).

Here is the code for this step :

# Visualize 12 sample images from the training set
plt.figure(figsize=(12, 8))
for i in range(12):
    plt.subplot(3, 4, i+1)
    plt.imshow(x_train[i], cmap='gray')         # Show grayscale image
    plt.title(f'Label: {y_train[i]}')           # Display the digit label
    plt.axis('off')                              # Hide axis for better clarity
plt.tight_layout()
plt.show()

# Define a function to preprocess the data
def preprocess_data(x_train, x_test, y_train, y_test):
    # Scale pixel values to range 0–1
    x_train = x_train.astype('float32') / 255.0
    x_test = x_test.astype('float32') / 255.0

    # Reshape input data to fit CNN: (samples, height, width, channels)
    x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
    x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)

    # Convert labels to one-hot encoded format (for multiclass classification)
    y_train = keras.utils.to_categorical(y_train, 10)
    y_test = keras.utils.to_categorical(y_test, 10)

    return x_train, x_test, y_train, y_test

# Apply preprocessing to the data
x_train, x_test, y_train, y_test = preprocess_data(x_train, x_test, y_train, y_test)

# Check new shape after preprocessing
print(f"Processed training data shape: {x_train.shape}")

Output:

Processed training data shape: (60000, 28, 28, 1)
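
As a quick sanity check of what preprocess_data does, the following NumPy-only sketch applies the same three transformations to a tiny made-up batch. The arrays toy_images and toy_labels are hypothetical stand-ins for MNIST data, and indexing np.eye(10) is used here as an assumed equivalent of keras.utils.to_categorical that produces the same one-hot layout.

```python
import numpy as np

# Hypothetical stand-in for a batch of MNIST images: 4 images, 28x28, uint8 pixels
rng = np.random.default_rng(0)
toy_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
toy_labels = np.array([3, 0, 9, 3])

# 1. Scale pixel values from the 0-255 range to the 0-1 range
scaled = toy_images.astype("float32") / 255.0

# 2. Add a trailing channel axis: (samples, 28, 28) -> (samples, 28, 28, 1)
with_channel = scaled.reshape(scaled.shape[0], 28, 28, 1)

# 3. One-hot encode the labels: row i has a single 1 at column toy_labels[i]
one_hot = np.eye(10, dtype="float32")[toy_labels]

print(with_channel.shape, one_hot.shape)
print(one_hot[0])
```

Taking the argmax of each one-hot row recovers the original integer label, which is exactly how Step 7 converts y_test back into class indices.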

Step 4: Build a Convolutional Neural Network (CNN)

Now we’ll define a basic CNN architecture to classify handwritten digits. The model consists of convolutional and pooling layers followed by dense layers for classification.

Here is the code for building a CNN model:

# Define a function to create a basic CNN model
def create_basic_cnn():
    model = keras.Sequential([
        # First Convolution + Pooling
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
        layers.MaxPooling2D((2, 2)),

        # Second Convolution + Pooling
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),

        # Third Convolution layer (no pooling)
        layers.Conv2D(64, (3, 3), activation='relu'),

        # Flatten the feature maps and add Dense layers
        layers.Flatten(),
        layers.Dense(64, activation='relu'),     # Fully connected layer
        layers.Dropout(0.5),                     # Dropout to reduce overfitting
        layers.Dense(10, activation='softmax')   # Output layer for 10 digit classes
    ])
    return model

# Create the CNN model
model = create_basic_cnn()

# Compile the model with optimizer, loss function, and evaluation metric
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Print a summary of the model architecture
model.summary()

Output:

Layer Type	Output Shape	Parameters
Conv2D	(None, 26, 26, 32)	320
MaxPooling2D	(None, 13, 13, 32)	0
Conv2D	(None, 11, 11, 64)	18,496
MaxPooling2D	(None, 5, 5, 64)	0
Conv2D	(None, 3, 3, 64)	36,928
Flatten	(None, 576)	0
Dense	(None, 64)	36,928
Dropout	(None, 64)	0
Dense (Output)	(None, 10)	650

Total params: 93,322 (364.54 KB)

Trainable params: 93,322 (364.54 KB)

Non-trainable params: 0 (0.00 B)
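
The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below is a plain-Python check, assuming 3×3 kernels with stride 1 and no padding, and 2×2 non-overlapping pooling, which matches the layer arguments used above; the helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output width/height of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of non-overlapping pooling (floor division)."""
    return size // pool

def conv_params(kernel, in_channels, filters):
    """kernel*kernel*in_channels weights per filter, plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(in_features, units):
    """One weight per input/unit pair, plus one bias per unit."""
    return in_features * units + units

# Spatial size through the network: 28 -> 26 -> 13 -> 11 -> 5 -> 3
size = 28
size = conv_out(size)
size = pool_out(size)
size = conv_out(size)
size = pool_out(size)
size = conv_out(size)
flat = size * size * 64          # features entering the first Dense layer

counts = [
    conv_params(3, 1, 32),       # first Conv2D
    conv_params(3, 32, 64),      # second Conv2D
    conv_params(3, 64, 64),      # third Conv2D
    dense_params(flat, 64),      # hidden Dense layer
    dense_params(64, 10),        # output Dense layer
]
print(flat, counts, sum(counts))
```

Pooling, Flatten, and Dropout contribute no parameters, so the five counts add up to the 93,322 total reported above.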

Step 5: Train and Evaluate the CNN Model

Let’s now train the CNN model using the training dataset and visualize its accuracy and loss over the epochs.

Here is the code:

# Train the model on training data for 10 epochs
history = model.fit(
    x_train, y_train,                 # Input images and labels
    batch_size=128,                   # Number of samples processed before model update
    epochs=10,                        # Total training cycles
    validation_data=(x_test, y_test), # Use test set for validation
    verbose=1                         # Display training progress
)

# Function to visualize training and validation performance
def plot_training_history(history):
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))

    # Accuracy over epochs
    ax1.plot(history.history['accuracy'], label='Training Accuracy')
    ax1.plot(history.history['val_accuracy'], label='Validation Accuracy')
    ax1.set_title('Model Accuracy per Epoch')
    ax1.set_xlabel('Epoch')
    ax1.set_ylabel('Accuracy')
    ax1.legend()

    # Loss over epochs
    ax2.plot(history.history['loss'], label='Training Loss')
    ax2.plot(history.history['val_loss'], label='Validation Loss')
    ax2.set_title('Model Loss per Epoch')
    ax2.set_xlabel('Epoch')
    ax2.set_ylabel('Loss')
    ax2.legend()

    plt.tight_layout()
    plt.show()

# Call the function to display accuracy and loss plots
plot_training_history(history)

Output:

Conclusion: The model shows excellent performance, achieving over 98% accuracy with steadily decreasing loss, indicating strong learning and generalization.

Step 6: Improve Model with Data Augmentation and Early Stopping

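To boost performance and reduce overfitting, we apply random rotations and zooms during training, add batch normalization after each convolutional layer, and stop training early once validation accuracy stops improving.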

Here is the Code:


# Step 6: Define a CNN model with Data Augmentation and Batch Normalization
def create_augmented_model():
    model = keras.Sequential([
        # Data Augmentation Layers: Improve generalization by rotating and zooming
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),

        # Convolutional Block 1
        layers.Conv2D(32, (3, 3), activation='relu'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),

        # Convolutional Block 2
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),

        # Convolutional Block 3
        layers.Conv2D(128, (3, 3), activation='relu'),
        layers.BatchNormalization(),

        # Fully Connected Layers
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dropout(0.5),  # Regularization
        layers.Dense(10, activation='softmax')  # Output for 10 digit classes
    ])
    return model

# Create the improved CNN model
improved_model = create_augmented_model()

# Compile with Adam optimizer and categorical cross-entropy loss
improved_model.compile(optimizer='adam',
                       loss='categorical_crossentropy',
                       metrics=['accuracy'])

# Step 6: Train the Model with Early Stopping to avoid overfitting
early_stopping = keras.callbacks.EarlyStopping(
    monitor='val_accuracy',    # Watch validation accuracy
    patience=3,                # Stop after 3 non-improving epochs
    restore_best_weights=True  # Revert to best weights
)

# Train the model
history_improved = improved_model.fit(
    x_train, y_train,
    batch_size=128,
    epochs=15,
    validation_data=(x_test, y_test),
    callbacks=[early_stopping],  # Apply early stopping
    verbose=1
)
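
One caveat about the training calls in Steps 5 and 6: they pass the test set as validation_data, so early stopping picks the best weights using the same images that the final evaluation reports accuracy on, which can make that figure slightly optimistic. A common alternative is to hold out part of the training data for validation instead. The sketch below shows one way to do that with NumPy; the helper name split_train_val, the 10% fraction, the seed, and the placeholder arrays are assumptions, not part of the original notebook.

```python
import numpy as np

def split_train_val(x, y, val_fraction=0.1, seed=42):
    """Shuffle the sample indices once, then set aside `val_fraction` for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_val = int(len(x) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return x[train_idx], y[train_idx], x[val_idx], y[val_idx]

# Demo on placeholder arrays shaped like the preprocessed MNIST data
x_demo = np.zeros((1000, 28, 28, 1), dtype="float32")
y_demo = np.eye(10, dtype="float32")[np.arange(1000) % 10]

x_tr, y_tr, x_val, y_val = split_train_val(x_demo, y_demo)
print(x_tr.shape, x_val.shape)
```

The resulting (x_val, y_val) pair could then be passed as validation_data in fit, leaving x_test untouched until the final evaluation. Keras's fit also accepts a validation_split argument, which holds out the final fraction of the arrays it is given.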

Step 7: Model Evaluation and Testing

This section evaluates the improved CNN model on test data, makes predictions, and visualizes the results to show which digits were predicted correctly or incorrectly.

# Evaluate the improved model on the test set
test_loss, test_accuracy = improved_model.evaluate(x_test, y_test, verbose=0)
print(f"Test Accuracy: {test_accuracy:.4f}")

# Generate predicted probabilities for each digit (0 to 9)
predictions = improved_model.predict(x_test)

# Get the predicted class (highest probability) for each test image
predicted_classes = np.argmax(predictions, axis=1)

# Get the actual class labels from one-hot encoded test labels
true_classes = np.argmax(y_test, axis=1)

# Function to visualize a few predictions from the test set
def visualize_predictions(x_test, true_classes, predicted_classes, num_images=12):
    plt.figure(figsize=(15, 10))

    # Loop through the number of images to display
    for i in range(num_images):
        plt.subplot(3, 4, i + 1)

        # Show the digit image in grayscale
        plt.imshow(x_test[i].reshape(28, 28), cmap='gray')

        # Set title color: green if correct, red if incorrect
        color = 'green' if true_classes[i] == predicted_classes[i] else 'red'

        # Show true and predicted labels
        plt.title(f'True: {true_classes[i]}, Pred: {predicted_classes[i]}', color=color)
        plt.axis('off')

    plt.tight_layout()
    plt.show()

# Call the function to display predictions
visualize_predictions(x_test, true_classes, predicted_classes)

Output:

Conclusion: The final prediction grid shows the model accurately classifies most digits, with only minor errors in visually similar cases.
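
A prediction grid only shows a handful of images. To see which digits get mixed up across the whole test set, a confusion matrix is a common next step. The sketch below builds one with NumPy; the function name confusion_matrix and the short demo label arrays are hypothetical, and in the project you would pass true_classes and predicted_classes from the evaluation code instead.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, num_classes=10):
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat
    np.add.at(matrix, (true_labels, predicted_labels), 1)
    return matrix

# Hypothetical labels for illustration only
true_demo = np.array([4, 9, 4, 7, 1, 9, 7, 4])
pred_demo = np.array([4, 4, 4, 7, 1, 9, 1, 4])

cm = confusion_matrix(true_demo, pred_demo)

# Per-class recall: correct predictions divided by the number of true samples
per_class_recall = cm.diagonal() / np.maximum(cm.sum(axis=1), 1)

print(cm[9, 4], cm[7, 1])            # how often 9 was read as 4, and 7 as 1
print(per_class_recall[[4, 7, 9]])
```

Large off-diagonal entries point to the specific digit pairs the model confuses, which is more actionable than a single accuracy number.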

Final Conclusion: What We Learned from the Handwritten Digit Recognition Project

In this project, we built a complete pipeline for handwritten digit recognition using CNNs. We started by preprocessing the dataset, reshaping and normalizing image data, and converting labels to one-hot encoding.

Our best model achieved high accuracy on unseen test data, correctly identifying most digits. Visualization of predictions made the results easy to interpret. This project reinforced the practical use of CNNs in image classification tasks and highlighted key steps from data handling to model evaluation in computer vision workflows.

Colab Link:
https://colab.research.google.com/drive/1iRXiyBpNrxG2WLqwZHXmUi1RwnLVvj1C?usp=sharing

Frequently Asked Questions (FAQs)

1. What is the objective of the Handwritten Digit Recognition project?

The goal is to build a system that can identify handwritten digits (0–9) from image data using deep learning. This involves training a Convolutional Neural Network (CNN) to learn visual patterns in digit images and classify them correctly.

2. Why are Convolutional Neural Networks (CNNs) effective for this task?

CNNs are built to handle image data. They work by automatically detecting edges, textures, and shapes in an image through convolutional filters. This makes them especially useful for recognizing digits, which differ in size, angle, and handwriting style.

3. What preprocessing steps are necessary before training the model?

Reshape images to (28, 28, 1) so they include the single color channel the model expects
Normalize pixel values (scale them between 0 and 1) to speed up training
One-hot encode labels for multi-class classification
These steps ensure the model can efficiently process and learn from the image data.

4. How did data augmentation help improve the model’s performance?

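Data augmentation increases the variety of training images by applying random transformations like rotation and zoom. Seeing these slightly altered versions helps the model generalize to handwriting it has not seen before.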

5. What results were achieved, and how can they be used practically?

The final CNN model showed high classification accuracy on the test set. It correctly identified most handwritten digits, even with slight distortions or differences in writing style.

Rohit Sharma

840 articles published

Rohit Sharma is the Head of Revenue & Programs (International), with over 8 years of experience in business analytics, EdTech, and program management. He holds an M.Tech from IIT Delhi and specializes...
