Handwritten Digit Recognition with CNN Using Python
By Rohit Sharma
Updated on Jul 28, 2025 | 9 min read | 1.65K+ views
Recognizing handwritten digits is a classic problem in machine learning and computer vision. Humans identify digits easily, but computers have to cope with a wide variety of handwriting styles.
In this project, you’ll solve the handwritten digit recognition problem using the MNIST dataset, which contains 70,000 grayscale images of digits from 0 to 9 (60,000 for training and 10,000 for testing). Copies of the dataset are also available on Kaggle, but the code below loads it directly through Keras. You will build a Convolutional Neural Network (CNN) to learn and identify these digits accurately.
It’s good to have some basic knowledge of the following before starting this handwritten digit recognition project:
This project is ideal if you're confident with Python and want hands-on experience applying deep learning to real-world image data. You’ll learn how to build and optimize a CNN for handwritten digit recognition using Machine Learning.
We will use the following tools and Python libraries to build and evaluate the Handwritten Digit Recognition system:
Tool / Library | Purpose |
Python | The core programming language used to write and execute the model pipeline |
NumPy | Efficient array operations and numerical computing |
TensorFlow / PyTorch | Building, training, and testing Convolutional Neural Networks (CNNs) |
Google Colab / Jupyter | An interactive environment to run code and visualize model results |
To identify handwritten digits accurately, you'll apply deep learning techniques alongside image processing steps. Here’s what the project focuses on:
This section walks you through each step of building a CNN-based handwritten digit recognition model from the ground up:
Without any further delay, let’s get started!
Download the dataset from Kaggle by searching for "Handwritten digit recognition using Machine Learning", downloading the ZIP file, and extracting it. This step is optional, because the code in this walkthrough loads MNIST directly through Keras.
After downloading the dataset, move to the next step.
If you downloaded the files, upload them to Google Colab using the code below:
from google.colab import files
uploaded = files.upload()
We start by importing the required libraries and loading the dataset.
# Import deep learning and utility libraries
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt
import cv2
from sklearn.model_selection import train_test_split
# Load the built-in MNIST dataset from TensorFlow
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Display the shape of datasets to understand the structure
print(f"Training data shape: {x_train.shape}")
print(f"Training labels shape: {y_train.shape}")
print(f"Test data shape: {x_test.shape}")
Output :
Training data shape: (60000, 28, 28)
Training labels shape: (60000,)
Test data shape: (10000, 28, 28)
Before training our model, it's helpful to visualize a few handwritten digits and then preprocess the data so it's ready for input into a Convolutional Neural Network (CNN).
Here is the code for this step :
# Visualize 12 sample images from the training set
plt.figure(figsize=(12, 8))
for i in range(12):
    plt.subplot(3, 4, i+1)
    plt.imshow(x_train[i], cmap='gray')    # Show grayscale image
    plt.title(f'Label: {y_train[i]}')      # Display the digit label
    plt.axis('off')                        # Hide axis for better clarity
plt.tight_layout()
plt.show()
# Define a function to preprocess the data
def preprocess_data(x_train, x_test, y_train, y_test):
    # Scale pixel values to range 0–1
    x_train = x_train.astype('float32') / 255.0
    x_test = x_test.astype('float32') / 255.0
    # Reshape input data to fit CNN: (samples, height, width, channels)
    x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
    x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
    # Convert labels to one-hot encoded format (for multiclass classification)
    y_train = keras.utils.to_categorical(y_train, 10)
    y_test = keras.utils.to_categorical(y_test, 10)
    return x_train, x_test, y_train, y_test

# Apply preprocessing to the data
x_train, x_test, y_train, y_test = preprocess_data(x_train, x_test, y_train, y_test)
# Check new shape after preprocessing
print(f"Processed training data shape: {x_train.shape}")
Output:
Processed training data shape: (60000, 28, 28, 1)
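To make the preprocessing steps concrete, here is a small NumPy-only sketch of what scaling, reshaping, and one-hot encoding do to a toy batch. The toy arrays and the helper name one_hot are illustrative assumptions and are not part of the original notebook; keras.utils.to_categorical produces the same kind of encoding for integer labels.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Hypothetical helper: one-hot encode integer class labels with NumPy."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Toy "images": one all-black and one all-white 28x28 array of uint8 pixels
toy_images = np.stack([
    np.zeros((28, 28), dtype="uint8"),
    np.full((28, 28), 255, dtype="uint8"),
])

# Scale to the 0-1 range, then add a trailing channel dimension
scaled = toy_images.astype("float32") / 255.0
batched = scaled.reshape(scaled.shape[0], 28, 28, 1)

# Encode the (made-up) labels 3 and 7
toy_labels = one_hot([3, 7])

print(batched.shape)                 # batch, height, width, channels
print(batched.min(), batched.max())  # value range after scaling
print(toy_labels[0])                 # a single 1.0 at index 3
```

Each label becomes a length-10 vector, which is the format the categorical cross-entropy loss used later expects.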
Now we’ll define a basic CNN architecture to classify handwritten digits. The model consists of convolutional and pooling layers followed by dense layers for classification.
Here is the code for building a CNN model:
# Define a function to create a basic CNN model
def create_basic_cnn():
    model = keras.Sequential([
        # First Convolution + Pooling
        layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
        layers.MaxPooling2D((2, 2)),
        # Second Convolution + Pooling
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        # Third Convolution layer (no pooling)
        layers.Conv2D(64, (3, 3), activation='relu'),
        # Flatten the feature maps and add Dense layers
        layers.Flatten(),
        layers.Dense(64, activation='relu'),     # Fully connected layer
        layers.Dropout(0.5),                     # Dropout to reduce overfitting
        layers.Dense(10, activation='softmax')   # Output layer for 10 digit classes
    ])
    return model

# Create the CNN model
model = create_basic_cnn()
# Compile the model with optimizer, loss function, and evaluation metric
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# Print a summary of the model architecture
model.summary()
Output:
Layer Type | Output Shape | Parameters |
Conv2D | (None, 26, 26, 32) | 320 |
MaxPooling2D | (None, 13, 13, 32) | 0 |
Conv2D | (None, 11, 11, 64) | 18,496 |
MaxPooling2D | (None, 5, 5, 64) | 0 |
Conv2D | (None, 3, 3, 64) | 36,928 |
Flatten | (None, 576) | 0 |
Dense | (None, 64) | 36,928 |
Dropout | (None, 64) | 0 |
Dense (Output) | (None, 10) | 650 |
Total params: 93,322 (364.54 KB)
Trainable params: 93,322 (364.54 KB)
Non-trainable params: 0 (0.00 B)
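The shapes and parameter counts in the summary can be checked by hand. A 3x3 convolution without padding shrinks each spatial dimension by 2, a 2x2 max-pool halves it (rounding down), and each layer's parameter count is its weights plus its biases. The sketch below reproduces that arithmetic; the helper names are hypothetical and exist only for this illustration.

```python
def conv_out(size, kernel=3):
    """Output width/height of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of a max-pool with matching stride."""
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units

# Follow the spatial size through the basic CNN
size = 28
size = conv_out(size)   # 26
size = pool_out(size)   # 13
size = conv_out(size)   # 11
size = pool_out(size)   # 5
size = conv_out(size)   # 3
flat_units = size * size * 64   # 3 * 3 * 64 = 576

param_counts = [
    conv_params(1, 32),            # 320
    conv_params(32, 64),           # 18,496
    conv_params(64, 64),           # 36,928
    dense_params(flat_units, 64),  # 36,928
    dense_params(64, 10),          # 650
]
total_params = sum(param_counts)

print(flat_units, param_counts, total_params)
```

The total matches the 93,322 parameters reported by the model summary. Pooling, Flatten, and Dropout layers contribute no parameters.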
Let’s now train the CNN model using the training dataset and visualize its accuracy and loss over the epochs.
Here is the code:
# Train the model on training data for 10 epochs
history = model.fit(
    x_train, y_train,                  # Input images and labels
    batch_size=128,                    # Number of samples processed before model update
    epochs=10,                         # Total training cycles
    validation_data=(x_test, y_test),  # Use test set for validation
    verbose=1                          # Display training progress
)

# Function to visualize training and validation performance
def plot_training_history(history):
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
    # Accuracy over epochs
    ax1.plot(history.history['accuracy'], label='Training Accuracy')
    ax1.plot(history.history['val_accuracy'], label='Validation Accuracy')
    ax1.set_title('Model Accuracy per Epoch')
    ax1.set_xlabel('Epoch')
    ax1.set_ylabel('Accuracy')
    ax1.legend()
    # Loss over epochs
    ax2.plot(history.history['loss'], label='Training Loss')
    ax2.plot(history.history['val_loss'], label='Validation Loss')
    ax2.set_title('Model Loss per Epoch')
    ax2.set_xlabel('Epoch')
    ax2.set_ylabel('Loss')
    ax2.legend()
    plt.tight_layout()
    plt.show()

# Call the function to display accuracy and loss plots
plot_training_history(history)
Output:
Conclusion: The model shows excellent performance, achieving over 98% accuracy with steadily decreasing loss, indicating strong learning and generalization.
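Note that the training code passes the test set as validation_data, so the validation curves and the final test score come from the same images. One common alternative, sketched here as an assumption rather than as what the original notebook does, is to hold out part of the training set for validation and keep the test set untouched until the end. The helper name holdout_split, the 10% fraction, and the stand-in arrays are all illustrative.

```python
import numpy as np

def holdout_split(x, y, val_fraction=0.1, seed=42):
    """Hypothetical helper: shuffle once, then split off a validation subset."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(x))
    n_val = int(len(x) * val_fraction)
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    return x[train_idx], y[train_idx], x[val_idx], y[val_idx]

# Stand-in arrays with the same layout as the preprocessed MNIST data
x_demo = np.zeros((1000, 28, 28, 1), dtype="float32")
y_demo = np.zeros((1000, 10), dtype="float32")

x_tr, y_tr, x_val, y_val = holdout_split(x_demo, y_demo)
print(x_tr.shape, x_val.shape)
```

With the real data, the resulting arrays could be passed as validation_data=(x_val, y_val). Keras also accepts a validation_split argument in fit, which holds out the last fraction of the training arrays without shuffling them first.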
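To boost performance and reduce overfitting, the improved model applies random rotations and zooms during training, adds batch normalization after each convolution, and trains with early stopping. In Keras, RandomRotation(0.1) rotates by up to 10% of a full turn (about ±36°) and RandomZoom(0.1) zooms in or out by up to 10%.
Here is the code: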
# Define a CNN model with Data Augmentation and Batch Normalization
def create_augmented_model():
    model = keras.Sequential([
        # Data Augmentation Layers: Improve generalization by rotating and zooming
        # (active only during training; they pass inputs through at inference)
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.1),
        # Convolutional Block 1
        layers.Conv2D(32, (3, 3), activation='relu'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        # Convolutional Block 2
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        # Convolutional Block 3
        layers.Conv2D(128, (3, 3), activation='relu'),
        layers.BatchNormalization(),
        # Fully Connected Layers
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dropout(0.5),                     # Regularization
        layers.Dense(10, activation='softmax')   # Output for 10 digit classes
    ])
    return model

# Create the improved CNN model
improved_model = create_augmented_model()
# Compile with Adam optimizer and categorical cross-entropy loss
improved_model.compile(optimizer='adam',
                       loss='categorical_crossentropy',
                       metrics=['accuracy'])

# Train the model with Early Stopping to avoid overfitting
early_stopping = keras.callbacks.EarlyStopping(
    monitor='val_accuracy',      # Watch validation accuracy
    patience=3,                  # Stop after 3 non-improving epochs
    restore_best_weights=True    # Revert to best weights
)
# Train the model
history_improved = improved_model.fit(
    x_train, y_train,
    batch_size=128,
    epochs=15,
    validation_data=(x_test, y_test),
    callbacks=[early_stopping],  # Apply early stopping
    verbose=1
)
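The patience setting is easier to follow with a small simulation. The function below is a simplified, hypothetical model of how an early-stopping rule with patience=3 behaves on a metric where higher is better; it is not the Keras implementation, and the validation scores are made up for illustration.

```python
def early_stopping_epoch(val_scores, patience=3):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops once `patience` consecutive epochs fail to beat the
    best score seen so far. A tie does not count as an improvement.
    """
    best_score = float("-inf")
    best_epoch = -1
    wait = 0
    for epoch, score in enumerate(val_scores):
        if score > best_score:
            best_score, best_epoch, wait = score, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience was never exhausted: training runs to the last epoch
    return len(val_scores) - 1, best_epoch

# Made-up validation accuracies, one per epoch
made_up_scores = [0.970, 0.982, 0.988, 0.987, 0.986, 0.988, 0.985]
stop_epoch, best_epoch = early_stopping_epoch(made_up_scores)
print(stop_epoch, best_epoch)  # 5 2
```

In this made-up run the score peaks at epoch 2, three epochs follow without improvement, and training stops at epoch 5. With restore_best_weights=True, the model would then be rolled back to its epoch-2 weights.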
This section evaluates the improved CNN model on test data, makes predictions, and visualizes the results to show which digits were predicted correctly or incorrectly.
# Evaluate the improved model on the test set
test_loss, test_accuracy = improved_model.evaluate(x_test, y_test, verbose=0)
print(f"Test Accuracy: {test_accuracy:.4f}")
# Generate predicted probabilities for each digit (0 to 9)
predictions = improved_model.predict(x_test)
# Get the predicted class (highest probability) for each test image
predicted_classes = np.argmax(predictions, axis=1)
# Get the actual class labels from one-hot encoded test labels
true_classes = np.argmax(y_test, axis=1)

# Function to visualize a few predictions from the test set
def visualize_predictions(x_test, true_classes, predicted_classes, num_images=12):
    plt.figure(figsize=(15, 10))
    # Loop through the number of images to display
    for i in range(num_images):
        plt.subplot(3, 4, i + 1)
        # Show the digit image in grayscale
        plt.imshow(x_test[i].reshape(28, 28), cmap='gray')
        # Set title color: green if correct, red if incorrect
        color = 'green' if true_classes[i] == predicted_classes[i] else 'red'
        # Show true and predicted labels
        plt.title(f'True: {true_classes[i]}, Pred: {predicted_classes[i]}', color=color)
        plt.axis('off')
    plt.tight_layout()
    plt.show()

# Call the function to display predictions
visualize_predictions(x_test, true_classes, predicted_classes)
Output:
Conclusion: The final prediction grid shows that the model classifies most digits accurately, with only minor errors in visually similar cases.
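A 12-image grid only shows a handful of test samples. To see which digits get confused with which across a whole set, one option is a confusion matrix. The sketch below builds one with NumPy; the helper name confusion_matrix_counts and the six demo labels are assumptions made for illustration, not results from the model.

```python
import numpy as np

def confusion_matrix_counts(true_labels, predicted_labels, num_classes=10):
    """Hypothetical helper: rows are true digits, columns are predicted digits."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat
    np.add.at(matrix, (np.asarray(true_labels), np.asarray(predicted_labels)), 1)
    return matrix

# Made-up labels: a 9 mistaken for a 4 and a 7 mistaken for a 1
demo_true = np.array([4, 9, 9, 7, 1, 4])
demo_pred = np.array([4, 4, 9, 1, 1, 4])

matrix = confusion_matrix_counts(demo_true, demo_pred)
accuracy = np.diag(matrix).sum() / matrix.sum()
misclassified = np.flatnonzero(demo_true != demo_pred)

print(matrix[9, 4], matrix[7, 1])  # off-diagonal cells count specific confusions
print(misclassified)               # indices of the wrongly predicted samples
```

With the real results, passing true_classes and predicted_classes gives a 10x10 matrix whose diagonal holds the correct predictions. The misclassified indices could also be used to plot only the wrongly predicted images instead of the first twelve.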
In this project, we built a complete pipeline for handwritten digit recognition using CNNs. We started by preprocessing the dataset, reshaping and normalizing image data, and converting labels to one-hot encoding.
Our best model achieved high accuracy on unseen test data, correctly identifying most digits. Visualization of predictions made the results easy to interpret. This project reinforced the practical use of CNNs in image classification tasks and highlighted key steps from data handling to model evaluation in computer vision workflows.
Colab Link:
https://colab.research.google.com/drive/1iRXiyBpNrxG2WLqwZHXmUi1RwnLVvj1C?usp=sharing
Rohit Sharma is the Head of Revenue & Programs (International), with over 8 years of experience in business analytics, EdTech, and program management. He holds an M.Tech from IIT Delhi and specializes...