What do you see first when you look at a selfie? Your face, right? You can spot your face because your brain can identify it and separate it from the rest of the image (the background).
Now, if you wanted your computer to recognize your face in a selfie, would it be able to do that?
Yes, provided it can perform image segmentation.
In today’s article, we’ll discuss image segmentation and all of its major aspects including the various image segmentation techniques you can use. However, it’s a long read so we recommend bookmarking this article so you can come back to it later.
Before we start discussing the various techniques for segmentation in image processing, we should first figure out, “What is image segmentation?”
What is Image Segmentation?
Image segmentation is a branch of digital image processing that focuses on partitioning an image into different parts according to their features and properties. The primary goal of image segmentation is to simplify the image for easier analysis. In image segmentation, you divide an image into various parts that have similar attributes. The parts into which you divide the image are called image objects.
It is the first step in image analysis; without segmentation, most computer vision implementations would be nearly impossible.
By using image segmentation techniques, you can divide and group specific pixels in an image, assign them labels, and classify further pixels according to those labels. You can draw lines, specify borders, and separate particular objects (important components) in an image from the rest of the objects (unimportant components).
In machine learning, you can use the labels you generated from image segmentation for supervised and unsupervised training. This would allow you to solve many business problems.
An example will make it easier to understand how image segmentation works.
Look at the following image.
Here, you can see a chair placed in the middle of a road. By using image segmentation, you can separate the chair from the rest of the image. Moreover, different image segmentation techniques give you different results. For example, in an image containing multiple chairs, semantic segmentation would label all of them as a single "chair" class.
On the other hand, if you wanted to identify every individual chair present in such an image, you'd have to use instance segmentation:
Why is Image Segmentation Necessary?
Image segmentation is a large aspect of computer vision and has many applications in numerous industries. Some of the notable areas where image segmentation is used profusely are:
1. Face Recognition
The facial recognition technology present in your iPhone and in advanced security systems uses image segmentation to identify your face. The system must be able to identify your face's unique features so that no unwanted party can access your phone or system.
2. Number Plate Identification
Many traffic lights and cameras use number plate identification to charge fines and help with searches. Number plate identification technology allows a traffic system to recognize a car and get its ownership-related information. It uses image segmentation to separate a number plate and its information from the rest of the objects present in its vision. This technology has simplified the fining process considerably for governments.
3. Image-Based Search
Google and other search engines that offer image-based search facilities use image segmentation techniques to identify the objects present in your image and compare their findings with the relevant images they find to give you search results.
4. Medical Imaging
In the medical sector, we use image segmentation to locate and identify cancer cells, measure tissue volumes, run virtual surgery simulations, and perform intra-surgery navigation. Image segmentation has many applications in the medical sector. It helps in identifying affected areas and planning treatments for them.
Apart from these applications, image segmentation has uses in manufacturing, agriculture, security, and many other sectors. As our computer vision technologies become more advanced, the uses of image segmentation techniques will increase accordingly.
For example, some manufacturers have started using image segmentation techniques to find faulty products. Here, the algorithm would capture only the necessary components from the object’s image and classify them as faulty or optimal. This system reduces the risk of human errors and makes the testing process more efficient for the organization.
What are the Different Kinds of Image Segmentations?
Image segmentation is a very broad topic and has different ways to go about the process. We can classify image segmentation according to the following parameters:
1. Approach-Based Classification
In its most basic sense, image segmentation is object identification. An algorithm cannot classify the different components without identifying an object first. From simple to complicated implementations, all image segmentation methods work on the basis of object identification.
So, we can classify image segmentation methods based on the way algorithms identify objects, which means, collecting similar pixels and separating them from dissimilar pixels. There are two approaches to performing this task:
Region-based Approach (Detecting Similarity)
In this method, you detect similar pixels in the image according to a selected threshold and then group them through techniques such as region merging and region growing. Clustering and similar machine learning algorithms use this method to detect unknown features and attributes. Classification algorithms follow this approach for detecting features and separating image segments according to them.
Boundary-based Approach (Detecting Discontinuity)
The boundary-based approach is the opposite of the region-based approach for object identification. Unlike region-based detection, where you find pixels having similar features, you find pixels that are dissimilar to each other in the boundary-based approach. Point Detection, Edge Detection, Line Detection, and similar algorithms follow this method where they detect the edge of dissimilar pixels and separate them from the rest of the image accordingly.
2. Technique-Based Classification
Both of the approaches have their distinct image segmentation techniques. We use these techniques according to the kind of image we want to process and analyse and the kind of results we want to derive from it.
Based on these parameters, we can divide image segmentation algorithms into the following categories:
Structural Techniques
These algorithms require you to have the structural data of the image you are using. This includes the pixels, distributions, histograms, pixel density, colour distribution, and other relevant information. You must also have the structural data of the region you want to separate from the image.
You’ll need that information so your algorithm can identify the region. The algorithms we use for these implementations follow the region-based approach.
Stochastic Techniques
These algorithms require information about the discrete pixel values of the image, instead of the structure of the required section of the image. Due to this, they don't require a lot of prior information to perform image segmentation and are useful when you have to work with multiple images. Machine learning algorithms such as k-means clustering and ANN-based algorithms fall into this category.
Hybrid Techniques
As you can guess from the name, these algorithms use both stochastic and structural methods. This means they use the structural information of the required region and the discrete pixel information of the whole image to perform image segmentation.
What are the Different Types of Image Segmentation Techniques?
Now that we know the different approaches and kinds of techniques for image segmentation, we can start discussing the specifics. Following are the primary types of image segmentation techniques:
- Thresholding Segmentation
- Edge-Based Segmentation
- Region-Based Segmentation
- Watershed Segmentation
- Clustering-Based Segmentation Algorithms
- Neural Networks for Segmentation
Let’s discuss each one of these techniques in detail to understand their properties, benefits, and limitations:
1. Thresholding Segmentation
The simplest method for segmentation in image processing is the threshold method. It divides the pixels in an image by comparing the pixel’s intensity with a specified value (threshold). It is useful when the required object has a higher intensity than the background (unnecessary parts).
You can consider the threshold value (T) to be a constant, but that would only work if the image has very little noise (unnecessary information and data). You can keep the threshold value constant or dynamic according to your requirements.
The thresholding method converts a grey-scale image into a binary image by dividing it into two segments (required and not required sections).
According to the different threshold values, we can classify thresholding segmentation in the following categories:
Simple Thresholding
In this method, you replace the image's pixels with either white or black. If the intensity of a pixel at a particular position is less than the threshold value, you replace it with black; if it's higher than the threshold, you replace it with white. This is simple thresholding, and it is particularly suitable for beginners in image segmentation.
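As an illustrative sketch, simple thresholding needs nothing more than one comparison per pixel. Here is a minimal NumPy version (the 6×6 test array and the threshold of 128 are made-up example values):

```python
import numpy as np

def simple_threshold(image, t=128):
    """Return a binary image: pixels >= t become white (255), the rest black (0)."""
    return np.where(image >= t, 255, 0).astype(np.uint8)

# A tiny synthetic grayscale "image": a bright square on a dark background.
img = np.zeros((6, 6), dtype=np.uint8)
img[2:4, 2:4] = 200

binary = simple_threshold(img, t=128)
# The four bright pixels become 255, everything else 0.
```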
Otsu's Binarization
In simple thresholding, you picked a constant threshold value and used it to perform image segmentation. However, how do you determine that the value you chose was the right one? The straightforward approach is to test different values and choose one, but it is not the most efficient.
Take an image with a histogram having two peaks, one for the foreground and one for the background. By using Otsu binarization, you can take the approximate value of the middle of those peaks as your threshold value.
In Otsu binarization, you calculate the threshold value from the image’s histogram if the image is bimodal.
This process is quite popular for scanning documents, recognizing patterns, and removing unnecessary colours from a file. However, it has limitations: you can't use it for images that are not bimodal, i.e. whose histograms don't have two distinct peaks.
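Otsu's method can be sketched directly from the histogram: try every possible threshold and keep the one that maximises the between-class variance. Below is a minimal NumPy illustration (the synthetic bimodal data is made up for the example; real libraries such as OpenCV implement this more efficiently):

```python
import numpy as np

def otsu_threshold(image):
    """Pick the threshold that maximises between-class variance (Otsu's method)."""
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()   # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0        # class means
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Bimodal data: dark values around 50, bright values around 200.
rng = np.random.default_rng(0)
img = np.concatenate([
    rng.normal(50, 5, 500), rng.normal(200, 5, 500)
]).clip(0, 255).astype(np.uint8)

t = otsu_threshold(img)
# t lands in the valley between the two histogram peaks.
```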
Adaptive Thresholding
Having one constant threshold value might not be a suitable approach for every image. Different images have different backgrounds and conditions that affect their properties.
Thus, instead of using one constant threshold value for performing segmentation on the entire image, you can keep the threshold value variable. In this technique, you’ll keep different threshold values for different sections of an image.
This method works well with images that have varying lighting conditions. You’ll need to use an algorithm that segments the image into smaller sections and calculates the threshold value for each of them.
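A crude way to sketch adaptive thresholding is to split the image into tiles and threshold each tile against its own local mean (production implementations such as OpenCV's `adaptiveThreshold` use a sliding per-pixel neighbourhood instead; the tile size and test image below are made-up example values):

```python
import numpy as np

def adaptive_threshold(image, block=4, c=0):
    """Threshold each block x block tile against its own local mean minus c."""
    out = np.zeros_like(image, dtype=np.uint8)
    h, w = image.shape
    for i in range(0, h, block):
        for j in range(0, w, block):
            tile = image[i:i+block, j:j+block]
            t = tile.mean() - c                      # local threshold
            out[i:i+block, j:j+block] = np.where(tile >= t, 255, 0)
    return out

# Uneven illumination: dim left half, bright right half, each with its own object.
img = np.zeros((4, 8), dtype=np.uint8)
img[:, :4] = 30           # dim background
img[1:3, 1:3] = 60        # dim object
img[:, 4:] = 150          # bright background
img[1:3, 5:7] = 220       # bright object

binary = adaptive_threshold(img, block=4)
# Both objects are recovered, even though a single global threshold
# (e.g. 100) would have blacked out the entire dim half.
```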
2. Edge-Based Segmentation
Edge-based segmentation is one of the most popular implementations of segmentation in image processing. It focuses on identifying the edges of different objects in an image. This is a crucial step as it helps you find the features of the various objects present in the image as edges contain a lot of information you can use.
Edge detection is widely popular because it helps you remove unwanted and unnecessary information from the image. It reduces the image's size considerably, making it easier to analyse.
Algorithms used in edge-based segmentation identify edges in an image according to the differences in texture, contrast, grey level, colour, saturation, and other properties. You can improve the quality of your results by connecting all the edges into edge chains that match the image borders more accurately.
There are many edge-based segmentation methods available. We can divide them into two categories:
Search-Based Edge Detection
Search-based edge detection methods focus on computing a measure of edge strength and look for local directional maxima of the gradient magnitude through a computed estimate of the edge’s local orientation.
Zero-Crossing Based Edge Detection
Zero-crossing based edge detection methods look for zero crossings in a derivative expression retrieved from the image to find the edges.
Typically, you’ll have to pre-process the image to remove unwanted noise and make it easier to detect edges. Canny, Prewitt, Deriche, and Roberts cross are some of the most popular edge detection operators. They make it easier to detect discontinuities and find the edges.
In edge-based detection, your goal is to get at least a partial segmentation, in which you group all the local edges into a new binary image. In this binary image, the edge chains must match the existing components of the image in question.
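To make the notion of edge strength concrete, here is a naive NumPy sketch of the Sobel operator, a close relative of the Prewitt operator mentioned above. It computes the gradient magnitude at every interior pixel; the step-edge test image is a made-up example:

```python
import numpy as np

def sobel_edges(image):
    """Gradient magnitude via the Sobel operators (an edge-strength map)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)  # horizontal gradient
    ky = kx.T                                                          # vertical gradient
    h, w = image.shape
    img = image.astype(float)
    mag = np.zeros((h, w))
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            patch = img[i-1:i+2, j-1:j+2]
            gx = (patch * kx).sum()
            gy = (patch * ky).sum()
            mag[i, j] = np.hypot(gx, gy)
    return mag

# A vertical step edge: strong response along the boundary, none in flat regions.
img = np.zeros((5, 6), dtype=np.uint8)
img[:, 3:] = 200
edges = sobel_edges(img)
# edges is large only in the columns straddling the step.
```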
3. Region-Based Segmentation
Region-based segmentation algorithms divide the image into sections with similar features. These regions are just groups of pixels, and the algorithm finds them by first locating a seed point, which could be a small section or a large portion of the input image.
After finding the seed points, a region-based segmentation algorithm would either add more pixels to them or shrink them so it can merge them with other seed points.
Based on these two methods, we can classify region-based segmentation into the following categories:
Region Growing
In this method, you start with a small set of pixels and iteratively merge in more pixels according to particular similarity conditions. A region growing algorithm picks an arbitrary seed pixel in the image, compares it with the neighbouring pixels, and starts growing the region by finding matches to the seed point.
When a particular region can't grow further, the algorithm picks another seed pixel, one that doesn't yet belong to any existing region. A single region could otherwise absorb too many dissimilar attributes and take over most of the image; to avoid this, region growing algorithms grow multiple regions simultaneously.
You should use region growing algorithms for images that have a lot of noise as the noise would make it difficult to find edges or use thresholding algorithms.
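A single region growing pass can be sketched as a breadth-first flood from a seed pixel, accepting 4-connected neighbours whose intensity stays within a tolerance of the seed's intensity (the tolerance and noisy test image below are arbitrary example values):

```python
import numpy as np
from collections import deque

def region_grow(image, seed, tol=10):
    """Grow a region from `seed`: accept 4-connected neighbours whose
    intensity is within `tol` of the seed pixel's intensity."""
    h, w = image.shape
    seed_val = float(image[seed])
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        i, j = queue.popleft()
        for ni, nj in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
            if 0 <= ni < h and 0 <= nj < w and not mask[ni, nj] \
                    and abs(float(image[ni, nj]) - seed_val) <= tol:
                mask[ni, nj] = True
                queue.append((ni, nj))
    return mask

# A noisy bright blob on a noisy dark background: edges are fuzzy,
# but growing from a seed inside the blob still recovers it.
rng = np.random.default_rng(1)
img = rng.integers(0, 20, (8, 8)).astype(np.uint8)
img[2:6, 2:6] = rng.integers(190, 210, (4, 4))

region = region_grow(img, seed=(3, 3), tol=30)
# region is True exactly on the 4x4 bright blob.
```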
Region Splitting and Merging
As the name suggests, a region splitting and merging focused method would perform two actions together – splitting and merging portions of the image.
It would first split the image into regions that have similar attributes and then merge the adjacent portions that are similar to one another. In region splitting, the algorithm considers the entire image, while in region growing, the algorithm focuses on a particular point.
The region splitting and merging method follows a divide and conquer methodology. It divides the image into different portions and then matches them according to its predetermined conditions. Another name for the algorithms that perform this task is split-merge algorithms.
4. Watershed Segmentation
In image processing, a watershed is a transformation on a grayscale image; the name refers to the geological watershed, or drainage divide. A watershed algorithm treats the image as if it were a topographic map: it considers the brightness of a pixel as its height and finds the lines that run along the tops of the ridges.
Watershed has many technical definitions and has several applications. Apart from identifying the ridges of the pixels, it focuses on defining basins (the opposite of ridges) and floods the basins with markers until they meet the watershed lines going through the ridges.
As basins have a lot of markers while the ridges don’t, the image gets divided into multiple regions according to the ‘height’ of every pixel.
The watershed method converts every image into a topographical map, reflecting the topography through the grey values of its pixels.
Now, a landscape with valleys and ridges would certainly have three-dimensional aspects. The watershed would consider the three-dimensional representation of the image and create regions accordingly, which are called “catchment basins”.
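The flooding idea described above can be sketched with a priority queue: pixels are processed from the "lowest" (darkest) upwards, and each marked basin's label spreads outwards until it runs into another basin. This is a simplified, marker-based sketch, not a full watershed-line implementation (the two-basin test image is made up):

```python
import heapq
import numpy as np

def watershed(image, markers):
    """Marker-based watershed sketch: flood labelled basins in order of
    pixel 'height' (intensity); unlabelled pixels take the label of the
    basin that reaches them first."""
    h, w = image.shape
    labels = markers.copy()
    heap = []
    for i in range(h):
        for j in range(w):
            if labels[i, j] > 0:
                heapq.heappush(heap, (int(image[i, j]), i, j))
    while heap:
        _, i, j = heapq.heappop(heap)
        for ni, nj in ((i-1, j), (i+1, j), (i, j-1), (i, j+1)):
            if 0 <= ni < h and 0 <= nj < w and labels[ni, nj] == 0:
                labels[ni, nj] = labels[i, j]
                heapq.heappush(heap, (int(image[ni, nj]), ni, nj))
    return labels

# Two dark 'valleys' separated by a bright ridge down the middle column.
img = np.full((5, 7), 50, dtype=np.uint8)
img[:, 3] = 200                      # the ridge
markers = np.zeros((5, 7), dtype=np.int32)
markers[2, 0] = 1                    # seed in the left basin
markers[2, 6] = 2                    # seed in the right basin

labels = watershed(img, markers)
# The left basin floods with label 1, the right with label 2;
# the two floods meet only at the ridge.
```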
It has many applications in the medical sector such as MRI, medical imaging, etc. Watershed segmentation is a prominent part of medical image segmentation so if you want to enter that sector, you should focus on learning this method for segmentation in image processing particularly.
5. Clustering-Based Segmentation Algorithms
If you've studied classification algorithms, you must have come across clustering algorithms. They are unsupervised algorithms that help you find structure in the image that might not be visible to the naked eye, such as clusters, shapes, and shadings.
As the name suggests, a clustering algorithm divides the image into clusters (disjoint groups) of pixels that have similar features. It would separate the data elements into clusters where the elements in a cluster are more similar in comparison to the elements present in other clusters.
Some of the popular clustering algorithms include fuzzy c-means (FCM), k-means, and improved k-means algorithms. In image segmentation, you’d mostly use the k-means clustering algorithm as it’s quite simple and efficient. On the other hand, the FCM algorithm puts the pixels in different classes according to their varying degrees of membership.
The most important clustering algorithms for segmentation in image processing are:
K-Means Clustering
K-means is a simple unsupervised machine learning algorithm. It classifies an image's pixels into a specific number of clusters (k). It starts by picking k initial centroids, then assigns each pixel to the cluster whose centroid is nearest. Once all pixels have been assigned, it recomputes each centroid as the mean of its cluster and repeats the process until the assignments stabilise.
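The assign-then-recompute loop can be sketched in a few lines of NumPy on pixel intensities. To keep the example deterministic, the centroids are initialised evenly between the darkest and brightest pixel (a simplification; real k-means usually uses random or k-means++ initialisation):

```python
import numpy as np

def kmeans_pixels(image, k=2, iters=10):
    """Cluster pixel intensities into k groups; return per-pixel labels
    and the sorted cluster centres."""
    pixels = image.reshape(-1, 1).astype(float)
    # Deterministic init: centroids spread evenly over the intensity range.
    centroids = np.linspace(pixels.min(), pixels.max(), k).reshape(-1, 1)
    for _ in range(iters):
        dist = np.abs(pixels - centroids.T)        # (n_pixels, k) distances
        labels = dist.argmin(axis=1)               # assign to nearest centroid
        for c in range(k):
            if (labels == c).any():
                centroids[c] = pixels[labels == c].mean()  # recompute centre
    return labels.reshape(image.shape), np.sort(centroids.ravel())

# Dark background (20) with a bright 3x3 square (230).
img = np.full((6, 6), 20, dtype=np.uint8)
img[2:5, 2:5] = 230

labels, centers = kmeans_pixels(img, k=2)
# The two clusters recover the background and the bright square exactly.
```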
Fuzzy C Means
With the fuzzy c-means clustering method, the pixels in the image can get clustered in multiple clusters. This means a pixel can belong to more than one cluster. However, every pixel would have varying levels of similarities with every cluster. The fuzzy c-means algorithm has an optimization function which affects the accuracy of your results.
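The fuzzy c-means update can also be sketched in NumPy: each point's membership in a cluster is inversely related to its distance to that cluster's centre, and centres are recomputed as membership-weighted means. Here m is the standard "fuzziness" exponent, and the 1-D intensity data is a made-up example:

```python
import numpy as np

def fuzzy_cmeans(data, c=2, m=2.0, iters=20):
    """Fuzzy c-means on 1-D data: each point gets a membership degree in
    [0, 1] for every cluster, rather than a single hard label."""
    # Deterministic init: centres spread evenly over the data range.
    centers = np.linspace(data.min(), data.max(), c)
    for _ in range(iters):
        dist = np.abs(data[:, None] - centers[None, :]) + 1e-9
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)   # memberships sum to 1 per point
        um = u ** m
        centers = (um * data[:, None]).sum(axis=0) / um.sum(axis=0)
    return u, centers

# Intensities drawn from two groups: around 30 and around 200.
data = np.array([28., 30., 32., 198., 200., 202.])
u, centers = fuzzy_cmeans(data, c=2)
# Each point ends up with a near-1 membership in its own group's cluster
# and a near-0 membership in the other.
```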
Clustering algorithms can take care of most of your image segmentation needs. If you want to learn more about them, check out this guide on what is clustering and the different types of clustering algorithms.
6. Neural Networks for Segmentation
Perhaps you don’t want to do everything by yourself. Perhaps you want to have an AI do most of your tasks, which you can certainly do with neural networks for image segmentation.
You'd use AI to analyse an image and identify its different components such as faces, objects, text, etc. Convolutional Neural Networks are quite popular for image segmentation because they can identify and process image data quickly and efficiently.
The experts at Facebook AI Research (FAIR) created a deep learning architecture called Mask R-CNN, which can generate a pixel-wise mask for every object present in an image. It is an enhanced version of the Faster R-CNN object detection architecture. Faster R-CNN outputs two pieces of data for every object in an image: the bounding box coordinates and the object's class. Mask R-CNN adds a third output to this process: the object's segmentation mask.
In this process, you'd first pass the input image to a ConvNet, which generates the feature map for the image. Then the system applies a region proposal network (RPN) to the feature maps and generates the object proposals with their objectness scores.
After that, an RoI pooling layer is applied to the proposals to bring them all down to the same size. In the final stage, the system passes the proposals to a fully connected layer for classification and generates the output with the bounding boxes (and masks) for every object.
Learn More About Segmentation in Image Processing
Segmentation in image processing is certainly a broad topic with a lot of sub-sections. From various image segmentation techniques to algorithms, there’s a whole lot to learn in this discipline. With so much ground to cover, you can easily get lost and confused.
That’s why we recommend taking a course in machine learning and AI to overcome these issues. A course in this subject would teach you the basics as well as the advanced concepts of image segmentation and the related sectors. You will learn about the different machine learning concepts related to image processing, image segmentation, and computer vision.
AI & ML Courses will make it easier for you to learn all the relevant concepts because you’ll get a structured curriculum to study from. At upGrad, we offer multiple courses in machine learning.
Following are the primary courses we offer in machine learning and AI:
- PG Diploma in Machine Learning and AI
- Master of Science in Machine Learning & AI
- Executive Post-Graduate Programme in Machine Learning and Artificial Intelligence
- Master of Science in Machine Learning & Artificial Intelligence
- PG Certification in Machine Learning and Deep Learning
- PG Certification in Machine Learning and NLP
All of these courses allow you to learn from industry experts who resolve your doubts and answer your questions in live sessions. You will study online, which means you won't have to travel anywhere or interrupt your job while taking these courses.
These courses give you access to upGrad’s Student Success Corner which offers many additional advantages including personalized resume feedback, interview preparation, and career counselling. By the end of the course, you’ll be a job-ready AI/ML professional equipped with all the necessary soft and hard skills.
Image segmentation is certainly a complicated and advanced topic. All the various image segmentation techniques we discussed in this article have their specific advantages and limitations. By getting familiar with them, you will get an idea of where you should use one and where you should avoid using the other.
With these newly learnt skills, you can get active on competitive platforms to test yourself and gain even more hands-on experience.