Homebreadcumb forward arrow iconBlogbreadcumb forward arrow iconArtificial Intelligencebreadcumb forward arrow iconOne-Shot Learning with Siamese Network [For Facial Recognition]

One-Shot Learning with Siamese Network [For Facial Recognition]

Last updated:
17th Jun, 2023
Read Time
7 Mins
share image icon
In this article
Chevron in toc
View All
One-Shot Learning with Siamese Network [For Facial Recognition]

The following article talks about the need for using One-shot learning along with its variations and drawbacks.

To begin with, in order to train any deep learning model, we need a large amount of data so that our model performs the desired prediction or classification task efficiently. For instance, detecting a dog from images will require you to train a neural network model on hundreds and thousands of dog and non-dog images for it to accurately distinguish one from the other. However, this neural network model will fail to work if it is trained on one or very few training data. 

With the lack of data, extracting relevant features at different layers becomes difficult. The model will not be able to generalize well between different classes thereby affecting its overall performance.

For illustration, consider the example of facial recognition at an airport. In this, we do not have the liberty to train our model of hundreds and thousands of images of each person containing different expressions, background lighting et al. With more than thousands of passengers arriving daily it is an impossible task! Besides, storing such a huge chunk of data adds up to the cost. 

Ads of upGrad blog

To tackle the above problem, we use a technique in which classification or categorization tasks can be achieved with one or a few examples to classify many new examples. This technique is called One-shot learning. 

In recent years One-shot learning technology is being used extensively in facial recognition and passport checks. The concept being used is- The model takes input 2 images; one being the image from the passport and the other being the image of the person looking at the camera. The model then outputs a value which is the similarity between the 2 images. If the value of the output is low then the two images are similar else they are different.

Siamese Network

The architecture used for One-shot learning is called the Siamese Network. This architecture comprises two parallel neural networks with each taking different input. The output of the model is a value or a similarity index which indicates whether the two input images are alike or not. A value below a pre-defined threshold corresponds to the high similarity between the two images and visa versa. 

When the images are passed a series of Convolutional layers, max-pooling layers, and fully connected layers what we achieve is a vector that encodes the features of the images. Here because we input two images, two vectors encompassing the features of the input images will be generated. The value which we were talking about is the distance between the two feature vectors which can be calculated by finding the norm of the difference between the two vectors. 

Advantages and Disadvantages of Siamese Networks

As one of the matching networks for one shot learning, when working with SNN, you should remember these pros and cons.

Advantages of SNNs

  • Siamese networks demonstrate much higher speed and accuracy when identifying faces, images, and more such similarities than other neural networks.
  • You do not have to retrain Siamese networks to detect new classes after initially training them to work with large datasets. That is not possible with other neural networks, which have to be completely retrained.
  • Models can display improved generalization performance when both outputs are based on the same parameters, especially when the model is dealing with objects that are similar but not identical.

Trending Machine Learning Skills

Drawbacks of SNNs

  • The main challenge you will face with Siamese networks is that it needs higher computational power to work on twice as many operations required to train two models compared to other CNNs.
  • Siamese networks have a huge memory requirement.
  • SNNs also take much longer to train since they learn by comparing pairs of items.

Triplet loss function

As the name suggests, to train the model we require three images- one anchor (A) image, one positive (P), and one negative (N) image. Since two inputs can be provided to the model, an anchor image with either a positive or negative image is given. The model learns the parameter in such a fashion that the distance between the anchor image and the positive image is low while the distance between the anchor image and the negative image is high. 

The constructive loss function penalizes the model if the distance between A and N is low or A and P is high, while it encourages the model or learns features when the distance between A and N is high and A and P is low.

To understand more about the anchor, positive and negative images let’s consider the previous example of that at an airport. In such a case, the anchor image will be your image when you look at the camera, the positive image will be the one on your passport photo and the negative image will be a random image of a passenger present at the airport. 

Whenever we train a Siaseme network we provide it with the APN trios (Anchor, positive and negative) images. Creating this dataset is much easier and would require fewer images to train. 

Learn ML Course from the World’s top Universities. Earn Masters, Executive PGP, or Advanced Certificate Programs to fast-track your career

Limitations of One-shot learning

One-shot learning is still a mature machine learning algorithm and does possess some limitations. For instance, the model will not work well if the input image has some modifications- a person wearing a hat, sunglasses et al. Further, a model that is trained for one application cannot be generalized for another application. 

Moving on let’s see a few variations of One-shot learning which entails Zero-shot learning and Few-shot learning.

Zero-shot learning

Zero-shot learning is the ability of the model to identify new or unseen labeled data while being trained on seen data and knowing the semantic features of new or unseen data. For instance, a child who has seen a cat can identify it by its distinct features. Moreover, if the child is aware that the dog’s bark and possesses more solid characteristics than a cat, then the child would have no problem in recognizing the dog.

To conclude, we can say that ZSL recognition functions in a manner that takes into account the labeled training set of seen classes coupled with the knowledge about how each unseen class is semantically related to the seen classes.

Few Shot Learning

In Few shot learning, models require a very short amount of data to make predictions, compared to the large amounts that other models require of learning. It is a meta-learning form involving training on multiple related tasks during the meta-training phase. It enables the model to effectively generalize when faced with new data and only a few examples.

Few-shot learning is used in computer vision, natural language processing, robotics, and audio processing.

How is Few Shot Learning Helpful?

There are several reasons why Few shot learning is helpful:

  • It can be used when you want to reduce the data collection as it does not need much data to train the model. It also helps reduce the cost of data collection and computation.
  • In case of insufficient data, you can use Few-shot learning to make accurate predictions. Other machine learning tools, whether supervised or unsupervised, find it difficult to do without sufficient data.
  • Judging by a few examples, humans can categorize various handwritten characters, which is difficult for machines to do since they need large amounts of data to train. Few-shot learning can achieve the same feat as humans, owing to the small data it can work with.
  • Through the use of few-shot learning, machines can learn about rare diseases. These machines can classify anomalies with minimal data by employing computer vision models.

N-shot learning

As the name suggests, in N shot learning we will have n labeled data of each class available for training. The model is trained on K classes each containing n labeled data. After extracting relevant features and patterns the model has to categorize a new unlabelled image into one of the K classes. They use Matching networks that work on the nearest neighbors based approach trained fully end to end. 

Main Difference Between One-Shot, Few-Shot and Zero-Shot Learning 

One shot learning requires one labeled example for each new class. Few-shot learning requires a small number of examples for each new class, and zero-shot learning requires no labeled example for a new class.

Few-shot learning is a variation of one-shot learning since it requires more than one training image.

Zero-shot learning aims to classify unknown classes without any training data. The way it learns here is by using the image’s metadata or important information. This method mimics how humans learn. For example, if you read a detailed description of an elephant in a book, you will easily recognize it in real life or a photo.

Popular AI and ML Blogs & Free Courses


Ads of upGrad blog

In conclusion, the field of One-shot learning and its counterparts have immense potential to solve some of the challenging problems. Though, being a relatively new area of research, it is making fast progress, and researchers are working trying to bridge the gap between machines and humans. 

With this, we have come to an end of this post, I hope you enjoyed reading it. 

If you’re interested to learn more about machine learning, check out IIIT-B & upGrad’s PG Diploma in Machine Learning & AI which is designed for working professionals and offers 450+ hours of rigorous training, 30+ case studies & assignments, IIIT-B Alumni status, 5+ practical hands-on capstone projects & job assistance with top firms.


Pavan Vadapalli

Blog Author
Director of Engineering @ upGrad. Motivated to leverage technology to solve problems. Seasoned leader for startups and fast moving orgs. Working on solving problems of scale and long term technology strategy.
Get Free Consultation

Selectcaret down icon
Select Area of interestcaret down icon
Select Work Experiencecaret down icon
By clicking 'Submit' you Agree to  
UpGrad's Terms & Conditions

Our Popular Machine Learning Course

Explore Free Courses

Suggested Blogs

45+ Best Machine Learning Project Ideas For Beginners [2024]
Summary: In this Article, you will learn Stock Prices Predictor Sports Predictor Develop A Sentiment Analyzer Enhance Healthcare Prepare ML Algorith
Read More

by Jaideep Khare

21 May 2024

Top 15 IoT Interview Questions & Answers 2024 – For Beginners & Experienced
These days, the minute you indulge in any technology-oriented discussion, interview questions on cloud computing come up in some form or the other. Th
Read More

by Kechit Goyal

19 May 2024

40 Best IoT Project Ideas & Topics For Beginners 2024 [Latest]
In this article, you will learn the 40Exciting IoT Project Ideas & Topics. Take a glimpse at the project ideas listed below. Best Simple IoT Proje
Read More

by Kechit Goyal

19 May 2024

Top 22 Artificial Intelligence Project Ideas & Topics for Beginners [2024]
In this article, you will learn the 22 AI project ideas & Topics. Take a glimpse below. Best AI Project Ideas & Topics Predict Housing Price
Read More

by Pavan Vadapalli

18 May 2024

Image Segmentation Techniques [Step By Step Implementation]
What do you see first when you look at your selfie? Your face, right? You can spot your face because your brain is capable of identifying your face an
Read More

by Pavan Vadapalli

16 May 2024

6 Types of Regression Models in Machine Learning You Should Know About
Introduction Linear regression and logistic regression are two types of regression analysis techniques that are used to solve the regression problem
Read More

by Pavan Vadapalli

16 May 2024

How to Make a Chatbot in Python Step By Step [With Source Code]
Creating a chatbot in Python is an essential skill for modern developers looking to enhance user interaction and automate responses within application
Read More

by Kechit Goyal

13 May 2024

Artificial Intelligence course fees
Artificial intelligence (AI) was one of the most used words in 2023, which emphasizes how important and widespread this technology has become. If you
Read More

by venkatesh Rajanala

29 Feb 2024

Artificial Intelligence in Banking 2024: Examples & Challenges
Introduction Millennials and their changing preferences have led to a wide-scale disruption of daily processes in many industries and a simultaneous g
Read More

by Pavan Vadapalli

27 Feb 2024

Schedule 1:1 free counsellingTalk to Career Expert
footer sticky close icon