Adversarial Machine Learning: Concepts, Types of Attacks, Strategies & Defenses

Rapid progress over the past few decades has driven today’s technological advances. We are now part of the ongoing ‘Industry 4.0’, with technologies like AI and ML at its centre. This industrial revolution involves a global shift towards research and innovation in neural networks, Machine Learning, Artificial Intelligence, IoT, digitisation, and much more.

These technologies provide an array of benefits in sectors like e-commerce, manufacturing, sustainability, and supply chain management. The global market for AI/ML is expected to surpass USD 266.92 billion by 2027, and the field continues to be a preferred career choice for graduates everywhere.

While the adoption of these technologies is paving the way for the future, we remain unprepared for events like Adversarial Machine Learning (AML) attacks. Machine Learning systems, regardless of the programming language used to build them, rely on models trained on data, and these models are integrated throughout the system.

AML attacks carried out by skilled attackers threaten the integrity and accuracy of these ML systems. Slight modifications to the input data can cause the ML algorithm to misclassify it, reducing the reliability of these systems.

To equip yourself with the right resources for designing systems that can withstand such AML attacks, enrol in a PG Diploma in Machine Learning offered by upGrad and IIIT Bangalore.

Concepts Centred on Adversarial Machine Learning

Before we delve into the topic of AML, let us establish the definitions of some of the basic concepts of this domain:

  • Artificial Intelligence refers to the ability of a computing system to perform tasks such as reasoning, planning, problem-solving, and simulation that normally require human intelligence. Many AI systems acquire this ability through Machine Learning techniques.
  • Machine Learning uses algorithms and statistical models that enable computer systems to perform tasks based on patterns and inferences drawn from data. These systems are designed to execute tasks without explicit task-specific instructions, relying instead on what they have learned from training data.
  • Neural Networks are models loosely inspired by the way biological neurons function in the brain. They consist of layers of interconnected units whose weights are adjusted during training, so that the network learns to distinguish and process input data.
  • Deep Learning uses neural networks with many layers to process raw, unstructured input data. Through representation (feature) learning, these networks discover useful features automatically instead of relying on hand-engineered ones, and they can be trained in a supervised or unsupervised manner.
  • Adversarial Machine Learning is the study of attacks that supply deceptive inputs to make a Machine Learning model malfunction, and of the defences against them. Such attacks exploit vulnerabilities in the model or in the data it is trained and tested on. An AML attack can compromise the model’s outputs and pose a direct threat to the usefulness of the ML system.

To learn the key concepts of ML, such as Adversarial Machine Learning, in-depth, enrol for the Masters of Science (M.Sc) in Machine Learning & AI from upGrad.

Types of AML Attacks 

Adversarial Machine Learning attacks are commonly categorised along three dimensions.

They are:

1. Influence on Classifier

Machine Learning systems classify input data using a classifier. An attack can influence the classifier in two ways: by tampering with the training phase so that the classifier itself is altered, or by probing the trained classifier to discover weaknesses without changing it. Because the classifier is integral to identifying data, either kind of influence can cost the ML system its credibility and reveal vulnerabilities that can be exploited further.

2. Security Violation

During the learning stages of an ML system, the programmer defines the data that is to be considered legitimate. A security violation occurs when an AML attack causes malicious data to be accepted as legitimate, or causes legitimate input data to be improperly rejected as malicious.

3. Specificity

Targeted attacks aim at a specific intrusion or disruption, such as getting one particular input misclassified. Indiscriminate attacks instead add randomness to the input data and cause disruption through decreased performance or a general failure to classify.

AML attacks and their categories are conceptually branched out of the Machine Learning domain. Due to the rising demand for ML systems, nearly 2.3 million job vacancies are available for ML and AI engineers, according to Gartner.[2]  You can read more about how Machine Learning Engineering can be a rewarding career in 2021.

Adversarial Machine Learning Strategies

The adversary’s goal, their prior knowledge of the system to be attacked, and the degree to which they can manipulate data components together define the Adversarial Machine Learning strategy.

They are: 

1. Evasion

ML algorithms identify and sort the input data set based on certain predefined conditions and calculated parameters. An evasion attack sidesteps the parameters that the algorithm uses to detect an attack. The attacker modifies malicious samples so that they avoid detection and are misclassified as legitimate input.

Evasion attacks do not modify the algorithm; instead, they disguise the input so that it escapes the detection mechanism. For example, anti-spam filters that analyse the text of an email can be evaded by embedding the spam text or malicious links in images.

2. Model extraction

Also known as ‘model stealing’, this type of AML attack probes an ML system, typically through its prediction interface, to reconstruct the model or to recover information about the training data used to build it. A reconstructed copy undermines the value of the original system. If the system holds confidential data, or if the model itself is proprietary or sensitive, the attacker could use the copy for their own benefit or to disrupt the original.
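As a minimal sketch of the idea, assume a hypothetical black-box service that exposes only a `predict` function returning the raw score of a hidden linear model. The names `SECRET_WEIGHTS`, `SECRET_BIAS`, `predict`, and `extract_linear_model` are illustrative and not taken from any real system. Under that assumption, a handful of well-chosen queries is enough to recover the model:

```python
# Hypothetical black-box service: the attacker may call predict(), but is
# assumed to have no direct access to SECRET_WEIGHTS or SECRET_BIAS.
SECRET_WEIGHTS = [0.5, -1.25, 2.0]
SECRET_BIAS = 0.75


def predict(features):
    """Return the raw score of the hidden linear model."""
    return sum(w * x for w, x in zip(SECRET_WEIGHTS, features)) + SECRET_BIAS


def extract_linear_model(query, n_features):
    """Recover the weights and bias of a linear model with n_features + 1 queries."""
    # Querying the all-zero input returns the bias on its own.
    bias = query([0.0] * n_features)
    weights = []
    for i in range(n_features):
        # A unit vector isolates one weight: score = weight_i + bias.
        probe = [0.0] * n_features
        probe[i] = 1.0
        weights.append(query(probe) - bias)
    return weights, bias


stolen_weights, stolen_bias = extract_linear_model(predict, 3)
print(stolen_weights, stolen_bias)
```

Real services rarely return raw scores from a purely linear model, so practical extraction attacks usually need many more queries and train a surrogate model on the responses. The sketch only illustrates why unrestricted query access can leak the model itself.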

3. Poisoning

This type of Adversarial Machine Learning attack contaminates the training data. Since ML systems are often retrained using data collected during their operation, injecting malicious samples into that data can facilitate an AML attack. To poison data, an attacker does not need the source code of the ML system; they only need a way to influence the data it is trained on, so that retraining teaches the model to accept incorrect data and degrades the functioning of the system.
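The effect of poisoning can be illustrated with a toy example. The sketch below assumes a one-dimensional nearest-centroid classifier and hand-picked data; the function names and values are hypothetical. A few mislabelled samples injected before retraining shift a class centroid far enough to flip the prediction for a borderline input:

```python
def train_centroids(samples):
    """Compute the mean feature value per label from (value, label) pairs."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


# Clean training data: low values are benign, high values are malicious.
clean = [(1.0, "benign"), (2.0, "benign"), (3.0, "benign"),
         (7.0, "malicious"), (8.0, "malicious"), (9.0, "malicious")]

# Poison: the attacker injects high-valued samples labelled as benign.
poison = [(12.0, "benign")] * 3

clean_model = train_centroids(clean)
poisoned_model = train_centroids(clean + poison)

probe = 6.0
print(classify(clean_model, probe))     # closest centroid is malicious (8.0)
print(classify(poisoned_model, probe))  # benign centroid has moved to 7.0
```

Note that the attacker never touches the training code here. Control over a small share of the training samples is enough in this toy setting.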

Proper knowledge of these Adversarial Machine Learning attack strategies can enable a programmer to avoid such attacks during operation. If you need hands-on training for designing ML systems that can withstand AML attacks, enrol for the Master’s in Machine Learning and AI offered by upGrad.

Specific Attack Types

Several specific attack types can target Deep Learning systems as well as conventional ML models such as linear regression and support-vector machines, threatening the integrity of these systems. They are:

  • Adversarial examples, generated by methods such as FGSM, PGD, C&W, and patch attacks, cause the model to misclassify inputs that appear normal to a human observer. A carefully crafted ‘noise’ perturbation is added to the input to make the classifier malfunction.
  • Backdoor/Trojan attacks implant a hidden trigger in the model, usually through tampered training data. The model behaves normally on ordinary inputs but produces the attacker’s chosen output whenever the trigger is present. These Adversarial Machine Learning attacks are difficult to protect against, as the backdoor stays dormant during normal testing.
  • Model Inversion uses a model’s outputs to reconstruct sensitive information about the data it was trained on, such as representative features of a class. Rather than changing the model, the attack turns its learned behaviour into a privacy leak.
  • Membership Inference Attacks (MIAs) can be applied to supervised learning (SL) models and Generative Adversarial Networks (GANs). These attacks rely on differences in how a model behaves on its training data versus external samples, which poses a privacy threat. With only black-box access to the model, an inference model can predict whether a given record was present in the training data or not.
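To make the first item concrete, here is a minimal sketch of the Fast Gradient Sign Method (FGSM), one of the gradient-based ways of crafting adversarial examples, applied to a tiny logistic regression model. The weights, the input, and the value of `epsilon` are hand-picked assumptions for illustration, not values from any real system:

```python
import math

# Hypothetical, hand-picked logistic regression model (not learned here).
WEIGHTS = [2.0, -1.0]
BIAS = 0.0


def probability(x):
    """P(class = 1 | x) under the logistic regression model."""
    score = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-score))


def predict(x):
    """Return the predicted class, 0 or 1."""
    return 1 if probability(x) >= 0.5 else 0


def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def fgsm(x, true_label, epsilon):
    """One FGSM step: x + epsilon * sign(gradient of the loss w.r.t. x)."""
    # For cross-entropy loss on logistic regression, the gradient with
    # respect to the input is (p - y) * w.
    error = probability(x) - true_label
    gradient = [error * w for w in WEIGHTS]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]


x = [1.0, 0.5]                        # correctly classified as class 1
x_adv = fgsm(x, true_label=1, epsilon=0.6)
print(predict(x), predict(x_adv))     # prediction flips from 1 to 0
```

Each feature moves by at most `epsilon`, yet the prediction changes. In a two-feature toy model the step has to be fairly large; in high-dimensional inputs such as images, far smaller per-feature changes are typically sufficient, which is why the perturbation can go unnoticed.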

To protect ML systems from these types of attacks, major MNCs employ ML programmers and engineers. Indian MNCs that run R&D centres to encourage innovation in Machine Learning offer salaries ranging from 15 to 20 Lakh INR per annum.[3] To learn more about this domain and secure a hefty salary as an ML engineer, enrol in an Advanced Certification in Machine Learning and Cloud hosted by upGrad and IIT Madras.

Defences Against AMLs

To defend against such Adversarial Machine Learning attacks, experts suggest that programmers rely on a multi-step approach. These steps would serve as countermeasures to the conventional AML attacks described above. These steps are:

  • Simulation: Simulating attacks according to the possible attack strategies of the attacker can reveal loopholes. Identifying them through these simulations can prevent AML attacks from having an impact on the system.
  • Modelling: Estimating the capabilities and potential goals of attackers can provide an opportunity to prevent AML attacks. This is done by creating different models of the same ML system that can withstand these attacks. 
  • Impact evaluation: This type of defence evaluates the total impact an attacker can have over the system, thus ensuring preparation in the event of such an attack.
  • Information laundering: By modifying the information extracted by the attacker, this type of defence can render the attack pointless. When the extracted model contains purposely placed discrepancies, the attacker cannot recreate the stolen model.
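The impact-evaluation step can be sketched for the simplest case, a linear classifier facing an attacker who may change every feature by at most `epsilon`. Under that assumption, the worst the attacker can do is reduce a sample’s margin by `epsilon` times the sum of the absolute weights, so the share of samples that stay correctly classified can be computed exactly. The model, the samples, and the function name `robust_accuracy` are hypothetical:

```python
def robust_accuracy(weights, bias, samples, epsilon):
    """Fraction of samples still classified correctly under every
    perturbation that changes each feature by at most epsilon."""
    # Worst-case change of a linear score under such a perturbation
    # is epsilon times the L1 norm of the weights.
    l1_norm = sum(abs(w) for w in weights)
    robust = 0
    for features, label in samples:  # label is +1 or -1
        score = sum(w * x for w, x in zip(weights, features)) + bias
        if label * score > epsilon * l1_norm:
            robust += 1
    return robust / len(samples)


# Hypothetical model and labelled evaluation samples.
WEIGHTS = [1.0, -1.0]
BIAS = 0.0
SAMPLES = [([3.0, 0.0], 1), ([1.0, 0.0], 1),
           ([0.0, 2.0], -1), ([0.0, 0.5], -1)]

report = {eps: robust_accuracy(WEIGHTS, BIAS, SAMPLES, eps)
          for eps in (0.0, 0.4, 0.75, 2.0)}
for eps, accuracy in report.items():
    print(f"epsilon={eps}: robust accuracy={accuracy}")
```

A table like this shows how quickly accuracy degrades as the assumed attacker grows stronger. For non-linear models there is generally no such closed form, and the evaluation falls back on simulating concrete attacks.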

Examples of AMLs

Various domains within our modern technologies are directly under the threat of Adversarial Machine Learning attacks. Since these technologies rely on pre-programmed ML systems, they could be exploited by people with malicious intentions. Some of the typical examples of AML attacks include:

1. Spam filtering: By purposely misspelling ‘bad’ words that identify spam, or by adding ‘good’ words that prevent identification.

2. Computer security: By hiding malware code within cookie data or using misleading digital signatures to bypass security checks.

3. Biometrics: By faking biometric traits that are converted to digital information for identification purposes.
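The spam-filtering case can be sketched with a deliberately naive keyword filter. The word lists, the threshold, and the messages are made up for illustration, and real filters are far more sophisticated. Both tricks, misspelling ‘bad’ words and padding with ‘good’ words, push the score below the threshold:

```python
# Hypothetical keyword lists and threshold for a naive spam filter.
SPAM_WORDS = {"win", "free", "prize"}
GOOD_WORDS = {"meeting", "agenda"}
THRESHOLD = 2


def spam_score(message):
    """Count 'bad' words and subtract 'good' words."""
    words = message.lower().split()
    bad = sum(word in SPAM_WORDS for word in words)
    good = sum(word in GOOD_WORDS for word in words)
    return bad - good


def is_spam(message):
    """Flag the message when its score reaches the threshold."""
    return spam_score(message) >= THRESHOLD


original = "win a free prize now"
misspelt = "w1n a fr3e pr1ze now"                  # 'bad' words misspelt
padded = "win a free prize now meeting agenda"     # 'good' words added

for message in (original, misspelt, padded):
    print(is_spam(message), "-", message)
```

Only the original message is flagged. The two modified messages carry the same content for a human reader but slip past the filter.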


As the fields of Machine Learning and Artificial Intelligence continue to expand, their applications increase across sectors like automation, neural networks, and data security. Adversarial Machine Learning will always be significant for the ethical purpose of protecting ML systems and preserving their integrity. 

If you are interested in learning more about machine learning, check out our Executive PG Programme in Machine Learning and AI, which is designed for working professionals and provides 30+ case studies & assignments, 25+ industry mentorship sessions, 5+ practical hands-on capstone projects, and more than 450 hours of rigorous training & job placement assistance with top firms.
