What is Agglomerative Clustering? Definition, Steps & Examples
By Mukesh Kumar
Updated on May 07, 2025 | 19 min read | 1.3k views
Did you know? Agglomerative Clustering mimics natural hierarchical structures, such as the way species evolve and form nested groups. Instead of requiring a predefined number of clusters, it progressively merges data points based on their similarity, making it an excellent method for uncovering relationships in complex data without prior assumptions.
Agglomerative Clustering is a hierarchical machine learning technique that groups data points by successively merging the closest pairs based on a distance metric.
Unlike k-means, which needs you to specify the number of clusters, Agglomerative Clustering builds a tree-like structure (dendrogram) to show how the data naturally groups together, making it perfect for discovering complex patterns.
In this tutorial, we’ll dive into the core principles of Agglomerative Clustering, discuss the benefits of this hierarchical clustering method, and walk through its implementation to uncover hidden patterns in datasets.
Improve your machine learning skills with our online AI and ML courses — take the next step in your learning journey!
Agglomerative Clustering is ideal for scenarios where the number of clusters is not known in advance, such as exploratory data analysis. As a hierarchical clustering technique, it has been widely used in fields like biology, marketing, and image processing for its ability to uncover complex patterns and relationships in datasets.
For example, imagine you have a dataset of animals with features like size, weight, and habitat. Using Agglomerative Clustering, the algorithm would start by treating each animal as its own group. It would then progressively merge the most similar animals based on their features.
Eventually, the method would form a tree-like structure (dendrogram), showing how animals with similar characteristics, like size and weight, are grouped together. This will help you see which species are most closely related based on the data.
Machine learning professionals skilled in data mining techniques like agglomerative clustering are in high demand due to their ability to handle complex data. If you're looking to develop skills in AI and ML, here are some top-rated courses to help you get there:
Agglomerative Clustering builds a hierarchical cluster structure by progressively merging the closest data points or clusters based on a chosen distance metric, such as Euclidean distance. The algorithm uses a linkage criterion to determine how clusters should be merged, typically using methods like single linkage, complete linkage, or average linkage.
1. Linkage Criterion
The linkage criterion defines how the distance between clusters is calculated and impacts how clusters are formed. Here's a brief breakdown of the most common linkage methods:
2. Dendrogram
As the algorithm progresses, it creates a tree-like diagram called a dendrogram that visualizes the merging process. The dendrogram shows the hierarchy of clusters, allowing you to decide on the appropriate number of clusters based on where you cut the tree.
The process continues until all data points are grouped into a single cluster, but you can stop earlier by selecting a threshold or number of clusters that fits your needs.
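To see how the linkage criterion changes the result in practice, here is a minimal sketch (with invented toy data) that runs scikit-learn's AgglomerativeClustering once per common linkage method; the data points and values are purely illustrative:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Toy 2-D points: two tight groups plus one point between them (illustrative data)
X = np.array([[0, 0], [0, 1], [1, 0],      # group near the origin
              [5, 5], [5, 6], [6, 5],      # group far away
              [2.5, 2.5]])                 # in-between point

# The same data can be grouped differently depending on the linkage criterion
for method in ["single", "complete", "average", "ward"]:
    labels = AgglomerativeClustering(n_clusters=2, linkage=method).fit_predict(X)
    print(f"{method:>8}: {labels}")
```

Comparing the printed label arrays side by side is a quick way to check how sensitive your data is to the linkage choice before committing to one.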
Also Read: What is Clustering in Machine Learning and Different Types of Clustering Methods
Now that you know what Agglomerative Clustering is in ML, let’s understand how it works with an example.
What makes this method unique is that it doesn’t require you to specify the number of clusters in advance. Instead, it starts with each data point as its own cluster and progressively merges them based on their similarities.
As the algorithm works its way through the data, it builds a hierarchical tree (dendrogram) that illustrates how different clusters are related, helping you explore complex relationships within your dataset. This method is especially useful for discovering nested clusters, making it ideal for tasks like customer segmentation or image analysis.
Here's how it works, step by step, using a simple example to make it easier to understand:
Imagine you're organizing a group of friends into small groups based on how similar their interests are. You start with each person being their own group. So, if you have 6 friends (A, B, C, D, E, F), you start with 6 clusters: {A}, {B}, {C}, {D}, {E}, and {F}.
Next, you figure out how similar (or dissimilar) each person is to the others. This is done by calculating the distance between each pair of points (or people, in this case).
For simplicity, let’s say the similarity between A and B is high (they both like the same movies), but A and D have very different tastes, so their similarity is low. This distance calculation can be done using metrics like Euclidean distance or Manhattan distance, depending on the data you're working with.
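As a concrete (hypothetical) illustration of these two metrics, suppose friends A and D are each described by a 2-D "taste" vector; the values below are invented purely to show the arithmetic:

```python
import numpy as np

# Hypothetical 2-D "taste" vectors for friends A and D (illustrative values)
a = np.array([1.0, 2.0])
d = np.array([4.0, 6.0])

# Euclidean distance: straight-line distance between the two points
euclidean = np.sqrt(np.sum((a - d) ** 2))   # sqrt(3^2 + 4^2) = 5.0

# Manhattan distance: sum of absolute coordinate differences
manhattan = np.sum(np.abs(a - d))           # |3| + |4| = 7.0

print(euclidean, manhattan)  # 5.0 7.0
```

Euclidean distance tends to emphasize large differences in any single feature, while Manhattan distance treats each feature's difference equally, which is one reason the choice of metric can change which pairs look "closest".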
Once you have the distances, you merge the two most similar groups. In our case, let’s say A and B have the smallest distance (they like the same movies), so you combine them into one group: {A, B}. Now, your clusters look like this: {A, B}, {C}, {D}, {E}, {F}.
Also Read: What is Cluster Analysis in Data Mining? Methods, Benefits, and More
Now, you repeat this process. You’ll recalculate the distances between all the clusters you have so far and merge the next two closest ones. This could mean merging two more individuals, or merging an individual into the existing {A, B} group, whichever pair of clusters is now closest.
This step continues, and with each iteration, the number of clusters decreases as similar clusters are combined into larger groups.
As you keep merging clusters, you can visualize this process with a dendrogram, which looks like a tree showing how each group merges with another.
The height of each branch shows how similar or different the clusters are when they merged. The lower the branch, the more similar the groups.
Eventually, you’ll have one large cluster with everyone in it, but you don’t need to go that far. You can stop the merging process when you reach the number of clusters you want.
For example, if you want to group your friends into 2 main clusters, you stop merging when you have two large groups, say {A, B, C} and {D, E, F}.
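The whole friends walkthrough above can be sketched in a few lines with SciPy. The "interest" vectors below are invented for illustration; the point is that `linkage` builds the full merge hierarchy and `fcluster` cuts it into the desired number of groups:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical "interest" vectors for friends A-F (values invented for illustration)
friends = ["A", "B", "C", "D", "E", "F"]
features = np.array([[9, 1], [8, 2], [7, 1],   # A, B, C: similar tastes
                     [1, 8], [2, 9], [1, 7]])  # D, E, F: similar tastes

# Build the full merge hierarchy, then cut it into 2 clusters
Z = linkage(features, method="average")
labels = fcluster(Z, t=2, criterion="maxclust")

for name, label in zip(friends, labels):
    print(name, "-> cluster", label)
# A, B, C end up in one group; D, E, F form the other
```

With this data, cutting at two clusters recovers exactly the {A, B, C} and {D, E, F} split described above.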
Let’s consider another example, where you are working for a marketing company. You have data about customers, including their spending habits. You want to group them based on their similarity to personalize offers.
Step 1: Each customer is its own cluster.
Step 2: You calculate the distances between each customer based on how much they spend on different products.
Step 3: Customers who have similar spending patterns are grouped together (e.g., Customer A and B who both spend a lot on electronics).
Step 4: This process continues, combining customers with similar spending habits until you end up with a few groups like "Electronics Lovers" and "Clothing Shoppers."
Step 5: You visualize the clusters with a dendrogram, showing how the groups formed and at what point they merged.
Step 6: Finally, you decide you want 3 customer groups, so you stop the merging process when you get the desired number.
By following this process, you can uncover natural groupings in your data, making it easier to tailor your marketing efforts to different customer segments.
If you want to understand how to work with clustering methods in ML, upGrad’s Executive Diploma in Machine Learning and AI can help you. With a strong hands-on approach, this program ensures that you apply theoretical knowledge to real-world challenges, preparing you for high-demand roles like AI Engineer and Machine Learning Specialist.
Also Read: Clustering vs Classification: What is Clustering & Classification
Now that you know what Agglomerative Clustering is and how exactly it works, let’s look at how you can implement it in real-life situations with a Python example.
Let’s implement Agglomerative Clustering in Python using a more complex, real-life example: customer segmentation based on purchasing behavior.
Imagine you work for a retail company and have data on customers' spending habits for two products: Product A and Product B. You want to group customers into segments to better understand their preferences, so you can tailor marketing strategies.
Here are the different steps:
1. Import Libraries
First, you need to import the necessary libraries for data handling, clustering, and visualization:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import AgglomerativeClustering
from scipy.cluster.hierarchy import dendrogram, linkage
You can get a better understanding of Python libraries with upGrad’s Learn Python Libraries: NumPy, Matplotlib & Pandas. Learn how to manipulate data using NumPy, visualize insights with Matplotlib, and analyze datasets with Pandas.
Also Read: Python Modules: Explore 20+ Essential Modules and Best Practices
2. Prepare the Data
Here, the data represents the amount spent by customers on Product A and Product B. The goal is to group customers based on their spending patterns.
# Example data: spending on Product A and Product B for each customer
data = np.array([[500, 200],   # Customer 1 ($500 on Product A, $200 on Product B)
                 [550, 180],   # Customer 2
                 [600, 250],   # Customer 3
                 [150, 80],    # Customer 4
                 [200, 100],   # Customer 5
                 [650, 300]])  # Customer 6
3. Run Agglomerative Clustering
We’ll perform Agglomerative Clustering to group customers based on their spending behaviors. In this case, let’s say we want to divide customers into 2 segments.
# Running Agglomerative Clustering
# (the default linkage in scikit-learn is 'ward', which minimizes within-cluster variance)
agglomerative_model = AgglomerativeClustering(n_clusters=2)  # we want to split into 2 clusters
clusters = agglomerative_model.fit_predict(data)
# Display the clusters
print("Cluster assignments:", clusters)
This assigns each customer to one of the two clusters based on their spending.
4. Visualize the Clusters
After performing the clustering, we can visualize the clusters to understand how customers are grouped:
# Visualizing the clusters
plt.scatter(data[:, 0], data[:, 1], c=clusters, cmap='viridis')
plt.title("Agglomerative Clustering of Customers")
plt.xlabel("Spending on Product A")
plt.ylabel("Spending on Product B")
plt.show()
Output:
This scatter plot will show how customers are grouped based on their spending on the two products.
Also Read: Matplotlib in Python: Explained Various Plots with Examples
5. Visualize the Dendrogram (Optional)
To better understand how the clusters were formed, we can visualize the dendrogram, which shows the hierarchical relationship between the customers.
# Linkage matrix for dendrogram
Z = linkage(data, method='ward')
# Plotting the dendrogram
plt.figure(figsize=(10, 6))
dendrogram(Z)
plt.title("Dendrogram for Customer Segmentation")
plt.xlabel("Customers")
plt.ylabel("Distance")
plt.show()
The dendrogram will show how customers were progressively merged into clusters based on their spending patterns.
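Instead of reading the cut point off the dendrogram by eye, you can also cut the same Ward linkage matrix programmatically with SciPy's `fcluster`, either by asking for an exact number of clusters or by setting a distance threshold (the threshold value below is an arbitrary illustration):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Same customer spending data as above
data = np.array([[500, 200], [550, 180], [600, 250],
                 [150, 80], [200, 100], [650, 300]])

Z = linkage(data, method="ward")

# Cut the tree to get exactly 2 flat clusters
labels = fcluster(Z, t=2, criterion="maxclust")
print("Cluster labels:", labels)

# Alternatively, cut at a fixed merge-distance threshold (illustrative value)
labels_by_distance = fcluster(Z, t=300, criterion="distance")
```

The `maxclust` criterion is convenient when you know how many segments you want; the `distance` criterion matches the mental model of drawing a horizontal line across the dendrogram.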
By running Agglomerative Clustering on this data, you can identify two distinct groups of customers: a high-spending group (Customers 1, 2, 3, and 6) and a lower-spending group (Customers 4 and 5).
This clustering helps businesses target different groups with tailored marketing strategies, such as offering discounts on higher-end products to the high-spending group or promoting budget-friendly products to the low-spending group.
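To decide which cluster label corresponds to the "high-spending" group, a simple follow-up step is to compute the average spending per cluster (this sketch reuses the same data; note that the cluster numbering itself is arbitrary):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Same customer spending data as above
data = np.array([[500, 200], [550, 180], [600, 250],
                 [150, 80], [200, 100], [650, 300]])
clusters = AgglomerativeClustering(n_clusters=2).fit_predict(data)

# Average spending per cluster helps name the segments
for c in np.unique(clusters):
    mean_a, mean_b = data[clusters == c].mean(axis=0)
    print(f"Cluster {c}: avg Product A = ${mean_a:.0f}, avg Product B = ${mean_b:.0f}")
```

Summaries like this turn raw label arrays into interpretable segments that a marketing team can act on.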
By following these steps, you can implement Agglomerative Clustering to segment customers based on their behavior, which is key to creating targeted marketing strategies in businesses.
Are you a full-stack developer wanting to integrate AI into your Python programming workflow? upGrad’s AI-Driven Full-Stack Development bootcamp can help you. You’ll learn how to build AI-powered software using OpenAI, GitHub Copilot, Bolt AI & more.
Also Read: Hierarchical Clustering in Python [Concepts and Analysis]
Now that you have a better understanding of how to implement Agglomerative Clustering in Python, let’s look at some of its advantages and drawbacks.
Agglomerative Clustering is especially useful for tasks like customer segmentation, document clustering, and hierarchical clustering of genes or images, where understanding the hierarchy of clusters is important.
However, the method also comes with limitations. For larger datasets, the computational complexity increases significantly, as the algorithm must compute distances between all data points and iteratively merge them.
Here’s a breakdown of its key benefits and limitations:
| Benefits | Limitations |
| --- | --- |
| Unlike k-means, you don’t need to define the number of clusters beforehand, which makes it highly adaptable to exploratory data analysis. | Agglomerative clustering has a time complexity of O(n^3) and a space complexity of O(n^2), which can make it slow and memory-intensive for large datasets. |
| The dendrogram generated during clustering shows how clusters are merged, which is helpful for understanding the relationships between groups. | Agglomerative clustering can be highly sensitive to outliers, which may distort the clusters and lead to poor results. |
| It offers multiple linkage methods (e.g., single, complete, average, Ward), allowing you to fine-tune the clustering process based on the data. | For large datasets, agglomerative clustering becomes less efficient as the number of data points increases, leading to longer computation times. |
| The dendrogram provides a clear visualization of how clusters are formed, helping with decision-making about where to "cut" the tree. | The need to compute and store a distance matrix for all pairs of data points increases memory usage, which can be problematic for high-dimensional data. |
To make the most out of Agglomerative Clustering, follow a few best practices: scale your features before computing distances, choose a linkage method suited to the shape of your data, and use the dendrogram to validate the number of clusters. Doing so helps you obtain meaningful insights from your data.
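One best practice worth showing concretely is feature scaling: when features live on very different scales, the larger-valued feature dominates the Euclidean distance and effectively decides the clusters on its own. The sketch below uses invented income/age data to contrast raw and standardized inputs:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import AgglomerativeClustering

# Invented features on very different scales: income (tens of thousands) vs. age (tens)
X = np.array([[55000, 25], [60000, 30], [58000, 62],
              [20000, 28], [22000, 60], [21000, 24]], dtype=float)

# Without scaling, income dominates the distance calculation entirely
raw_labels = AgglomerativeClustering(n_clusters=2).fit_predict(X)

# Standardize each feature to zero mean and unit variance first
X_scaled = StandardScaler().fit_transform(X)
scaled_labels = AgglomerativeClustering(n_clusters=2).fit_predict(X_scaled)

print("raw:   ", raw_labels)
print("scaled:", scaled_labels)
```

On the raw data, the split follows income alone; after standardization, age differences carry comparable weight, which can change the grouping, so always check whether that behavior matches your intent.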
If you want to understand how to work with AI and ML, upGrad’s Executive Diploma in Machine Learning and AI can help you. With a strong hands-on approach, this AI ML program ensures that you apply theoretical knowledge to real-world challenges, preparing you for high-demand roles like AI Engineer and Machine Learning Specialist.
Also Read: Top 10 Dimensionality Reduction Techniques for Machine Learning(ML) in 2025
Next, let’s look at some of the real-life applications of Agglomerative Clustering across different industries.
Agglomerative Clustering is widely applied in various industries due to its hierarchical approach to grouping data. It helps organizations uncover patterns in data without needing to specify the number of clusters in advance, making it ideal for complex, high-dimensional datasets.
Below are five real-life examples where Agglomerative Clustering has been successfully implemented:
1. Customer Segmentation in Retail and Marketing
Retail businesses and marketing teams use Agglomerative Clustering to group customers based on their purchasing behavior, demographics, or browsing patterns. Unlike other clustering methods, Agglomerative Clustering doesn’t require predefining the number of customer segments, which helps businesses adapt to changing customer preferences.
Outcome: By identifying meaningful customer segments, businesses can create more personalized marketing campaigns, increase customer satisfaction, and optimize resource allocation.
Also Read: Image Segmentation Techniques [Step By Step Implementation]
2. Document Clustering for Content Management
In industries such as publishing, legal services, or research, documents are often clustered based on their content, such as topic, keyword frequency, or writing style. Agglomerative Clustering helps group similar documents into clusters without the need to specify the number of groups beforehand.
Outcome: This enables easier document retrieval, knowledge management, and the creation of topic-based document collections, improving efficiency and accessibility.
3. Gene Expression Analysis in Bioinformatics
In genomics, clustering genes based on their expression profiles is a challenging task, especially when the number of gene groups is unknown. Agglomerative Clustering is used to group genes with similar patterns, aiding in the identification of genes associated with diseases or biological functions.
Outcome: This allows researchers to uncover new insights into genetic relationships, disease markers, and therapeutic targets, advancing the field of personalized medicine.
4. Image Segmentation in Computer Vision
Agglomerative Clustering is used for image segmentation, where the goal is to segment an image into distinct regions based on pixel similarity (e.g., color, intensity, texture). This method automatically determines the number of regions or segments required, making it ideal for tasks where the number of segments is not known in advance.
Outcome: It provides more accurate image segmentation for tasks such as object detection, medical image analysis, or satellite image interpretation, enhancing the precision of computer vision models.
5. Anomaly Detection in Cybersecurity
In cybersecurity, Agglomerative Clustering is used to detect abnormal patterns in network traffic or user behavior. By grouping normal activities into clusters, the algorithm can identify outliers that deviate from expected behavior, such as potential security threats or fraud.
Outcome: It enables real-time detection of anomalies, improving the security and integrity of IT systems by swiftly identifying unusual behavior that could indicate a cyberattack.
If you need a better understanding of cybersecurity, upGrad’s free Fundamentals of Cybersecurity course can help you. You will learn key concepts, current challenges, and important terminology to protect systems and data.
Also Read: Anomaly Detection With Machine Learning: What You Need To Know?
To solidify your understanding of Agglomerative Clustering in machine learning, test your knowledge with a quiz. It’ll help reinforce the concepts discussed throughout the tutorial and ensure you’re ready to apply them in your projects.
Assess your understanding of Agglomerative Clustering, its components, advantages, limitations, and best practices by answering the following multiple-choice questions.
Test your knowledge now!
1. What is the key feature of Agglomerative Clustering?
a) It requires the number of clusters to be specified in advance
b) It automatically determines the number of clusters based on the data
c) It builds a hierarchical tree structure (dendrogram)
d) It can only be used for numerical data
2. Which linkage method is commonly used in Agglomerative Clustering to minimize variance within clusters?
a) Single linkage
b) Complete linkage
c) Ward's linkage
d) Average linkage
3. What is the primary limitation of Agglomerative Clustering?
a) It cannot handle categorical data
b) It requires the number of clusters to be pre-defined
c) It is computationally expensive for large datasets
d) It does not produce meaningful hierarchical structures
4. How does Agglomerative Clustering form clusters?
a) By grouping data points based on predefined clusters
b) By merging the most similar clusters iteratively
c) By selecting random data points as cluster centers
d) By partitioning the data into equal-sized clusters
5. What is the role of the dendrogram in Agglomerative Clustering?
a) It helps to visualize how clusters are split into smaller sub-clusters
b) It shows the number of clusters needed in the dataset
c) It visualizes the hierarchical relationship between clusters
d) It directly determines the optimal number of clusters
6. Which of the following is a benefit of Agglomerative Clustering?
a) It is suitable for clustering large datasets with a high number of features
b) It can be used to automatically select the number of clusters
c) It does not require prior knowledge of the number of clusters
d) It works well with non-hierarchical datasets
7. What is the time complexity of Agglomerative Clustering in terms of number of data points (n)?
a) O(n log n)
b) O(n^2)
c) O(n^3)
d) O(n)
8. Which of the following is a suitable application for Agglomerative Clustering?
a) Image segmentation with fixed number of segments
b) Customer segmentation where the number of groups is unknown
c) Data that requires dynamic cluster size adjustments
d) When the data has well-defined clusters with linear separability
9. What does the “cutting” of the dendrogram signify in Agglomerative Clustering?
a) The final merging of all clusters into one
b) The determination of the number of clusters in the data
c) The separation of a cluster into multiple sub-clusters
d) The visualization of the hierarchical relationships of the data
10. What is a recommended practice when using Agglomerative Clustering on large datasets?
a) Always use single linkage for better results
b) Scale your data to ensure equal importance across all features
c) Predefine the number of clusters for faster convergence
d) Use Agglomerative Clustering only for small datasets with known structures
This quiz will help you evaluate your understanding of Agglomerative Clustering, its strengths, limitations, and applications. By answering these questions, you can ensure you’re ready to implement the method effectively in real-world scenarios.
Also Read: 5 Breakthrough Applications of Machine Learning
You can also continue expanding your skills in machine learning with upGrad, which will help you deepen your understanding of advanced ML concepts and real-world applications.
upGrad’s courses provide expert training in machine learning, with a focus on different clustering methods, their practical applications, and best practices. Learn how to optimize your machine learning models for different scenarios.
While the course covered in the tutorial can significantly improve your knowledge, here are some free courses to facilitate your continued learning:
You can also get personalized career counseling with upGrad to guide your career path, or visit your nearest upGrad center and start hands-on training today!