Apriori Algorithm: How Does It Work, and How Can Brands Use It?

Imagine you’re at the supermarket with a mental list of the items you want to buy, yet you end up buying a lot more than you planned. This is called impulse buying, and brands use the Apriori algorithm to leverage this phenomenon.

What is this algorithm, and how does it work? You’ll find the answers in this article: we’ll first look at what the algorithm is and then at how it works.

Let’s begin. 

What is the Apriori Algorithm?

The Apriori algorithm finds frequent itemsets for you. It is based on the Apriori property, which we can explain in the following way:

Suppose an itemset has a support value below the required minimum support. Then every superset of that itemset must also have a support value below the minimum, because a superset can never appear in more transactions than its subsets do. So you can exclude all of those supersets from the calculation without ever counting them, and as a result save a lot of time and space.

The support value of an itemset is the number of transactions in which that itemset appears. The Apriori algorithm is popular largely due to its application in recommendation systems. You’ll generally apply it to transactional databases, that is, databases in which every record is a transaction. It has many other real-world applications as well. You should also familiarize yourself with association rule mining to understand the Apriori algorithm properly.
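To make support concrete, here is a minimal Python sketch (the `transactions` list and the `support_count` helper are our own illustration, not from any library); it uses the same five toy transactions as the worked example below.

```python
# Toy transactional database: each transaction is a set of item IDs.
# (These are the same five transactions used in the worked example below.)
transactions = [
    {1, 3, 4},
    {2, 3, 5},
    {1, 2, 3, 5},
    {2, 5},
    {1, 3, 5},
]

def support_count(itemset, transactions):
    """Count the transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

print(support_count({1, 3}, transactions))  # 3: appears in T1, T3, and T5
```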


How does the Apriori Algorithm Work?

The Apriori algorithm generates association rules from frequent itemsets. Its principle is simple: every subset of a frequent itemset is itself a frequent itemset. An itemset whose support value meets a threshold value (the minimum support) is a frequent itemset. Consider the following transaction data:

TID   Items
T1    1, 3, 4
T2    2, 3, 5
T3    1, 2, 3, 5
T4    2, 5
T5    1, 3, 5

In the first iteration, let the minimum support count be 2 and build the itemsets of size 1, then calculate their support values. We discard any itemset whose support value is lower than the minimum; in this example, that is itemset {4}.
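Here is a minimal sketch of this first pass, assuming the toy `transactions` list from earlier (all names are illustrative); the tables C1 and F1 below show the same result.

```python
from collections import Counter

MIN_SUPPORT = 2  # minimum support count assumed in this example

# C1: support counts of every 1-itemset that occurs in the data
c1 = Counter(item for t in transactions for item in t)

# F1: keep only the items whose support meets the minimum
f1 = {frozenset([item]) for item, count in c1.items() if count >= MIN_SUPPORT}
print(f1)  # {4} has support 1, so it is discarded
```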

C1 (result of the first iteration):

Itemset   Support
{1}       3
{2}       3
{3}       4
{4}       1
{5}       4

F1 (after we discard {4}):

Itemset   Support
{1}       3
{2}       3
{3}       4
{5}       4

In the second iteration, we’ll build the itemsets of size 2 and calculate their support values, using all pairwise combinations of the items in table F1. We’ll then remove any itemset whose support value is less than two.
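A sketch of this second pass, continuing the hypothetical code above (and reusing the `support_count` helper defined earlier):

```python
from itertools import combinations

# Join step: every pair of items that survived the first pass becomes
# a size-2 candidate (this is C2).
items = sorted(i for s in f1 for i in s)  # [1, 2, 3, 5]
c2 = [frozenset(pair) for pair in combinations(items, 2)]

# Keep only the candidates that meet the minimum support (this is F2).
f2 = {s: support_count(s, transactions) for s in c2
      if support_count(s, transactions) >= MIN_SUPPORT}
print(f2)  # {1,2} has support 1, so it is removed
```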

C2 (only contains items present in F1):

Itemset   Support
{1,2}     1
{1,3}     3
{1,5}     2
{2,3}     2
{2,5}     3
{3,5}     3

F2 (after we remove itemsets with support values lower than 2):

Itemset   Support
{1,3}     3
{1,5}     2
{2,3}     2
{2,5}     3
{3,5}     3

Now, we’ll perform pruning. We form the size-3 candidate itemsets (C3), split each candidate into its size-2 subsets, and remove any candidate that has a subset which is not frequent, i.e., not present in F2.
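Continuing the sketch above, the prune step needs only membership checks against F2, with no extra pass over the database:

```python
# Prune step: a size-3 candidate survives only if every one of its
# size-2 subsets is frequent (i.e., appears in F2).
c3 = [frozenset(c) for c in combinations(items, 3)]

def all_subsets_frequent(candidate, frequent_prev):
    return all(frozenset(sub) in frequent_prev
               for sub in combinations(candidate, len(candidate) - 1))

pruned_c3 = [c for c in c3 if all_subsets_frequent(c, f2)]
print(pruned_c3)  # only {1,3,5} and {2,3,5} survive
```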

C3 (after we perform pruning):

Itemset    Size-2 subsets          All in F2?
{1,2,3}    {1,2}, {1,3}, {2,3}     No
{1,2,5}    {1,2}, {1,5}, {2,5}     No
{1,3,5}    {1,3}, {1,5}, {3,5}     Yes
{2,3,5}    {2,3}, {2,5}, {3,5}     Yes

In the third iteration, we’ll discard {1,2,3} and {1,2,5} because they both contain {1,2}, which is not frequent. This pruning of candidates with infrequent subsets is the main advantage of the Apriori algorithm.

F3 (after we discard {1,2,3} and {1,2,5}):

Itemset    Support
{1,3,5}    2
{2,3,5}    2

In the fourth iteration, we’ll join the sets of F3 to create C4. However, its only candidate, {1,2,3,5}, has a support value lower than 2, so we don’t proceed further, and the final frequent itemsets are the ones in F3.

C4:

Itemset      Support
{1,2,3,5}    1

From F3, we’ve got the following itemsets and their non-empty proper subsets:

For I = {1,3,5}, the subsets are {1}, {3}, {5}, {1,3}, {1,5}, {3,5}

For I = {2,3,5}, the subsets are {2}, {3}, {5}, {2,3}, {2,5}, {3,5}

Now, we’ll create and apply rules on the itemsets in F3. For that purpose, we’ll assume that the minimum confidence value is 60%. For every non-empty proper subset S of a frequent itemset I, we output the rule:

  • S -> (I - S) (this means S recommends I - S)
  • if support(I) / support(S) >= the minimum confidence value
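Here is a sketch of this rule-generation step, reusing the hypothetical `support_count` helper from earlier; the manual walkthrough below does the same calculation by hand.

```python
from itertools import combinations

MIN_CONF = 0.6  # 60% minimum confidence, as assumed above

def rules_from_itemset(itemset, transactions):
    """Yield every rule S -> I - S whose confidence meets MIN_CONF."""
    I = frozenset(itemset)
    for r in range(1, len(I)):
        for S in map(frozenset, combinations(I, r)):
            conf = support_count(I, transactions) / support_count(S, transactions)
            if conf >= MIN_CONF:
                yield set(S), set(I - S), conf

for lhs, rhs, conf in rules_from_itemset({1, 3, 5}, transactions):
    print(lhs, "->", rhs, f"({conf:.0%})")
```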

Let’s do this for the first frequent itemset, I = {1,3,5}.

Rule no. 1: {1,3} -> ({1,3,5} - {1,3}), which means 1 & 3 -> 5

Confidence value = support(1,3,5) / support(1,3) = 2/3 = 66.67%

As the result is higher than 60%, we select Rule no. 1.

Rule no. 2: {1,5} -> ({1,3,5} - {1,5}), which means 1 & 5 -> 3

Confidence value = support(1,3,5) / support(1,5) = 2/2 = 100%

As the result is higher than 60%, we select Rule no. 2.

Rule no. 3: {3} -> ({1,3,5} - {3}), which means 3 -> 1 & 5

Confidence value = support(1,3,5) / support(3) = 2/4 = 50%

As the result is lower than 60%, we reject Rule no. 3.

The example above shows how the Apriori algorithm creates and applies rules. You can follow the same steps for the second itemset, {2,3,5}; working through it yourself is a good way to understand which rules the algorithm accepts and which ones it rejects. The algorithm itself stays the same no matter where you implement it, for example in a Python implementation of Apriori.
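If you’d rather not hand-roll the algorithm in Python, one commonly used implementation lives in the third-party mlxtend library; the sketch below runs it on the same toy data. Note that mlxtend expects support as a fraction of transactions, so a support count of 2 out of 5 becomes min_support=0.4.

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

transactions = [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5], [1, 3, 5]]

# One-hot encode the transactions into a boolean DataFrame.
te = TransactionEncoder()
onehot = te.fit(transactions).transform(transactions)
df = pd.DataFrame(onehot, columns=te.columns_)

# Frequent itemsets with support >= 2/5, then rules with confidence >= 60%.
frequent = apriori(df, min_support=0.4, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "confidence"]])
```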

Conclusion

After reading this article, you should be quite familiar with the Apriori algorithm and its applications. Its use in recommendation systems has made it quite popular.

If you are curious to learn about data science and its algorithms, check out IIIT-B & upGrad’s PG Diploma in Data Science, which is created for working professionals and offers 10+ case studies & projects, practical hands-on workshops, mentorship with industry experts, 1-on-1 sessions with industry mentors, 400+ hours of learning, and job assistance with top firms.
