
Decision Tree Regression Functionality, Terms, Implementation [With Example]

Last updated:
24th Dec, 2020

To begin with, a regression model takes input features and outputs a numeric value. A classification model, by contrast, assigns each input to one of the classes or groups defined by the problem statement.


The number of classes can be as small as 2 or as large as 1000 or more. There are many regression models, such as linear regression, multivariate regression, and Ridge regression. (Logistic regression, despite its name, is normally used for classification.) Decision tree regression also belongs to this pool of regression models.

A decision tree can either classify or predict a numeric value, and it uses binary rules to determine the output or target value. As the name suggests, it is a tree-like model with nodes, branches, and leaves.
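As a rough illustration of what "binary rules" means here, a fitted regression tree behaves like a chain of yes/no questions that ends in a number. The function below is a hand-written sketch, not a trained model: the attribute names, thresholds, and salary figures are all invented for the example.

```python
def predict_salary(years_in_company: float, is_manager: bool) -> float:
    """Toy regression tree: each internal node asks one yes/no question."""
    if years_in_company <= 3:        # root node
        if is_manager:               # child node
            return 52_000.0          # leaf: a numeric prediction
        return 38_000.0              # leaf
    if years_in_company <= 8:        # child node
        return 61_000.0              # leaf
    return 75_000.0                  # leaf


print(predict_salary(2, False))
print(predict_salary(10, True))
```

Every path from the root ends in a leaf that holds a number, which is what separates a regression tree from a classification tree whose leaves hold class labels.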


Terminologies to Remember

Before we delve into the algorithm, here are some important terms to be aware of.

  1. Root node: The topmost node, where the splitting begins.
  2. Splitting: The process of subdividing a node into two or more sub-nodes.
  3. Terminal node or leaf node: A node that does not split further.
  4. Pruning: The process of removing sub-nodes from a node.
  5. Parent node: A node that splits into sub-nodes.
  6. Child node: A sub-node that emerges from a parent node.
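The terms above map naturally onto a small data structure. The sketch below is only one possible way to model them; the `Node` class and its `split`, `prune`, and `is_leaf` methods are hypothetical names chosen for this illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list[Node] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # A terminal (leaf) node has no children.
        return not self.children

    def split(self, labels: list[str]) -> list[Node]:
        # Splitting turns this node into a parent with one child per label.
        self.children = [Node(label) for label in labels]
        return self.children

    def prune(self) -> None:
        # Pruning removes the sub-nodes, turning this node back into a leaf.
        self.children = []


root = Node("all salaries")                        # root node
nov, dec = root.split(["November", "December"])    # root becomes a parent
nov.split(["hourly", "monthly"])                   # November becomes a parent
print(root.is_leaf(), nov.is_leaf(), dec.is_leaf())

nov.prune()                                        # November is a leaf again
print(nov.is_leaf())
```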

How does it work?

The decision tree breaks the data set down into smaller and smaller subsets. A decision node splits into two or more branches, each representing a value of the attribute under examination. The topmost decision node, which corresponds to the best predictor, is called the root node. The tree is built with ID3, an algorithm originally designed for classification; for regression, its information gain measure is replaced with standard deviation reduction.

It employs a top-down, greedy approach, and splits are chosen based on standard deviation. As a quick revision, standard deviation measures the dispersion of a set of data points around their mean value. It quantifies the overall variability of the data.

The greater the spread of the data points around the mean, the higher the standard deviation. We use standard deviation to measure the homogeneity of a sample. If the sample is completely homogeneous, its standard deviation is zero.

Similarly, the higher the degree of heterogeneity, the greater the standard deviation. The mean of the sample and the number of samples are required to calculate the standard deviation. To decide when the splitting should stop, we use the coefficient of variation (CV), which this article also calls the coefficient of deviation. It is calculated by dividing the standard deviation by the mean of the samples.
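A minimal sketch of these two quantities, using Python's standard `statistics` module. The sample values are made up, and the population standard deviation (`pstdev`) is an assumption here; the article does not say whether the population or sample form is intended.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean, expressed as a percentage."""
    return pstdev(values) / mean(values) * 100


uniform = [40, 40, 40, 40]   # completely homogeneous sample
spread = [20, 30, 50, 60]    # same mean, but heterogeneous

print(pstdev(uniform))                    # 0.0 for a homogeneous sample
print(pstdev(spread))
print(coefficient_of_variation(spread))
```

Both samples have a mean of 40, so the difference in standard deviation comes entirely from how far the values sit from that mean.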


The value predicted at a leaf node is the average of the target values that fall in that leaf. For example, suppose November is a node that holds the salaries paid in November of each year up to 2020. The predicted salary for November 2021 would be the average of all the salaries under the November node.
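In code, the November example comes down to a single average. The salary figures below are invented purely to make the arithmetic concrete.

```python
from statistics import mean

# Hypothetical November salaries that ended up in the same leaf node.
november_salaries = {2017: 3000, 2018: 3200, 2019: 3300, 2020: 3500}

# The leaf predicts the average of the target values it holds.
prediction_2021 = mean(list(november_salaries.values()))
print(prediction_2021)  # 3250
```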

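Next, consider the standard deviation for two attributes: the target and a predictor attribute. In the above example, salary is the target, and a predictor could be whether the salary is paid on an hourly or a monthly basis. For a target T split on an attribute X, the formula is:

$$S(T, X) = \sum_{c \in X} P(c)\, S(c)$$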

where P(c) is the probability of occurrence of value c of the attribute (the fraction of samples that take that value) and S(c) is the standard deviation of the target within that group. Standard deviation reduction is the decrease in standard deviation after the dataset has been split on an attribute.
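The weighted standard deviation described here can be sketched as follows. The helper name `weighted_std`, the row layout, and the data are assumptions made for the example, and the population standard deviation is used.

```python
from collections import defaultdict
from statistics import pstdev


def weighted_std(rows, attribute, target):
    """Sum over attribute values c of P(c) * S(c)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    n = len(rows)
    # P(c) is the share of rows with value c; S(c) is the std within that group.
    return sum(len(values) / n * pstdev(values) for values in groups.values())


rows = [
    {"pay_basis": "hourly", "salary": 20},
    {"pay_basis": "hourly", "salary": 30},
    {"pay_basis": "monthly", "salary": 50},
    {"pay_basis": "monthly", "salary": 60},
]

print(weighted_std(rows, "pay_basis", "salary"))
```

Each group has a standard deviation of 5 and holds half of the rows, so the weighted value is 0.5 * 5 + 0.5 * 5 = 5.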

To construct an accurate decision tree, the goal is to find the attribute that returns the highest standard deviation reduction, in other words, the most homogeneous branches.
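Choosing the attribute with the highest standard deviation reduction might look like the sketch below. The function name `std_reduction`, the attribute names, and the data are hypothetical; the block is self-contained so it can be run on its own.

```python
from collections import defaultdict
from statistics import pstdev


def std_reduction(rows, attribute, target):
    """Std of the target before the split minus the weighted std after it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    before = pstdev([row[target] for row in rows])
    after = sum(len(v) / len(rows) * pstdev(v) for v in groups.values())
    return before - after


rows = [
    {"pay_basis": "hourly", "mood": "good", "salary": 20},
    {"pay_basis": "hourly", "mood": "bad", "salary": 30},
    {"pay_basis": "monthly", "mood": "good", "salary": 50},
    {"pay_basis": "monthly", "mood": "bad", "salary": 60},
]

reductions = {a: std_reduction(rows, a, "salary") for a in ("pay_basis", "mood")}
best = max(reductions, key=reductions.get)
print(reductions)
print(best)  # pay_basis
```

Splitting on `pay_basis` leaves two tight groups, while splitting on `mood` leaves two groups that are almost as spread out as the full data set, so `pay_basis` gives the larger reduction.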

The process of creating a decision tree for regression covers the following steps.

1. First, we calculate the standard deviation of the target variable. Take the target variable to be salary, as in the previous examples. We then calculate the standard deviation of the set of salary values.

2. In step 2, the data set is split on each of the different attributes. Since the target value is salary, the possible attributes could be months, hours, the mood of the boss, designation, years in the company, and so on. The standard deviation for each split is then calculated using the above formula. The standard deviation so obtained is subtracted from the standard deviation before the split. The result is called the standard deviation reduction.

3. Once the difference has been calculated as described in the previous step, the best attribute is the one with the largest standard deviation reduction. That means the standard deviation before the split should be greater than the weighted standard deviation after the split. When population standard deviations are used, splitting cannot increase the weighted value, so the reduction is never negative.

4. The dataset is divided based on the values of the selected attribute. On the non-leaf branches, this process is repeated recursively until all the data is processed. Suppose month is selected as the best splitting attribute based on the standard deviation reduction value. We will then have 12 branches, one for each month. Each branch is split further on the best attribute from the remaining set of attributes.

5. In practice, we need a stopping criterion. For this, we use the coefficient of variation (CV): when the CV of a branch falls below a certain threshold, such as 10%, we stop splitting that branch. Because no further splitting happens, the value predicted for that branch is the average of all the target values under that node.
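The steps above can be put together in a short recursive sketch. This is an illustrative toy for categorical attributes, not how scikit-learn builds its trees; the function names, the nested-dict tree layout, and the data are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def std_reduction(rows, attribute, target):
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    after = sum(len(v) / len(rows) * pstdev(v) for v in groups.values())
    return pstdev([row[target] for row in rows]) - after


def build_tree(rows, attributes, target, cv_threshold=10.0):
    values = [row[target] for row in rows]
    average = mean(values)
    # Step 1: standard deviation of the target, expressed here as the CV.
    cv = pstdev(values) / average * 100 if average else 0.0
    # Step 5: stop when the branch is homogeneous enough or nothing is left.
    if cv < cv_threshold or not attributes:
        return average  # leaf: predict the average of the node
    # Steps 2 and 3: pick the attribute with the largest reduction.
    best = max(attributes, key=lambda a: std_reduction(rows, a, target))
    remaining = [a for a in attributes if a != best]
    # Step 4: divide the rows by the chosen attribute and recurse.
    branches = defaultdict(list)
    for row in rows:
        branches[row[best]].append(row)
    return {
        "attribute": best,
        "default": average,  # used for attribute values not seen in training
        "branches": {
            value: build_tree(subset, remaining, target, cv_threshold)
            for value, subset in branches.items()
        },
    }


def predict(tree, row):
    # Walk down the tree until a leaf (a plain number) is reached.
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["attribute"]], tree["default"])
    return tree


rows = [
    {"pay_basis": "hourly", "mood": "good", "salary": 20},
    {"pay_basis": "hourly", "mood": "bad", "salary": 30},
    {"pay_basis": "monthly", "mood": "good", "salary": 50},
    {"pay_basis": "monthly", "mood": "bad", "salary": 60},
]
tree = build_tree(rows, ["pay_basis", "mood"], "salary")
print(tree["attribute"])
print(predict(tree, {"pay_basis": "monthly", "mood": "good"}))
print(predict(tree, {"pay_basis": "hourly", "mood": "bad"}))
```

With this data the monthly branch has a CV of about 9%, which is under the 10% threshold, so it becomes a leaf that predicts the average of 50 and 60. The hourly branch has a CV of 20% and is split once more on `mood`.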

Implementation

Decision tree regression can be implemented in Python with the scikit-learn library, where it is available as sklearn.tree.DecisionTreeRegressor.

Some of the important parameters are as follows:

  1. criterion: Measures the quality of a split. When this article was written, the value could be “mse” (mean squared error), “friedman_mse”, or “mae” (mean absolute error), with “mse” as the default. Since scikit-learn 1.0, “mse” and “mae” are named “squared_error” and “absolute_error”, and the old names were removed in version 1.2.
  2. max_depth: The maximum depth of the tree. Default value is None, which lets the tree grow until the other stopping conditions are met.
  3. max_features: The number of features to consider when looking for the best split. Default value is None, which means all features are considered.
  4. splitter: The strategy used to choose the split at each node. Available values are “best” and “random”. Default value is “best”.


Example from sklearn documentation

>>> from sklearn.datasets import load_diabetes
>>> from sklearn.model_selection import cross_val_score
>>> from sklearn.tree import DecisionTreeRegressor
>>> X, y = load_diabetes(return_X_y=True)
>>> regressor = DecisionTreeRegressor(random_state=0)
>>> cross_val_score(regressor, X, y, cv=10)
array([-0.39..., -0.46...,  0.02...,  0.06..., -0.50...,
        0.16...,  0.11..., -0.73..., -0.30..., -0.00...])
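For a regressor, `cross_val_score` reports the R² score of each fold by default, so a negative value means that fold did worse than always predicting the mean. As a hedged follow-up sketch, the snippet below compares the average score of a fully grown tree with a depth-limited one; the choice of `max_depth=3` is arbitrary, and the exact numbers depend on the scikit-learn version.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

X, y = load_diabetes(return_X_y=True)

results = {}
for name, depth in (("unlimited depth", None), ("max_depth=3", 3)):
    model = DecisionTreeRegressor(max_depth=depth, random_state=0)
    scores = cross_val_score(model, X, y, cv=10)  # one R^2 value per fold
    results[name] = scores.mean()

for name, average_r2 in results.items():
    print(name, average_r2)
```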


What Next?

If you’re interested in learning more about machine learning, check out IIIT-B & upGrad’s Executive PG Programme in Machine Learning & AI, which is designed for working professionals and offers 450+ hours of rigorous training, 30+ case studies & assignments, IIIT-B Alumni status, 5+ practical hands-on capstone projects & job assistance with top firms.


Pavan Vadapalli

Blog Author
Director of Engineering @ upGrad. Motivated to leverage technology to solve problems. Seasoned leader for startups and fast moving orgs. Working on solving problems of scale and long term technology strategy.