# Linear Regression Using Gradient Descent Method [Explained With Coding Example]

Machine learning remains relevant whether you aspire to be a software developer, data scientist, or data analyst. To work seriously with linear regression, you should be comfortable with Python. Getting started can feel tedious, so this article walks through regression step by step.

Gradient descent is widely used because it is a simple, general-purpose optimization technique: the same algorithm underpins the training of models such as logistic regression and neural networks. Before diving into gradient descent, let's take a quick look at linear regression.

## What is Linear Regression?

Linear regression is a linear approach to modelling the relationship between a dependent variable and one or more independent variables. For a single independent variable, that relationship is expressed by the equation of a line, y = mx + b.
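As a quick sketch of that equation, the snippet below (with an arbitrarily chosen slope and intercept) evaluates y = mx + b over a few x values using NumPy:

```python
import numpy as np

# A hypothetical line with slope m = 2 and intercept b = 1.
m, b = 2.0, 1.0
x = np.array([0.0, 1.0, 2.0, 3.0])
y = m * x + b  # predicted y values along the line
print(y)  # [1. 3. 5. 7.]
```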

It is a supervised machine learning algorithm: it learns from example pairs of an independent variable x and a dependent variable y, and uses that learned relationship to predict y for new values of x.

Gradient descent optimizes the parameters of the model by iteratively adjusting them in the direction that minimizes a cost function.
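A common choice of cost function for linear regression is the mean squared error. The toy sketch below (the function name and data are illustrative, not from the article) shows that the cost is zero when the line fits the data perfectly and larger when it does not:

```python
import numpy as np

def mse_cost(m, b, x, y):
    """Mean squared error of the line y = m*x + b against data (x, y)."""
    predictions = m * x + b
    return np.mean((y - predictions) ** 2)

# Toy data that lies exactly on y = 2x + 1, so the cost there is zero.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])
print(mse_cost(2.0, 1.0, x, y))  # 0.0
print(mse_cost(0.0, 0.0, x, y))  # larger, since this line fits the data badly
```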

To build intuition, imagine you need to descend a hill in pitch-dark surroundings. You cannot see the path, so you feel the slope of the ground under your feet and take a small step in the steepest downhill direction, repeating until you reach the bottom. Gradient descent works the same way: at each iteration it measures the slope (the gradient) of the cost function and moves the parameters a small step in the direction that decreases it.

Gradient descent is one of the most widely used algorithms in machine learning, and understanding it early lays a solid foundation for mastering the field.

For a concrete picture of gradient descent, let's implement it in Python with the help of NumPy.

```python
import numpy as np

# y = mx + b, where m is the slope and b is the y-intercept.

def compute_error_for_line_given_points(b, m, points):
    total_error = 0
    for i in range(len(points)):
        x = points[i, 0]
        y = points[i, 1]
        total_error += (y - (m * x + b)) ** 2
    return total_error / float(len(points))

def step_gradient(b_current, m_current, points, learning_rate):
    # Accumulate the partial derivatives of the mean squared error
    # with respect to b and m over all points.
    b_gradient = 0
    m_gradient = 0
    n = float(len(points))
    for i in range(len(points)):
        x = points[i, 0]
        y = points[i, 1]
        b_gradient += -(2 / n) * (y - ((m_current * x) + b_current))
        m_gradient += -(2 / n) * x * (y - ((m_current * x) + b_current))
    # Step each parameter against its gradient, scaled by the learning rate.
    new_b = b_current - (learning_rate * b_gradient)
    new_m = m_current - (learning_rate * m_gradient)
    return [new_b, new_m]

def gradient_descent_runner(points, starting_b, starting_m, learning_rate, num_iterations):
    b = starting_b
    m = starting_m
    for i in range(num_iterations):
        b, m = step_gradient(b, m, np.array(points), learning_rate)
    return [b, m]

def run():
    points = np.genfromtxt("data.csv", delimiter=",")
    learning_rate = 0.0001
    initial_b = 0  # initial y-intercept guess
    initial_m = 0  # initial slope guess
    num_iterations = 1000
    print("Starting gradient descent at b = {0}, m = {1}, error = {2}".format(
        initial_b, initial_m,
        compute_error_for_line_given_points(initial_b, initial_m, points)))
    print("Running...")
    [b, m] = gradient_descent_runner(points, initial_b, initial_m, learning_rate, num_iterations)
    print("After {0} iterations b = {1}, m = {2}, error = {3}".format(
        num_iterations, b, m,
        compute_error_for_line_given_points(b, m, points)))

if __name__ == '__main__':
    run()
```


This program fits a line to a set of points by gradient descent. It searches for the slope and intercept that best describe the data, with gradient descent finding the values of those parameters that minimize an error function.

The code contains a function labeled `run`. It defines the hyperparameters of the algorithm: the initial guesses for the intercept and slope, the learning rate, and the number of iterations, in the format shown below:

```python
initial_b = 0  # initial y-intercept guess
initial_m = 0  # initial slope guess
num_iterations = 1000
```

As you can see, gradient descent is quite simple and straightforward. Once you understand how it works, the only part you need to adapt is the cost function you are interested in optimizing.

The goal is to iterate: evaluate the cost at the current parameter values, then update the parameters to new values that yield a lower cost, and repeat until the cost stops improving.


### 1. Learning Rate

The learning rate controls the size of each update step and is usually a small decimal value. Try several values and keep the one that works best for your data. To keep training fast, also limit the number of passes (epochs) through the dataset; a small number, typically between 1 and 10, is often enough.
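To see why the learning rate matters, the sketch below (a made-up one-parameter example, not the article's code) minimizes f(m) = (m - 3)^2 with two different rates. A tiny rate barely moves after 50 steps, while a moderate rate converges close to the minimum at m = 3:

```python
def descend(learning_rate, num_steps=50):
    """Minimize f(m) = (m - 3)^2 by gradient descent; the gradient is 2*(m - 3)."""
    m = 0.0
    for _ in range(num_steps):
        m -= learning_rate * 2 * (m - 3)
    return m

print(descend(0.0001))  # still far from 3 after 50 steps
print(descend(0.1))     # close to the minimum at 3
```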

### 2. Plot Mean Cost

Plotting the cost after every single update can be noisy and slow, since the extra bookkeeping adds time to each pass of the algorithm. For better results, plot the cost averaged over the last 100 or 1000 updates; the smoothed curve gives you better odds of seeing the learning trend of the algorithm.
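The averaging idea can be sketched as follows, using simulated per-update costs (the data here is fabricated noise around a decreasing trend, purely for illustration):

```python
import numpy as np

# Hypothetical per-update costs recorded during training: noisy but decreasing.
rng = np.random.default_rng(0)
costs = np.linspace(10.0, 1.0, 1000) + rng.normal(0, 0.5, 1000)

# Averaging over windows of 100 updates smooths the noise so the trend is visible.
window = 100
mean_costs = costs.reshape(-1, window).mean(axis=1)
print(mean_costs)  # ten averaged costs, trending downward
```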

## Summary

In this article you learned what gradient descent is and how to implement it, which lets you make more precise and effective predictions with a learned regression model. To understand the topic at a deeper level with real case scenarios, enroll with upGrad. We offer curated courses specially structured for aspiring Data Scientists and Machine Learning practitioners.
