Education (UG/PG) Programs for Professionals, Online Degree Courses

Step-by-Step Guide to Implementing Linear Regression with Python

by Vamshi Krishna sanga
August 5, 2025
in Machine Learning & AI

Linear regression is a widely used statistical technique for modelling the relationship between variables. In its simplest form, it relates one input variable to one output variable.

This article covers the key steps for implementing simple linear regression in Python with the Sklearn library. We’ll use a salary dataset to predict salaries based on years of experience.

Read on to learn this versatile machine-learning technique.  

Understanding Linear Regression 

Linear regression establishes a linear relationship between a dependent variable (y) and one or more independent variables (x). The goal is to find the best linear model that predicts the value of y from x. 

The linear equation takes the form:

y = mx + b

Where m is the slope and b is the y-intercept.
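
As a rough illustration of what "best" line means, the sketch below computes m and b with the ordinary least squares formulas, which is the criterion Sklearn's LinearRegression also uses. The five data points are hypothetical and chosen to lie exactly on a line; they are not taken from the salary dataset.

```python
import numpy as np

# Hypothetical toy data (not the article's dataset):
# years of experience and the corresponding salary.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([40000.0, 45000.0, 50000.0, 55000.0, 60000.0])

# Ordinary least squares for a single feature:
#   m = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   b = mean_y - m * mean_x
x_mean, y_mean = x.mean(), y.mean()
m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
b = y_mean - m * x_mean

print(f"slope m = {m:.1f}, intercept b = {b:.1f}")
# prints: slope m = 5000.0, intercept b = 35000.0
```

Because the toy points are perfectly linear, the fitted line passes through all of them; with real data the line only minimises the total squared error.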

The key assumptions are:

  1. Linear relationship: The dependent variable changes linearly with respect to changes in the independent variable, so a straight line captures the relationship well.
  2. No multicollinearity: When there are several independent variables, they should not be highly correlated with one another.
  3. Normally distributed residuals: The differences between the observed and predicted values of y (residuals) should follow a normal distribution centred around 0.
  4. Homoscedasticity: The variance of residuals should not change substantially across the range of values for the independent variables.

In Python, the Sklearn library (scikit-learn, imported as sklearn) provides easy tools for building linear regression models. We’ll use it step by step to build our model.

Import Libraries and Data 

First, we import pandas and numpy for data manipulation, matplotlib and seaborn for visualisation, and the sklearn modules for modelling. We then load the salary dataset into a pandas DataFrame.

The dataset records the number of years each individual has worked (x) and their salary (y). We will use years of experience to predict salary.
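
The loading step might look like the sketch below. The article does not name the data file or its columns, so the column names `YearsExperience` and `Salary`, the file name in the comment, and the synthetic values are all assumptions made for illustration. Only pandas and numpy are imported here to keep the example short.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the salary dataset: 30 synthetic rows.
# With a real file you would load it with something like
#   df = pd.read_csv("salary_data.csv")
# (the file name and column names are assumptions, not from the article).
rng = np.random.default_rng(0)
years = np.round(np.linspace(1.0, 10.5, 30), 1)
salary = 25000 + 9500 * years + rng.normal(0, 4000, size=years.size)

df = pd.DataFrame({"YearsExperience": years, "Salary": salary.round(0)})

# Quick sanity checks before modelling.
print(df.shape)         # number of rows and columns
print(df.head())        # first few records
print(df.isna().sum())  # missing values per column
```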

Split Data into Training and Test Sets  

We first separate the independent variable (years of experience) and the dependent variable (salary) into X and y arrays for modelling.

We then divide the dataset into a training set and a test set in an 80:20 ratio using Sklearn’s train_test_split function. The model will be trained on the training data, while the test data is held back to evaluate performance later.
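
A minimal sketch of the split, using synthetic data in place of the salary dataset (the values and the `random_state` seed are assumptions). Note that X is reshaped to two dimensions, because Sklearn estimators expect inputs of shape (n_samples, n_features).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data standing in for the salary dataset.
rng = np.random.default_rng(0)
X = np.linspace(1.0, 10.5, 30).reshape(-1, 1)  # years of experience, 2-D
y = 25000 + 9500 * X.ravel() + rng.normal(0, 4000, size=30)  # salary

# test_size=0.2 gives the 80:20 split; random_state makes it reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
# prints: (24, 1) (6, 1)
```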

Train the Linear Regression Model    

Sklearn provides the LinearRegression class for training linear regression models. We initialise a LinearRegression estimator and call its .fit() method, passing in the training data (X_train and y_train).

The .fit() method learns the linear relationship between years of experience (input) and salaries (output) from the training data. In other words, it estimates the slope and intercept of the line that best fits the training points.
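
The training step could be sketched as follows. The data is synthetic (generated with a true slope of 9500 and intercept of 25000, both assumptions), so the fitted values should land near those numbers rather than match any real salary figures.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data standing in for the salary dataset.
rng = np.random.default_rng(0)
X = np.linspace(1.0, 10.5, 30).reshape(-1, 1)
y = 25000 + 9500 * X.ravel() + rng.normal(0, 4000, size=30)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Create the estimator and fit it on the training data only.
model = LinearRegression()
model.fit(X_train, y_train)

# coef_ holds one slope per feature; intercept_ is the y-intercept.
print(f"slope m: {model.coef_[0]:.2f}")
print(f"intercept b: {model.intercept_:.2f}")
```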

Make Predictions on Test Data 

Now that the model is trained, it can be used to make predictions on new test data. 

The test input data (X_test) is passed to the .predict() method to predict the output values. This generates the model’s predicted salaries (y_pred_test) corresponding to the test years of experience data.

Predictions on training data (X_train) are also made for comparison.
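
A short sketch of the prediction step on synthetic stand-in data (the values are assumptions). It also shows that a prediction is simply the learned equation y = mx + b evaluated at the input.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data standing in for the salary dataset.
rng = np.random.default_rng(0)
X = np.linspace(1.0, 10.5, 30).reshape(-1, 1)
y = 25000 + 9500 * X.ravel() + rng.normal(0, 4000, size=30)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

# Predicted salaries for the unseen test inputs and for the training inputs.
y_pred_test = model.predict(X_test)
y_pred_train = model.predict(X_train)

# Prediction for a single new value: 5 years of experience.
# The input must still be 2-D, hence the double brackets.
pred_5_years = model.predict(np.array([[5.0]]))[0]
by_hand = model.coef_[0] * 5.0 + model.intercept_

print(f"model.predict: {pred_5_years:.2f}, m*x + b: {by_hand:.2f}")
```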

Evaluate Model Performance 

To determine model effectiveness, we compare predicted salaries to actual salaries visually and numerically:

Visual Evaluation 

We plot the training data and draw the regression line obtained from training. We also plot the test data to inspect visually how well the line fits new, unseen data. If the test points lie close to the line, the model is making reasonable predictions.
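
One way to draw such a plot with matplotlib is sketched below. The data is synthetic and the styling choices (marker, colours, figure size) are assumptions, not the article's original figure.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data standing in for the salary dataset.
rng = np.random.default_rng(0)
X = np.linspace(1.0, 10.5, 30).reshape(-1, 1)
y = 25000 + 9500 * X.ravel() + rng.normal(0, 4000, size=30)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)

fig, ax = plt.subplots(figsize=(7, 4))

# Training and test points as two separate scatter series.
ax.scatter(X_train, y_train, label="Training data")
ax.scatter(X_test, y_test, marker="x", color="red", label="Test data")

# Regression line evaluated over the full range of experience values.
x_line = np.linspace(X.min(), X.max(), 100).reshape(-1, 1)
ax.plot(x_line, model.predict(x_line), color="black", label="Regression line")

ax.set_xlabel("Years of experience")
ax.set_ylabel("Salary")
ax.legend()
fig.tight_layout()
# In a script or notebook, call plt.show() to display the figure.
```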

Numerical Evaluation

Key metrics are:

  1. Coefficient (m) and intercept (b) values: The linear equation learned by the model shows how much the predicted salary changes for each additional year of experience.
  2. Residuals (differences between actual and predicted salaries): Smaller residuals indicate a better fit, and the fitting procedure aims to minimise them.
  3. Score metrics: Mean absolute error, mean squared error and R-squared quantify model performance numerically. We omit them here for simplicity.
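
Although the walkthrough leaves the score metrics out, readers who want the numbers could compute them along these lines with sklearn.metrics. The data is synthetic, so the printed values only illustrate the calls and say nothing about the real salary dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data standing in for the salary dataset.
rng = np.random.default_rng(0)
X = np.linspace(1.0, 10.5, 30).reshape(-1, 1)
y = 25000 + 9500 * X.ravel() + rng.normal(0, 4000, size=30)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = LinearRegression().fit(X_train, y_train)
y_pred_test = model.predict(X_test)

# Residuals: actual minus predicted salary for each test record.
residuals = y_test - y_pred_test

mae = mean_absolute_error(y_test, y_pred_test)  # average absolute error
mse = mean_squared_error(y_test, y_pred_test)   # average squared error
rmse = np.sqrt(mse)                             # same units as salary
r2 = r2_score(y_test, y_pred_test)              # share of variance explained

print(f"MAE:  {mae:.2f}")
print(f"MSE:  {mse:.2f}")
print(f"RMSE: {rmse:.2f}")
print(f"R^2:  {r2:.3f}")
```

MAE and RMSE are expressed in the same units as salary, which makes them easier to interpret than MSE, while R-squared is unitless and at most 1.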

There is potential to improve performance by tuning parameters, adding polynomial terms, trying other algorithms, etc. However, our basic model sufficiently demonstrates the linear regression workflow.

Conclusion 

This walkthrough covered the essential steps for linear regression in Python: importing data, splitting it into train/test sets, training a model, making predictions, and evaluating performance. We used the sklearn library to build a working model to predict salaries quickly with Python.
