World Happiness Report Analysis in R With Code

Updated on Aug 05, 2025 | 11 min read | 1.8K+ views

Table of Contents

What You Must Understand Before Starting the World Happiness Report Analysis in R
The Tools and R Libraries Used In This Analysis
How Long Does It Take and How Hard Is This Project
A Breakdown of the World Happiness Report Analysis Project Using R
Conclusion

This project analyzes the 2019 World Happiness Report dataset in R. All of the R code is run in Google Colab.

The analysis looks at the global happiness scores and their relationship with key factors like GDP, social support, life expectancy, and corruption. Through data cleaning, visualization, and correlation analysis, the project identifies trends and patterns that influence national well-being.

What You Must Understand Before Starting the World Happiness Report Analysis in R

Here are the key concepts and basics you should know:

Basic R Syntax: Understand how to write and run R code, especially in Google Colab.
Working with Data Frames: Learn how to load, view, and manipulate tabular data.
Data Cleaning: Know how to check for and handle missing or inconsistent data.
Exploratory Data Analysis (EDA): Be familiar with summarizing and visualizing data to uncover insights.
Correlation Analysis: Understand how different variables relate to each other statistically.
Plotting with ggplot2: Learn how to create clear and informative visualizations using this powerful R package.

The Tools and R Libraries Used In This Analysis

Before starting the analysis, it's important to know the tools and libraries that make this project possible. Each component plays a specific role, making it easier to work with and understand the World Happiness dataset.

Tool / Library	Purpose
Google Colab	Cloud-based platform to run R code without needing local setup
R Language	Programming language used for data analysis and visualization
tidyverse	Collection of R packages for data wrangling and analysis
ggplot2	Visualization library for creating high-quality charts and graphs
readr	Helps in reading and writing CSV files efficiently
corrplot	Used to generate correlation matrix visualizations

How Long Does It Take and How Hard Is This Project

To help you plan better, here's a quick overview of the time commitment, complexity, and who this project is best suited for:

Aspect	Details
Estimated Duration	1–2 hours
Difficulty Level	Beginner-friendly
Skill Level Needed	Basic understanding of R and data manipulation

A Breakdown of the World Happiness Report Analysis Project Using R

The section below will explain the step-by-step process of building this World Happiness Report Analysis, along with the code and output for each step.

Step 1: Getting R Ready in Google Colab

Google Colab runs Python by default, so a quick setup is needed before writing any R code. The good news is, it only takes a few clicks to switch the environment.

To enable R in your notebook:

Launch a new notebook at Google Colab
Navigate to the top menu and select Runtime
Choose Change runtime type
In the dialog that appears, switch the language to R
Click Save, and you're all set to start coding in R
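
After switching the runtime, a quick check can confirm that the notebook cell is really executing R. The lines below are a minimal sketch that uses only base R; the exact version string and library paths depend on the Colab image and are not taken from the original project.

```r
# Print the R version string; this line only runs if the runtime is R
print(R.version.string)

# Show the folders where packages are installed in this session
print(.libPaths())
```

If the cell raises a Python syntax error instead, the runtime is most likely still set to Python.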

Step 2: Install and Load the Required R Packages

Before we can begin analyzing the data, we need to make sure the right tools are in place. R uses packages (similar to plugins) for various functions, like reading files, visualizing data, or creating charts. In this step, you'll install and load all the packages necessary for the project. The code is given below:

# Install the required packages (this needs to be done only once per Colab session)
install.packages("tidyverse")   # Includes dplyr, ggplot2, readr, and related packages
install.packages("corrplot")    # For creating correlation matrix plots

# Load the libraries into your R session
# ggplot2 and readr are attached by tidyverse, so they need no separate library() call
library(tidyverse)   # Attaches dplyr, ggplot2, readr, etc.
library(corrplot)    # For plotting correlations

The output for the above code confirms the installation and loading of the libraries and packages:

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.2 ✔ tibble 3.3.0
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.1.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
corrplot 0.95 loaded
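
Re-running the installation on every session can be slow, so one option is to install only the packages that are missing. The sketch below shows one way to do that; `missing_packages()` and `install_if_missing()` are hypothetical helper names introduced here for illustration, not part of the original notebook.

```r
# Return the subset of `pkgs` that is not installed yet (hypothetical helper)
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install only what is missing and return those names invisibly (hypothetical helper)
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) install.packages(to_install)
  invisible(to_install)
}

# Base packages ship with R, so nothing is reported as missing here
print(missing_packages(c("stats", "utils")))

# In the notebook you would call:
# install_if_missing(c("tidyverse", "corrplot"))
```

The `library()` calls are still needed afterwards, because installing a package does not attach it.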

Step 3: Load the World Happiness Dataset into R

With the packages ready, it's time to bring in the dataset. In this step, you'll read the CSV file into R so you can start exploring and analyzing the data. We'll also take a quick look at the contents and structure. Here’s the code:

# Read the CSV file and store it in a variable called 'happiness'
happiness <- read_csv("Happiness Report - 2019.csv")

# Display the first few rows of the dataset to understand how it looks
head(happiness)

# Show the names of all columns in the dataset
colnames(happiness)

The output of this code gives us an overview of the dataset:

Rows: 156 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Country or region
dbl (8): Overall rank, Score, GDP per capita, Social support, Healthy life e...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Overall rank	Country or region	Score	GDP per capita	Social support	Healthy life expectancy	Freedom to make life choices	Generosity	Perceptions of corruption
<dbl>	<chr>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>	<dbl>
1	Finland	7.769	1.340	1.587	0.986	0.596	0.153	0.393
2	Denmark	7.600	1.383	1.573	0.996	0.592	0.252	0.410
3	Norway	7.554	1.488	1.582	1.028	0.603	0.271	0.341
4	Iceland	7.494	1.380	1.624	1.026	0.591	0.354	0.118
5	Netherlands	7.488	1.396	1.522	0.999	0.557	0.322	0.298
6	Switzerland	7.480	1.452	1.526	1.052	0.572	0.263	0.343

'Overall rank'
'Country or region'
'Score'
'GDP per capita'
'Social support'
'Healthy life expectancy'
'Freedom to make life choices'
'Generosity'
'Perceptions of corruption'
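
The column-specification message above appears because `read_csv()` guesses the column types. A possible refinement is to check that the file exists and to declare the types explicitly. The sketch below assumes a small temporary file with only four of the nine columns (written here purely for illustration) instead of the real CSV.

```r
library(readr)

# Hypothetical two-row file that mimics part of the 2019 CSV header
demo_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Overall rank,Country or region,Score,GDP per capita",
  "1,Finland,7.769,1.340",
  "2,Denmark,7.600,1.383"
), demo_path)

# Stop early with a readable message if the file is missing
if (!file.exists(demo_path)) {
  stop("CSV not found: upload it to the Colab session first")
}

# Declare the text column explicitly and read every other column as numeric
demo <- read_csv(
  demo_path,
  col_types = cols(
    `Country or region` = col_character(),
    .default = col_double()
  )
)

print(demo)
```

With the real dataset, the same `col_types` argument could be passed when reading "Happiness Report - 2019.csv", which also silences the type-guessing message.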

Step 4: Explore the Structure and Quality of the Data

Before diving into analysis, it’s important to understand what the dataset contains. This step helps you examine the structure, check basic statistics, and identify any missing values that could affect the results. Here’s the code for this step:

# Check the structure of the dataset: data types and format of each column
str(happiness)

# Get summary statistics: min, max, mean, median for each numeric column
summary(happiness)

# Check for missing values in each column
colSums(is.na(happiness))

The output for this code is:

spc_tbl_ [156 × 9] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ Overall rank                : num [1:156] 1 2 3 4 5 6 7 8 9 10 ...
 $ Country or region           : chr [1:156] "Finland" "Denmark" "Norway" "Iceland" ...
 $ Score                       : num [1:156] 7.77 7.6 7.55 7.49 7.49 ...
 $ GDP per capita              : num [1:156] 1.34 1.38 1.49 1.38 1.4 ...
 $ Social support              : num [1:156] 1.59 1.57 1.58 1.62 1.52 ...
 $ Healthy life expectancy     : num [1:156] 0.986 0.996 1.028 1.026 0.999 ...
 $ Freedom to make life choices: num [1:156] 0.596 0.592 0.603 0.591 0.557 0.572 0.574 0.585 0.584 0.532 ...
 $ Generosity                  : num [1:156] 0.153 0.252 0.271 0.354 0.322 0.263 0.267 0.33 0.285 0.244 ...
 $ Perceptions of corruption   : num [1:156] 0.393 0.41 0.341 0.118 0.298 0.343 0.373 0.38 0.308 0.226 ...
 - attr(*, "spec")=
  .. cols(
  ..   `Overall rank` = col_double(),
  ..   `Country or region` = col_character(),
  ..   Score = col_double(),
  ..   `GDP per capita` = col_double(),
  ..   `Social support` = col_double(),
  ..   `Healthy life expectancy` = col_double(),
  ..   `Freedom to make life choices` = col_double(),
  ..   Generosity = col_double(),
  ..   `Perceptions of corruption` = col_double()
  .. )
 - attr(*, "problems")=<externalptr>

  Overall rank    Country or region      Score       GDP per capita
 Min.   :  1.00   Length:156         Min.   :2.853   Min.   :0.0000
 1st Qu.: 39.75   Class :character   1st Qu.:4.545   1st Qu.:0.6028
 Median : 78.50   Mode  :character   Median :5.380   Median :0.9600
 Mean   : 78.50                      Mean   :5.407   Mean   :0.9051
 3rd Qu.:117.25                      3rd Qu.:6.184   3rd Qu.:1.2325
 Max.   :156.00                      Max.   :7.769   Max.   :1.6840

 Social support  Healthy life expectancy Freedom to make life choices
 Min.   :0.000   Min.   :0.0000          Min.   :0.0000
 1st Qu.:1.056   1st Qu.:0.5477          1st Qu.:0.3080
 Median :1.272   Median :0.7890          Median :0.4170
 Mean   :1.209   Mean   :0.7252          Mean   :0.3926
 3rd Qu.:1.452   3rd Qu.:0.8818          3rd Qu.:0.5072
 Max.   :1.624   Max.   :1.1410          Max.   :0.6310

   Generosity     Perceptions of corruption
 Min.   :0.0000   Min.   :0.0000
 1st Qu.:0.1087   1st Qu.:0.0470
 Median :0.1775   Median :0.0855
 Mean   :0.1848   Mean   :0.1106
 3rd Qu.:0.2482   3rd Qu.:0.1412
 Max.   :0.5660   Max.   :0.4530

Overall rank	0
Country or region	0
Score	0
GDP per capita	0
Social support	0
Healthy life expectancy	0
Freedom to make life choices	0
Generosity	0
Perceptions of corruption	0
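
The summary reports a minimum of 0 for several factor columns even though no values are `NA`. Whether those zeros are genuine measurements or placeholders cannot be decided from the output alone, but counting them is a cheap extra check. The sketch below uses a small toy data frame with hypothetical values, not the real dataset.

```r
# Toy data frame (hypothetical values) standing in for the numeric columns
toy <- data.frame(
  GDP        = c(1.340, 0.000, 0.960),
  Social     = c(1.587, 1.272, 0.000),
  Generosity = c(0.153, 0.000, 0.177)
)

# Count exact zeros in each column
zero_counts <- colSums(toy == 0)
print(zero_counts)

# Find the rows that contain at least one zero
rows_with_zero <- which(rowSums(toy == 0) > 0)
print(rows_with_zero)
```

On the real table, the same idea could be applied to the numeric columns only, for example with `colSums(happiness[sapply(happiness, is.numeric)] == 0)`.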

Step 5: Simplify Column Names for Easier Use

Some of the column names in the dataset are long or contain spaces, which can make them harder to work with in code. This step renames those columns to shorter, cleaner versions for simplicity and readability. Here’s the code:

# Rename long or complex column names to simpler ones (new name on the left)
# Score and Generosity are already short, so they are left unchanged
happiness <- happiness %>%
  rename(
    Country = `Country or region`,
    GDP = `GDP per capita`,
    Social = `Social support`,
    Life = `Healthy life expectancy`,
    Freedom = `Freedom to make life choices`,
    Corruption = `Perceptions of corruption`
  )

# Confirm the updated column names
colnames(happiness)

The output for this code is:

'Overall rank'
'Country'
'Score'
'GDP'
'Social'
'Life'
'Freedom'
'Generosity'
'Corruption'
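
Note that 'Overall rank' keeps its original name and still contains a space, so it has to be written with backticks in code. If that becomes inconvenient, it can be renamed in the same way. The sketch below shows a lookup-vector variant of `rename()` on a toy tibble; the new name `Rank` is an assumption made for this example and does not appear in the original notebook.

```r
library(dplyr)

# Toy tibble with the original names of the first three columns
toy <- tibble(
  `Overall rank` = c(1, 2),
  `Country or region` = c("Finland", "Denmark"),
  Score = c(7.769, 7.600)
)

# Named vector: new name on the left, existing name on the right
lookup <- c(Rank = "Overall rank", Country = "Country or region")

# all_of() raises an error if any of the old names is missing
toy <- toy %>% rename(all_of(lookup))
print(colnames(toy))
```

Keeping the mapping in one named vector makes it easy to reuse the same renaming on another year's file, provided the old column names match.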

Step 6: Visualize the Top 10 Happiest Countries

Now that the data is clean, let’s highlight the countries with the highest happiness scores. This bar plot visually compares the top 10 happiest nations based on their overall score in the 2019 report. Here’s the code:

# Select and sort the top 10 countries with the highest happiness scores
top10 <- happiness %>% arrange(desc(Score)) %>% head(10)

# Create a horizontal bar chart to display the top 10 happiest countries
ggplot(top10, aes(x = reorder(Country, Score), y = Score, fill = Country)) +
  geom_col() +                             # Draw bars whose length equals the Score
  coord_flip() +                           # Flip the chart to make it horizontal
  labs(title = "Top 10 Happiest Countries in 2019", 
       x = "Country", 
       y = "Happiness Score") +
  theme_minimal()                          # Use a clean, minimal chart theme

The output is a horizontal bar chart of the 10 highest-scoring countries in the 2019 report, with Finland (7.769) at the top.
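
The `arrange(desc(Score)) %>% head(10)` pattern can also be written with dplyr's `slice_max()`. The sketch below is an alternative phrasing rather than the author's code; it uses the six rows shown in the `head()` output from Step 3 and keeps the top three so the result is easy to verify by eye.

```r
library(dplyr)

# Six rows copied from the head() output shown in Step 3
toy <- tibble(
  Country = c("Finland", "Denmark", "Norway", "Iceland", "Netherlands", "Switzerland"),
  Score   = c(7.769, 7.600, 7.554, 7.494, 7.488, 7.480)
)

# Keep the three rows with the highest Score
top3 <- toy %>% slice_max(Score, n = 3)
print(top3)
```

By default `slice_max()` keeps tied rows, so it can return more than `n` rows when scores are equal; passing `with_ties = FALSE` enforces an exact count.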

Step 7: Analyze Correlations Between Happiness Factors

To understand how different factors (like GDP, social support, and life expectancy) relate to happiness, we’ll create a correlation matrix. This step helps us see which variables are positively or negatively associated with the happiness score. Here’s the code:

# Select only the numeric columns needed for correlation analysis
numeric_data <- happiness %>% select(Score, GDP, Social, Life, Freedom, Generosity, Corruption)

# Calculate the correlation matrix between selected numeric columns
cor_matrix <- cor(numeric_data)

# Visualize the correlation matrix using color-coded blocks and numeric values
corrplot(cor_matrix, method = "color",     # Use colored squares to show strength of correlation
         type = "upper",                   # Show only the upper triangle of the matrix
         addCoef.col = "black",            # Add correlation values in black text
         tl.cex = 0.8)                     # Adjust label size

The output gives a correlation plot between various happiness factors:

The plot shows that:

Happiness is strongly correlated with GDP, social support, and life expectancy; countries with higher values in these areas tend to have higher happiness scores.
Freedom and low perceived corruption also show a noticeable positive association with happiness, but not as strong as GDP or social support.
Generosity has a very weak correlation with overall happiness in this dataset.
Darker blue boxes indicate stronger positive correlations, while lighter boxes indicate a weaker or no linear relationship.
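
To read the same information as numbers instead of colors, the row of the correlation matrix that belongs to Score can be extracted and sorted. The sketch below uses only base R and the six rows from the `head()` output in Step 3, with a subset of the columns. Six top-ranked countries are far too few to reproduce the pattern in the full plot, so the printed values are for illustration only.

```r
# Six rows copied from the head() output in Step 3 (illustration only)
toy <- data.frame(
  Score      = c(7.769, 7.600, 7.554, 7.494, 7.488, 7.480),
  GDP        = c(1.340, 1.383, 1.488, 1.380, 1.396, 1.452),
  Social     = c(1.587, 1.573, 1.582, 1.624, 1.522, 1.526),
  Generosity = c(0.153, 0.252, 0.271, 0.354, 0.322, 0.263)
)

# Pearson correlation matrix, the default of cor()
cor_matrix <- cor(toy)

# Take the Score row, drop Score's correlation with itself, sort descending
score_cor <- sort(cor_matrix["Score", colnames(cor_matrix) != "Score"],
                  decreasing = TRUE)
print(round(score_cor, 2))
```

Running the last two statements on the `cor_matrix` computed from all 156 countries gives a ranked list of the factors by their correlation with Score.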

Build This Interesting R Project: Wine Quality Prediction Project in R

Step 8: Visualize the Relationship Between GDP and Happiness Score

Now let’s look into one of the strongest correlations we saw earlier, between GDP per capita and happiness score. In this step, we’ll use a scatter plot with a trend line to show how happiness changes with economic prosperity. Here’s the code for this step:

# Create a scatter plot to visualize the relationship between GDP and happiness score
ggplot(happiness, aes(x = GDP, y = Score)) +
  geom_point(color = "blue") +                # Plot each country as a blue dot
  geom_smooth(method = "lm", se = FALSE, color = "red") +  # Add a red trend line without confidence band
  labs(title = "Happiness Score vs GDP per Capita", 
       x = "GDP per Capita", 
       y = "Happiness Score") +
  theme_minimal()                            # Use a clean chart style

The output is a scatter plot of Happiness Score against GDP per Capita, with a fitted trend line.

The graph shows that:

There is a clear upward trend: as GDP per capita increases, the happiness score also tends to increase.
The red line is the least-squares fit, and its upward slope indicates a positive linear relationship between economic wealth and happiness.
Most points lie close to the trend line, which is consistent with the strong GDP correlation seen in Step 7, although a correlation on its own does not show that GDP causes happiness.
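
The trend line drawn by `geom_smooth(method = "lm")` is an ordinary linear regression, so its slope and fit quality can be read directly from `lm()`. The sketch below uses a small, hypothetical toy data frame rather than the real dataset, so the numbers it prints say nothing about actual countries.

```r
# Hypothetical toy data with a nearly linear relationship
toy <- data.frame(
  GDP   = c(0.2, 0.6, 1.0, 1.4),
  Score = c(3.5, 4.1, 5.0, 5.8)
)

# Fit the same kind of straight line that geom_smooth(method = "lm") draws
fit <- lm(Score ~ GDP, data = toy)

slope     <- unname(coef(fit)["GDP"])   # change in Score per unit of GDP
r_squared <- summary(fit)$r.squared     # share of Score variance explained

print(c(slope = slope, r_squared = r_squared))
```

Replacing `toy` with `happiness` would report how much of the variation in Score the GDP column accounts for, which is a more precise statement than judging the scatter by eye.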

Conclusion

In this World Happiness Report Analysis in R, we explored the 2019 dataset using Google Colab to understand how various factors impact a country's happiness score.

After cleaning and simplifying the data, we visualized top-performing countries and examined correlations between variables such as GDP, social support, life expectancy, and perceptions of corruption.

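Our analysis revealed strong positive relationships between happiness and GDP, social support, and life expectancy. A scatter plot with a fitted trend line showed that happiness scores rise with GDP per capita, making GDP one of the factors most closely associated with happiness in this dataset.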

Colab Link:
https://colab.research.google.com/drive/10WLUd3o68yQuFvViEqL5nYS2Wmv_NiNA#scrollTo=5whGKTOtjcRL

Frequently Asked Questions (FAQs)

1. What is the World Happiness Report Analysis in R all about?

This project explores the 2019 World Happiness dataset using R in Google Colab to understand which factors contribute most to a country's happiness score. It includes data cleaning, visualization, and correlation analysis to uncover patterns in variables like GDP, life expectancy, and social support.

2. Which tools and libraries are required to run this project in R?

You’ll be using Google Colab with the R runtime. Essential R libraries include tidyverse for data manipulation, ggplot2 for plotting, readr for reading CSV files, and corrplot for visualizing correlations between variables.

3. Can this analysis be extended with machine learning algorithms?

Yes, you can build on this project by applying predictive models such as Linear Regression, Random Forest, or K-Nearest Neighbors (KNN) to estimate happiness scores or to classify countries into groups, for example by happiness or development level.
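
As a rough starting point, the sketch below fits a multiple linear regression on a training split and measures the error on held-out rows. It runs on synthetic data generated inside the block, so the column names mirror the renamed dataset but every value, coefficient, and the 80/20 split are assumptions made for illustration.

```r
set.seed(42)  # make the random data and the split reproducible

# Synthetic stand-in for the real table (hypothetical values, not WHR data)
n <- 100
sim <- data.frame(GDP = runif(n, 0, 1.7), Social = runif(n, 0, 1.6))
sim$Score <- 2 + 2 * sim$GDP + 1 * sim$Social + rnorm(n, sd = 0.3)

# Hold out 20% of the rows for evaluation
holdout_idx <- sample(n, size = 0.2 * n)
train   <- sim[-holdout_idx, ]
holdout <- sim[holdout_idx, ]

# Fit the regression on the training rows only
model <- lm(Score ~ GDP + Social, data = train)

# Root mean squared error on rows the model has not seen
pred <- predict(model, newdata = holdout)
rmse <- sqrt(mean((holdout$Score - pred)^2))
print(rmse)
```

With only 156 countries in the real dataset, a single split gives a noisy error estimate, so repeating the split or using cross-validation would be a sensible next step.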

4. What are some beginner-friendly R projects similar to this one?

If you’re looking for more one-dataset projects ideal for R beginners, try these:

5. Is Google Colab suitable for running R projects?

Yes. Google Colab can be configured to run R code with just a few steps. It supports installing packages, importing data, and producing visualizations, making it perfect for R learners and quick project setups.

Rohit Sharma

849 articles published

Rohit Sharma is the Head of Revenue & Programs (International), with over 8 years of experience in business analytics, EdTech, and program management. He holds an M.Tech from IIT Delhi and specializes...
