World Happiness Report Analysis in R With Code
By Rohit Sharma
Updated on Aug 05, 2025 | 11 min read | 1.35K+ views
This project presents a World Happiness Report Analysis in R using the 2019 dataset. We’ve used Google Colab for this project to run the R code.
The analysis looks at the global happiness scores and their relationship with key factors like GDP, social support, life expectancy, and corruption. Through data cleaning, visualization, and correlation analysis, the project identifies trends and patterns that influence national well-being.
Here are the key concepts and basics you should know:
Before starting the analysis, it's important to know the tools and libraries that make this project possible. Each component plays a specific role, making it easier to work with and understand the World Happiness dataset.
Tool / Library | Purpose
Google Colab | Cloud-based platform to run R code without needing local setup
R Language | Programming language used for data analysis and visualization
tidyverse | Collection of R packages for data wrangling and analysis
ggplot2 | Visualization library for creating high-quality charts and graphs
readr | Helps in reading and writing CSV files efficiently
corrplot | Used to generate correlation matrix visualizations
To help you plan better, here's a quick overview of the time commitment, complexity, and who this project is best suited for:
Aspect | Details
Estimated Duration | 1–2 hours
Difficulty Level | Beginner-friendly
Skill Level Needed | Basic understanding of R and data manipulation
The section below will explain the step-by-step process of building this World Happiness Report Analysis, along with the code and output for each step.
Google Colab runs Python by default, so a quick setup is needed before writing any R code. The good news is, it only takes a few clicks to switch the environment.
To enable R in your notebook, open the Runtime menu, select "Change runtime type", choose R as the runtime type, and save.
Before we can begin analyzing the data, we need to make sure the right tools are in place. R uses packages (similar to plugins) for various functions, like reading files, visualizing data, or creating charts. In this step, you'll install and load all the packages necessary for the project. The code is given below:
# Install the required packages (this needs to be done only once)
install.packages("tidyverse") # Includes packages for data manipulation and visualization
install.packages("ggplot2") # Used to create high-quality visualizations
install.packages("readr") # Helps in reading CSV and other flat files
install.packages("corrplot") # For creating correlation matrix plots
# Load the libraries into your R session
library(tidyverse) # Loads dplyr, ggplot2, readr, etc.
library(ggplot2) # For plotting data
library(readr) # For reading CSV files
library(corrplot) # For plotting correlations
The output for the above code confirms the installation and loading of the libraries and packages:
Installing package into ‘/usr/local/lib/R/site-library’
Installing package into ‘/usr/local/lib/R/site-library’
Installing package into ‘/usr/local/lib/R/site-library’
Installing package into ‘/usr/local/lib/R/site-library’
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
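Re-running install.packages() in every session reinstalls packages that may already be present. As a minimal sketch in base R, the check below reports which packages are still missing. The helper name `missing_packages` is hypothetical and not part of the original project.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  # requireNamespace() returns TRUE when the package can be loaded
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# tidyverse already bundles ggplot2 and readr, so two names cover the project
to_install <- missing_packages(c("tidyverse", "corrplot"))

# An empty result means nothing needs installing
print(to_install)
```

If the printed vector is not empty, passing it to install.packages() installs only what is missing.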
With the packages ready, it's time to bring in the dataset. In this step, you'll read the CSV file into R so you can start exploring and analyzing the data. We'll also take a quick look at the contents and structure. Here’s the code:
# Read the CSV file and store it in a variable called 'happiness'
happiness <- read_csv("Happiness Report - 2019.csv")
# Display the first few rows of the dataset to understand how it looks
head(happiness)
# Show the names of all columns in the dataset
colnames(happiness)
The output of this code gives us an overview of the dataset:
Rows: 156 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Country or region
dbl (8): Overall rank, Score, GDP per capita, Social support, Healthy life e...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Overall rank | Country or region | Score | GDP per capita | Social support | Healthy life expectancy | Freedom to make life choices | Generosity | Perceptions of corruption
<dbl> | <chr> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl>
1 | Finland | 7.769 | 1.340 | 1.587 | 0.986 | 0.596 | 0.153 | 0.393
2 | Denmark | 7.600 | 1.383 | 1.573 | 0.996 | 0.592 | 0.252 | 0.410
3 | Norway | 7.554 | 1.488 | 1.582 | 1.028 | 0.603 | 0.271 | 0.341
4 | Iceland | 7.494 | 1.380 | 1.624 | 1.026 | 0.591 | 0.354 | 0.118
5 | Netherlands | 7.488 | 1.396 | 1.522 | 0.999 | 0.557 | 0.322 | 0.298
6 | Switzerland | 7.480 | 1.452 | 1.526 | 1.052 | 0.572 | 0.263 | 0.343
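Several headers in this file contain spaces, which affects how columns are referenced before any renaming. The sketch below is an illustration only: it reads three of the rows shown above from an inline string with base R's read.csv() instead of read_csv(), so it runs without the CSV file. The variable names `csv_text` and `sample_rows` are assumptions made for this example.

```r
# Three rows and four columns copied from the head() output above
csv_text <- "Overall rank,Country or region,Score,GDP per capita
1,Finland,7.769,1.340
2,Denmark,7.600,1.383
3,Norway,7.554,1.488"

# check.names = FALSE keeps the original headers, spaces included,
# which matches how read_csv() treats them
sample_rows <- read.csv(text = csv_text, check.names = FALSE)

# Names with spaces must be wrapped in backticks when used with `$`
print(sample_rows$`Country or region`)
print(names(sample_rows))
```

Without check.names = FALSE, base R would rewrite the headers to syntactic names such as Country.or.region.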
Before diving into analysis, it’s important to understand what the dataset contains. This step helps you examine the structure, check basic statistics, and identify any missing values that could affect the results. Here’s the code for this step:
# Check the structure of the dataset: data types and format of each column
str(happiness)
# Get summary statistics: min, max, mean, median for each numeric column
summary(happiness)
# Check for missing values in each column
colSums(is.na(happiness))
The output for this code is:
spc_tbl_ [156 × 9] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ Overall rank                : num [1:156] 1 2 3 4 5 6 7 8 9 10 ...
 $ Country or region           : chr [1:156] "Finland" "Denmark" "Norway" "Iceland" ...
 $ Score                       : num [1:156] 7.77 7.6 7.55 7.49 7.49 ...
 $ GDP per capita              : num [1:156] 1.34 1.38 1.49 1.38 1.4 ...
 $ Social support              : num [1:156] 1.59 1.57 1.58 1.62 1.52 ...
 $ Healthy life expectancy     : num [1:156] 0.986 0.996 1.028 1.026 0.999 ...
 $ Freedom to make life choices: num [1:156] 0.596 0.592 0.603 0.591 0.557 0.572 0.574 0.585 0.584 0.532 ...
 $ Generosity                  : num [1:156] 0.153 0.252 0.271 0.354 0.322 0.263 0.267 0.33 0.285 0.244 ...
 $ Perceptions of corruption   : num [1:156] 0.393 0.41 0.341 0.118 0.298 0.343 0.373 0.38 0.308 0.226 ...
 - attr(*, "spec")=
  .. cols(
  ..   `Overall rank` = col_double(),
  ..   `Country or region` = col_character(),
  ..   Score = col_double(),
  ..   `GDP per capita` = col_double(),
  ..   `Social support` = col_double(),
  ..   `Healthy life expectancy` = col_double(),
  ..   `Freedom to make life choices` = col_double(),
  ..   Generosity = col_double(),
  ..   `Perceptions of corruption` = col_double()
  .. )
 - attr(*, "problems")=<externalptr>

  Overall rank    Country or region      Score       GDP per capita
 Min.   :  1.00   Length:156         Min.   :2.853   Min.   :0.0000
 1st Qu.: 39.75   Class :character   1st Qu.:4.545   1st Qu.:0.6028
 Median : 78.50   Mode  :character   Median :5.380   Median :0.9600
 Mean   : 78.50                      Mean   :5.407   Mean   :0.9051
 3rd Qu.:117.25                      3rd Qu.:6.184   3rd Qu.:1.2325
 Max.   :156.00                      Max.   :7.769   Max.   :1.6840

 Social support   Healthy life expectancy   Freedom to make life choices
 Min.   :0.000    Min.   :0.0000            Min.   :0.0000
 1st Qu.:1.056    1st Qu.:0.5477            1st Qu.:0.3080
 Median :1.272    Median :0.7890            Median :0.4170
 Mean   :1.209    Mean   :0.7252            Mean   :0.3926
 3rd Qu.:1.452    3rd Qu.:0.8818            3rd Qu.:0.5072
 Max.   :1.624    Max.   :1.1410            Max.   :0.6310

   Generosity      Perceptions of corruption
 Min.   :0.0000    Min.   :0.0000
 1st Qu.:0.1087    1st Qu.:0.0470
 Median :0.1775    Median :0.0855
 Mean   :0.1848    Mean   :0.1106
 3rd Qu.:0.2482    3rd Qu.:0.1412
 Max.   :0.5660    Max.   :0.4530

                Overall rank            Country or region
                           0                            0
                       Score               GDP per capita
                           0                            0
              Social support      Healthy life expectancy
                           0                            0
Freedom to make life choices                   Generosity
                           0                            0
   Perceptions of corruption
                           0
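The summary reports no missing values, but several columns have a minimum of exactly 0, so it can be worth counting zeros alongside NAs. The sketch below shows one way to do that in base R. The `demo` data frame is made up for illustration and is not taken from the 2019 dataset.

```r
# Made-up rows for illustration only; not values from the report
demo <- data.frame(
  Country = c("A", "B", "C"),
  GDP = c(1.2, 0, 0.8),
  Generosity = c(0.1, NA, 0)
)

# Keep only the numeric columns before counting
numeric_cols <- demo[sapply(demo, is.numeric)]

# Missing values per column, as in colSums(is.na(happiness))
na_counts <- colSums(is.na(numeric_cols))

# Exact zeros per column; na.rm = TRUE skips the missing entries
zero_counts <- colSums(numeric_cols == 0, na.rm = TRUE)

print(na_counts)
print(zero_counts)
```

Applied to the renamed or original `happiness` data, the same two colSums() calls would show which countries' factor values are recorded as zero.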
Some of the column names in the dataset are long or contain spaces, which can make them harder to work with in code. This step renames those columns to shorter, cleaner versions for simplicity and readability. Here’s the code:
# Rename long or complex column names to simpler, more usable ones
happiness <- happiness %>%
  rename(
    Country = `Country or region`,
    Score = `Score`,
    GDP = `GDP per capita`,
    Social = `Social support`,
    Life = `Healthy life expectancy`,
    Freedom = `Freedom to make life choices`,
    Generosity = `Generosity`,
    Corruption = `Perceptions of corruption`
  )
# Confirm the updated column names
colnames(happiness)
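The output lists the updated column names: Overall rank, Country, Score, GDP, Social, Life, Freedom, Generosity, and Corruption. Overall rank is not part of the rename() call, so it keeps its original name.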
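The rename() call pairs each new name with an old one. The same mapping can be sketched in base R with a named lookup vector, which may help when the list of names lives outside the code. The names `demo` and `new_names` are hypothetical, and this is an alternative illustration rather than the method used in the project.

```r
# Two columns with the dataset's original headers; check.names = FALSE keeps the spaces
demo <- data.frame(
  `Country or region` = c("Finland", "Denmark"),
  `GDP per capita` = c(1.340, 1.383),
  check.names = FALSE
)

# Lookup vector: old name on the left, new name on the right
new_names <- c("Country or region" = "Country", "GDP per capita" = "GDP")

# Replace only the headers that appear in the lookup
hits <- names(demo) %in% names(new_names)
names(demo)[hits] <- unname(new_names[names(demo)[hits]])

print(names(demo))
```

Columns that are absent from the lookup keep their names, which mirrors how rename() leaves unlisted columns untouched.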
Now that the data is clean, let’s highlight the countries with the highest happiness scores. This bar plot visually compares the top 10 happiest nations based on their overall score in the 2019 report. Here’s the code:
# Select and sort the top 10 countries with the highest happiness scores
top10 <- happiness %>% arrange(desc(Score)) %>% head(10)
# Create a horizontal bar chart to display the top 10 happiest countries
ggplot(top10, aes(x = reorder(Country, Score), y = Score, fill = Country)) +
  geom_bar(stat = "identity") + # Draw bars based on the Score
  coord_flip() + # Flip the chart to make it horizontal
  labs(title = "Top 10 Happiest Countries in 2019",
       x = "Country",
       y = "Happiness Score") +
  theme_minimal() # Use a clean, minimal chart theme
The output for the above code gives us a graph of the top 10 happiest countries in 2019:
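The chart depends on the sort-then-head step picking the right rows. As a small check of that logic, the sketch below applies the same idea in base R to the six scores shown earlier, deliberately shuffled. The data frame name `scores` is an assumption for this example.

```r
# The six countries from the head() output, listed out of order on purpose
scores <- data.frame(
  Country = c("Netherlands", "Finland", "Switzerland", "Denmark", "Iceland", "Norway"),
  Score = c(7.488, 7.769, 7.480, 7.600, 7.494, 7.554)
)

# Sort by Score in descending order, then keep the first three rows
top3 <- head(scores[order(-scores$Score), ], 3)

print(top3$Country)  # "Finland" "Denmark" "Norway"
```

In the plotting code, geom_col() is ggplot2's shorthand for geom_bar(stat = "identity") and would draw the same bars.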
To understand how different factors (like GDP, social support, and life expectancy) relate to happiness, we’ll create a correlation matrix. This step helps us see which variables are positively or negatively associated with the happiness score. Here’s the code:
# Select only the numeric columns needed for correlation analysis
numeric_data <- happiness %>% select(Score, GDP, Social, Life, Freedom, Generosity, Corruption)
# Calculate the correlation matrix between selected numeric columns
cor_matrix <- cor(numeric_data)
# Visualize the correlation matrix using color-coded blocks and numeric values
corrplot(cor_matrix, method = "color", # Use colored squares to show strength of correlation
         type = "upper", # Show only the upper triangle of the matrix
         addCoef.col = "black", # Add correlation values in black text
         tl.cex = 0.8) # Adjust label size
The output gives a correlation plot between various happiness factors:
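The plot shows that the happiness score is strongly and positively correlated with GDP per capita, social support, and healthy life expectancy.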
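To read the relationships with Score as a ranked list of numbers rather than colors, one row of the correlation matrix can be pulled out and sorted. The sketch below shows the mechanics on the six rows printed earlier. Six top-ranked countries are far too few to draw conclusions from, so the values will differ from the full 156-country matrix. The name `top6` is an assumption for this example.

```r
# Four columns from the six rows shown in the head() output
top6 <- data.frame(
  Score = c(7.769, 7.600, 7.554, 7.494, 7.488, 7.480),
  GDP = c(1.340, 1.383, 1.488, 1.380, 1.396, 1.452),
  Social = c(1.587, 1.573, 1.582, 1.624, 1.522, 1.526),
  Life = c(0.986, 0.996, 1.028, 1.026, 0.999, 1.052)
)

# Pearson correlation matrix, the default for cor()
cor_matrix <- cor(top6)

# Take the Score row, drop Score itself, and sort from strongest positive down
score_cor <- sort(cor_matrix["Score", colnames(cor_matrix) != "Score"],
                  decreasing = TRUE)

print(round(score_cor, 2))
```

Running the same two steps on `numeric_data` from the project would rank every factor by its correlation with the happiness score.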
Now let’s look into one of the strongest correlations we saw earlier, between GDP per capita and happiness score. In this step, we’ll use a scatter plot with a trend line to show how happiness changes with economic prosperity. Here’s the code for this step:
# Create a scatter plot to visualize the relationship between GDP and happiness score
ggplot(happiness, aes(x = GDP, y = Score)) +
  geom_point(color = "blue") + # Plot each country as a blue dot
  geom_smooth(method = "lm", se = FALSE, color = "red") + # Add a red trend line without confidence band
  labs(title = "Happiness Score vs GDP per Capita",
       x = "GDP per Capita",
       y = "Happiness Score") +
  theme_minimal() # Use a clean chart style
The output for the above step is a scatter plot of Happiness Score vs GDP per Capita.
The graph shows that happiness scores tend to rise with GDP per capita: the fitted trend line slopes upward.
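The red line drawn by geom_smooth(method = "lm") is an ordinary least-squares fit, so its slope and fit quality can also be read off directly with lm(). The sketch below uses made-up points, not values from the report, purely to show the calls. The names `demo`, `fit`, `slope`, and `r_squared` are assumptions for this example.

```r
# Made-up points with an upward trend; not taken from the 2019 dataset
demo <- data.frame(
  GDP = c(0.3, 0.6, 0.9, 1.2, 1.5),
  Score = c(3.9, 4.6, 5.2, 6.1, 6.8)
)

# Fit Score as a linear function of GDP, the same model geom_smooth() draws
fit <- lm(Score ~ GDP, data = demo)

# Slope: expected change in Score per one-unit increase in GDP
slope <- coef(fit)[["GDP"]]

# R-squared: share of the variance in Score explained by the line
r_squared <- summary(fit)$r.squared

print(slope)
print(r_squared)
```

For a single predictor, R-squared equals the squared Pearson correlation, which ties this fit back to the correlation matrix from the previous step.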
In this World Happiness Report Analysis in R, we explored the 2019 dataset using Google Colab to understand how various factors impact a country's happiness score.
After cleaning and simplifying the data, we visualized top-performing countries and examined correlations between variables such as GDP, social support, life expectancy, and perceptions of corruption.
Our analysis revealed strong positive relationships between happiness and GDP, social support, and life expectancy. A scatter plot with a fitted trend line showed happiness scores rising with GDP per capita, one of the factors most strongly associated with happiness.
Colab Link:
https://colab.research.google.com/drive/10WLUd3o68yQuFvViEqL5nYS2Wmv_NiNA#scrollTo=5whGKTOtjcRL
Rohit Sharma is the Head of Revenue & Programs (International), with over 8 years of experience in business analytics, EdTech, and program management. He holds an M.Tech from IIT Delhi and specializes...