World Happiness Report Analysis in R With Code

By Rohit Sharma

Updated on Aug 05, 2025 | 11 min read | 1.35K+ views


This project presents a World Happiness Report Analysis in R using the 2019 dataset. We’ve used Google Colab for this project to run the R code. 

The analysis looks at the global happiness scores and their relationship with key factors like GDP, social support, life expectancy, and corruption. Through data cleaning, visualization, and correlation analysis, the project identifies trends and patterns that influence national well-being. 


What You Must Understand Before Starting the World Happiness Report Analysis in R

Here are the key concepts and basics you should know:

  • Basic R Syntax: Understand how to write and run R code, especially in Google Colab.
  • Working with Data Frames: Learn how to load, view, and manipulate tabular data.
  • Data Cleaning: Know how to check for and handle missing or inconsistent data.
  • Exploratory Data Analysis (EDA): Be familiar with summarizing and visualizing data to uncover insights.
  • Correlation Analysis: Understand how different variables relate to each other statistically.
  • Plotting with ggplot2: Learn how to create clear and informative visualizations using this powerful R package.


The Tools and R Libraries Used In This Analysis

Before starting the analysis, it's important to know the tools and libraries that make this project possible. Each component plays a specific role, making it easier to work with and understand the World Happiness dataset.

Tool / Library | Purpose
--- | ---
Google Colab | Cloud-based platform to run R code without needing local setup
R Language | Programming language used for data analysis and visualization
tidyverse | Collection of R packages for data wrangling and analysis
ggplot2 | Visualization library for creating high-quality charts and graphs
readr | Helps in reading and writing CSV files efficiently
corrplot | Used to generate correlation matrix visualizations

How Long Does It Take and How Hard Is This Project

To help you plan better, here's a quick overview of the time commitment, complexity, and who this project is best suited for:

Aspect | Details
--- | ---
Estimated Duration | 1–2 hours
Difficulty Level | Beginner-friendly
Skill Level Needed | Basic understanding of R and data manipulation

A Breakdown of the World Happiness Report Analysis Project Using R

The section below will explain the step-by-step process of building this World Happiness Report Analysis, along with the code and output for each step.

Step 1: Getting R Ready in Google Colab

Google Colab runs Python by default, so a quick setup is needed before writing any R code. The good news is, it only takes a few clicks to switch the environment.

To enable R in your notebook:

  • Launch a new notebook at Google Colab
  • Navigate to the top menu and select Runtime
  • Choose Change runtime type
  • In the dialog that appears, switch the language to R
  • Click Save, and you're all set to start coding in R
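Once the runtime is switched, a quick sanity check can confirm that the notebook is executing R rather than Python. The snippet below is a minimal sketch using only base R; it is not part of the original notebook, and the variable name `runtime_info` is just an illustration.

```r
# Store and print the R version string. If this cell runs without a
# syntax error, the Colab runtime is using R rather than Python.
runtime_info <- R.version.string
print(runtime_info)

# R.version is a built-in list describing the running interpreter.
print(R.version$language)
```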


Step 2: Install and Load the Required R Packages

Before we can begin analyzing the data, we need to make sure the right tools are in place. R uses packages (similar to plugins) for various functions, like reading files, visualizing data, or creating charts. In this step, you'll install and load all the packages necessary for the project. The code is given below:

# Install the required packages (this needs to be done only once)
install.packages("tidyverse")   # Includes packages for data manipulation and visualization
install.packages("ggplot2")     # Used to create high-quality visualizations (already part of tidyverse)
install.packages("readr")       # Helps in reading CSV and other flat files (already part of tidyverse)
install.packages("corrplot")    # For creating correlation matrix plots

# Load the libraries into your R session
library(tidyverse)   # Loads dplyr, ggplot2, readr, etc.
library(ggplot2)     # For plotting data (already attached by tidyverse; harmless to repeat)
library(readr)       # For reading CSV files (already attached by tidyverse; harmless to repeat)
library(corrplot)    # For plotting correlations

The output for the above code confirms the installation and loading of the libraries and packages:

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr    1.1.4     ✔ readr    2.1.5
✔ forcats  1.0.0     ✔ stringr  1.5.1
✔ ggplot2  3.5.2     ✔ tibble   3.3.0
✔ lubridate 1.9.4     ✔ tidyr    1.3.1
✔ purrr    1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
corrplot 0.95 loaded
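Because reinstalling packages on every run is slow, one option is to install only what is missing. The following is a rough sketch of that idea, not the approach used in the original notebook; `missing_packages()` and `install_if_missing()` are hypothetical helper names introduced here for illustration.

```r
# Hypothetical helper: return the packages from `pkgs` that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install only the packages that are missing.
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Usage (commented out so that running this block installs nothing):
# install_if_missing(c("tidyverse", "corrplot"))
```

Since tidyverse already bundles ggplot2 and readr, passing just `"tidyverse"` and `"corrplot"` should cover everything this project loads.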

Step 3: Load the World Happiness Dataset into R

With the packages ready, it's time to bring in the dataset. In this step, you'll read the CSV file into R so you can start exploring and analyzing the data. We'll also take a quick look at the contents and structure. Here’s the code:

# Read the CSV file and store it in a variable called 'happiness'
happiness <- read_csv("Happiness Report - 2019.csv")

# Display the first few rows of the dataset to understand how it looks
head(happiness)

# Show the names of all columns in the dataset
colnames(happiness)

The output of this code gives us an overview of the dataset:

Rows: 156 Columns: 9
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Country or region
dbl (8): Overall rank, Score, GDP per capita, Social support, Healthy life e...

Use `spec()` to retrieve the full column specification for this data.
Specify the column types or set `show_col_types = FALSE` to quiet this message.

Overall rank | Country or region | Score | GDP per capita | Social support | Healthy life expectancy | Freedom to make life choices | Generosity | Perceptions of corruption
--- | --- | --- | --- | --- | --- | --- | --- | ---
<dbl> | <chr> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl>
1 | Finland | 7.769 | 1.340 | 1.587 | 0.986 | 0.596 | 0.153 | 0.393
2 | Denmark | 7.600 | 1.383 | 1.573 | 0.996 | 0.592 | 0.252 | 0.410
3 | Norway | 7.554 | 1.488 | 1.582 | 1.028 | 0.603 | 0.271 | 0.341
4 | Iceland | 7.494 | 1.380 | 1.624 | 1.026 | 0.591 | 0.354 | 0.118
5 | Netherlands | 7.488 | 1.396 | 1.522 | 0.999 | 0.557 | 0.322 | 0.298
6 | Switzerland | 7.480 | 1.452 | 1.526 | 1.052 | 0.572 | 0.263 | 0.343

  • 'Overall rank'
  • 'Country or region'
  • 'Score'
  • 'GDP per capita'
  • 'Social support'
  • 'Healthy life expectancy'
  • 'Freedom to make life choices'
  • 'Generosity'
  • 'Perceptions of corruption'
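The column names above contain spaces, which is why later steps wrap them in backticks. As a side note, base R's `read.csv()` handles such headers differently from `read_csv()`. The sketch below illustrates the difference on two rows copied from the output above, inlined as text so it runs without the CSV file; the variable names are illustrative only.

```r
# Two rows copied from the head() output above, inlined so the sketch
# runs without the CSV file.
csv_text <- 'Overall rank,Country or region,Score,GDP per capita
1,Finland,7.769,1.340
2,Denmark,7.600,1.383'

# Default behaviour: base R rewrites headers into syntactically valid names.
mangled <- read.csv(text = csv_text)
names(mangled)

# check.names = FALSE keeps the original headers, spaces included.
kept <- read.csv(text = csv_text, check.names = FALSE)
names(kept)

# Names containing spaces must be wrapped in backticks when used with $.
kept$`GDP per capita`
```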

Step 4: Explore the Structure and Quality of the Data

Before diving into analysis, it’s important to understand what the dataset contains. This step helps you examine the structure, check basic statistics, and identify any missing values that could affect the results. Here’s the code for this step:

# Check the structure of the dataset: data types and format of each column
str(happiness)

# Get summary statistics: min, quartiles, median, mean, and max for each numeric column
summary(happiness)

# Check for missing values in each column
colSums(is.na(happiness))

The output for this code is:

spc_tbl_ [156 × 9] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ Overall rank                : num [1:156] 1 2 3 4 5 6 7 8 9 10 ...
 $ Country or region           : chr [1:156] "Finland" "Denmark" "Norway" "Iceland" ...
 $ Score                       : num [1:156] 7.77 7.6 7.55 7.49 7.49 ...
 $ GDP per capita              : num [1:156] 1.34 1.38 1.49 1.38 1.4 ...
 $ Social support              : num [1:156] 1.59 1.57 1.58 1.62 1.52 ...
 $ Healthy life expectancy     : num [1:156] 0.986 0.996 1.028 1.026 0.999 ...
 $ Freedom to make life choices: num [1:156] 0.596 0.592 0.603 0.591 0.557 0.572 0.574 0.585 0.584 0.532 ...
 $ Generosity                  : num [1:156] 0.153 0.252 0.271 0.354 0.322 0.263 0.267 0.33 0.285 0.244 ...
 $ Perceptions of corruption   : num [1:156] 0.393 0.41 0.341 0.118 0.298 0.343 0.373 0.38 0.308 0.226 ...
 - attr(*, "spec")=
  .. cols(
  ..   `Overall rank` = col_double(),
  ..   `Country or region` = col_character(),
  ..   Score = col_double(),
  ..   `GDP per capita` = col_double(),
  ..   `Social support` = col_double(),
  ..   `Healthy life expectancy` = col_double(),
  ..   `Freedom to make life choices` = col_double(),
  ..   Generosity = col_double(),
  ..   `Perceptions of corruption` = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 

  Overall rank    Country or region      Score       GDP per capita  
 Min.   :  1.00   Length:156         Min.   :2.853   Min.   :0.0000  
 1st Qu.: 39.75   Class :character   1st Qu.:4.545   1st Qu.:0.6028  
 Median : 78.50   Mode  :character   Median :5.380   Median :0.9600  
 Mean   : 78.50                      Mean   :5.407   Mean   :0.9051  
 3rd Qu.:117.25                      3rd Qu.:6.184   3rd Qu.:1.2325  
 Max.   :156.00                      Max.   :7.769   Max.   :1.6840  

 Social support  Healthy life expectancy Freedom to make life choices
 Min.   :0.000   Min.   :0.0000          Min.   :0.0000              
 1st Qu.:1.056   1st Qu.:0.5477          1st Qu.:0.3080              
 Median :1.272   Median :0.7890          Median :0.4170              
 Mean   :1.209   Mean   :0.7252          Mean   :0.3926              
 3rd Qu.:1.452   3rd Qu.:0.8818          3rd Qu.:0.5072              
 Max.   :1.624   Max.   :1.1410          Max.   :0.6310              

   Generosity     Perceptions of corruption
 Min.   :0.0000   Min.   :0.0000           
 1st Qu.:0.1087   1st Qu.:0.0470           
 Median :0.1775   Median :0.0855           
 Mean   :0.1848   Mean   :0.1106           
 3rd Qu.:0.2482   3rd Qu.:0.1412           
 Max.   :0.5660   Max.   :0.4530           

Overall rank                  0
Country or region             0
Score                         0
GDP per capita                0
Social support                0
Healthy life expectancy       0
Freedom to make life choices  0
Generosity                    0
Perceptions of corruption     0
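The 2019 file has no missing values, but the summary does show a minimum of exactly 0 in several factor columns, which may be worth a second look. The sketch below shows one possible way to count both NAs and exact zeros, and to drop incomplete rows when they do occur. The `toy` data frame uses invented values, not rows from the dataset, so that each check has something to find.

```r
# Toy data frame (invented values, NOT from the 2019 file) with one NA
# and one zero so both checks have something to find.
toy <- data.frame(
  Country = c("A", "B", "C", "D"),
  Score   = c(7.1, 5.4, NA, 3.2),
  GDP     = c(1.30, 0.90, 0.75, 0.00)
)

# Count missing values per column (same call as in the step above).
na_counts <- colSums(is.na(toy))

# Count exact zeros per numeric column; NA entries are ignored.
numeric_cols <- vapply(toy, is.numeric, logical(1))
zero_counts <- colSums(toy[numeric_cols] == 0, na.rm = TRUE)

# Keep only rows with no missing values in any column.
complete_rows <- toy[complete.cases(toy), ]

na_counts
zero_counts
nrow(complete_rows)
```

Whether a zero is a genuine value or a placeholder depends on how the source data was compiled, so this check only flags candidates for review.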


Step 5: Simplify Column Names for Easier Use

Some of the column names in the dataset are long or contain spaces, which can make them harder to work with in code. This step renames those columns to shorter, cleaner versions for simplicity and readability. Here’s the code:

# Rename long or space-containing column names to shorter ones.
# Score and Generosity are already short, so they need no entry here;
# `Overall rank` is left unchanged.
happiness <- happiness %>%
  rename(
    Country = `Country or region`,
    GDP = `GDP per capita`,
    Social = `Social support`,
    Life = `Healthy life expectancy`,
    Freedom = `Freedom to make life choices`,
    Corruption = `Perceptions of corruption`
  )

# Confirm the updated column names
colnames(happiness)

The output for this code is:

  • 'Overall rank'
  • 'Country'
  • 'Score'
  • 'GDP'
  • 'Social'
  • 'Life'
  • 'Freedom'
  • 'Generosity'
  • 'Corruption'
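The same renaming can be driven by a lookup vector, which is convenient when the mapping is long or maintained separately from the code. This is a base R sketch on a two-row stand-in data frame, not the method used above; `df` and `lookup` are illustrative names.

```r
# Small stand-in data frame with the original long column names
# (check.names = FALSE keeps the spaces).
df <- data.frame(
  `Country or region` = c("Finland", "Denmark"),
  `GDP per capita`    = c(1.340, 1.383),
  Score               = c(7.769, 7.600),
  check.names = FALSE
)

# Lookup vector: names are the old headers, values are the new ones.
lookup <- c("Country or region" = "Country", "GDP per capita" = "GDP")

# Replace only the headers found in the lookup; leave the rest untouched.
hits <- names(df) %in% names(lookup)
names(df)[hits] <- unname(lookup[names(df)[hits]])

names(df)
```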

Step 6: Visualize the Top 10 Happiest Countries

Now that the data is clean, let’s highlight the countries with the highest happiness scores. This bar plot visually compares the top 10 happiest nations based on their overall score in the 2019 report. Here’s the code:

# Select and sort the top 10 countries with the highest happiness scores
top10 <- happiness %>% arrange(desc(Score)) %>% head(10)

# Create a horizontal bar chart to display the top 10 happiest countries
ggplot(top10, aes(x = reorder(Country, Score), y = Score, fill = Country)) +
  geom_col() +                             # Draw bars whose lengths are the Score values
  coord_flip() +                           # Flip the chart to make it horizontal
  labs(title = "Top 10 Happiest Countries in 2019", 
       x = "Country", 
       y = "Happiness Score") +
  theme_minimal()                          # Use a clean, minimal chart theme

The output for the above code gives us a graph of the top 10 happiest countries in 2019:
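The selection step can also be checked on its own, without plotting. The sketch below uses base R on the six rows shown in the earlier `head()` output, deliberately shuffled so the sort has work to do. It keeps the top three rather than ten, simply because only six rows are reproduced in this article.

```r
# The six rows shown in the head() output, deliberately shuffled.
scores <- data.frame(
  Country = c("Iceland", "Finland", "Switzerland", "Norway", "Denmark", "Netherlands"),
  Score   = c(7.494, 7.769, 7.480, 7.554, 7.600, 7.488),
  stringsAsFactors = FALSE
)

# Sort by Score, highest first, then keep the first three rows.
top3 <- head(scores[order(scores$Score, decreasing = TRUE), ], 3)
top3$Country
```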

Step 7: Analyze Correlations Between Happiness Factors

To understand how different factors (like GDP, social support, and life expectancy) relate to happiness, we’ll create a correlation matrix. This step helps us see which variables are positively or negatively associated with the happiness score. Here’s the code:

# Select only the numeric columns needed for correlation analysis
numeric_data <- happiness %>% select(Score, GDP, Social, Life, Freedom, Generosity, Corruption)

# Calculate the correlation matrix between selected numeric columns
cor_matrix <- cor(numeric_data)

# Visualize the correlation matrix using color-coded blocks and numeric values
corrplot(cor_matrix, method = "color",     # Use colored squares to show strength of correlation
         type = "upper",                   # Show only the upper triangle of the matrix
         addCoef.col = "black",            # Add correlation values in black text
         tl.cex = 0.8)                     # Adjust label size

The output gives a correlation plot between various happiness factors:

The plot shows that:

  • Happiness score is strongly and positively correlated with GDP, social support, and life expectancy; countries with higher values in these areas tend to have higher happiness scores.
  • Freedom and the Corruption column (perceptions of corruption) are also positively correlated with happiness, though less strongly than GDP or social support.
  • Generosity has only a very weak correlation with overall happiness in this dataset.
  • Darker blue boxes mark stronger positive correlations and lighter boxes mark weaker or near-zero ones; in corrplot's default palette, red shades would mark negative correlations.
  • These are correlations, so they describe association between variables, not cause and effect.
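Reading values off a colored grid can be error-prone, so it may help to pull the Score row out of the matrix and rank the factors numerically. The sketch below does this in base R on the six rows from the earlier `head()` output. Six rows are far too few to draw conclusions from, so the numbers it prints are not meaningful; substituting the full `cor_matrix` from the step above would give the real ranking.

```r
# Six rows from the head() output; far too few for real conclusions,
# used here only so the sketch runs without the CSV file.
sample_rows <- data.frame(
  Score      = c(7.769, 7.600, 7.554, 7.494, 7.488, 7.480),
  GDP        = c(1.340, 1.383, 1.488, 1.380, 1.396, 1.452),
  Social     = c(1.587, 1.573, 1.582, 1.624, 1.522, 1.526),
  Generosity = c(0.153, 0.252, 0.271, 0.354, 0.322, 0.263)
)

# Pearson correlation matrix (the default method of cor()).
cm <- cor(sample_rows)

# Take the Score row, drop Score's correlation with itself, and sort
# the remaining factors from strongest to weakest by absolute value.
score_cor <- cm["Score", colnames(cm) != "Score"]
ranked <- score_cor[order(abs(score_cor), decreasing = TRUE)]
round(ranked, 2)
```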


Step 8: Visualize the Relationship Between GDP and Happiness Score

Now let’s look into one of the strongest correlations we saw earlier, between GDP per capita and happiness score. In this step, we’ll use a scatter plot with a trend line to show how happiness changes with economic prosperity. Here’s the code for this step:

# Create a scatter plot to visualize the relationship between GDP and happiness score
ggplot(happiness, aes(x = GDP, y = Score)) +
  geom_point(color = "blue") +                # Plot each country as a blue dot
  geom_smooth(method = "lm", se = FALSE, color = "red") +  # Add a red trend line without confidence band
  labs(title = "Happiness Score vs GDP per Capita", 
       x = "GDP per Capita", 
       y = "Happiness Score") +
  theme_minimal()                            # Use a clean chart style

The output of this step is a scatter plot of Happiness Score vs GDP per Capita with a fitted trend line.

The above graph shows that:

  • There is a clear upward trend: as GDP per capita increases, the happiness score also tends to increase.
  • The red line is a least-squares linear fit (method = "lm"); its upward slope reflects the positive association between economic wealth and happiness.
  • Most points cluster around the trend line, which suggests GDP per capita is a useful predictor of happiness in this dataset, although the plot alone does not show that wealth causes happiness.
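The trend line drawn by `geom_smooth(method = "lm")` corresponds to an ordinary linear regression, which can also be fitted directly to get the slope and R-squared as numbers. The sketch below is an illustration on the six rows from the earlier `head()` output, not a result from the full analysis. Those rows are all top-ranked countries, so the coefficients it prints say nothing about the global pattern; passing the full `happiness` data frame to `lm()` would.

```r
# Six rows from the head() output, used only to keep the sketch
# runnable without the CSV file.
gdp_score <- data.frame(
  GDP   = c(1.340, 1.383, 1.488, 1.380, 1.396, 1.452),
  Score = c(7.769, 7.600, 7.554, 7.494, 7.488, 7.480)
)

# Fit Score as a linear function of GDP (the model behind the trend line).
fit <- lm(Score ~ GDP, data = gdp_score)

# Intercept and slope of the fitted line.
coef(fit)

# For a one-predictor linear model, R-squared equals the squared
# Pearson correlation between the two variables.
r_squared <- summary(fit)$r.squared
pearson_r <- cor(gdp_score$GDP, gdp_score$Score)
all.equal(r_squared, pearson_r^2)
```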

Conclusion

In this World Happiness Report Analysis in R, we explored the 2019 dataset using Google Colab to understand how various factors relate to a country's happiness score.

After checking the data for missing values and simplifying the column names, we visualized the top-scoring countries and examined correlations between variables such as GDP, social support, life expectancy, and perceptions of corruption.

Our analysis revealed strong positive correlations between happiness and GDP, social support, and life expectancy. A scatter plot with a linear trend line showed that higher GDP per capita tends to go along with higher happiness scores, though correlation alone does not establish that one causes the other.


Colab Link:
https://colab.research.google.com/drive/10WLUd3o68yQuFvViEqL5nYS2Wmv_NiNA#scrollTo=5whGKTOtjcRL

