COVID-19 Project: Data Visualization & Insights
By Rohit Sharma
Updated on Jul 24, 2025 | 17 min read | 1.29K+ views
The COVID-19 pandemic touched every region of the globe, but behind the news were enormous quantities of data. In this project, we will analyze COVID-19 data and make it tangible.
If you're a data science newcomer or just want to tune up your skills, this blog will guide you through working with actual public health data, visualizing trends with interactive graphs, and even tracing the virus's spread across regions.
It’s helpful to have some basic knowledge of the following before starting this project:
For this COVID-19 Project, the following tools and libraries will be used:
Skill Area | Purpose |
Python Programming | You'll be writing Python code to clean, explore, and visualize data. |
Pandas & NumPy | These libraries help in cleaning and analyzing large datasets efficiently. |
Matplotlib & Seaborn | Useful for quick visualizations and exploring trends in the data. |
Plotly | This project uses Plotly to build rich, interactive dashboards. |
Geospatial Tools (Folium / GeoPandas) | Used for mapping how the virus spread across regions or countries. |
Jupyter/Colab Environment (Optional) | Makes it easier to test code and view visualizations inline. |
In this COVID-19 Project, you're not predicting future trends, but instead learning to explore, analyze, and visualize real-world health data effectively using the following tools and techniques:
You can complete the COVID-19 project in 2 to 3 hours. It’s a beginner-friendly yet impactful project that helps you learn how to work with real-world public health data, create insightful visualizations, and build interactive charts and maps using Python.
Let’s start building the project from scratch. We'll go step-by-step through the process of:
Without any further delay, let’s get started!
To build our COVID-19 Project, we’ll use a publicly available dataset from Kaggle. This dataset includes real-world COVID-19 statistics such as daily confirmed cases, deaths, recoveries, and testing rates across different countries and periods.
Follow the steps below to download the dataset:
Now that you’ve downloaded the dataset, let’s move on to the next step, uploading and loading it into Google Colab.
Upload the downloaded files to Google Colab using the code below:
from google.colab import files
uploaded = files.upload()
Once uploaded, import the required libraries and use the following Python code to read and check the data:
# Import all necessary libraries for data analysis and visualization
import pandas as pd # For data manipulation
import numpy as np # For numerical operations
import matplotlib.pyplot as plt # For basic plotting
import seaborn as sns # For statistical visualizations
import plotly.express as px # For interactive visualizations
import plotly.graph_objects as go # For custom interactive plots
import plotly.offline as pyo # For offline plotting
from plotly.subplots import make_subplots # For multiple subplots
# Install required packages if not already installed
# Run these in separate cells if needed:
# !pip install plotly
# !pip install folium
# !pip install geopandas
# Load the COVID-19 dataset
df = pd.read_csv('country_wise_latest.csv')
# Display basic information about the dataset
print("\nFirst 5 rows:")
print(df.head())
Output :
First 5 rows:
Country/Region Confirmed Deaths Recovered Active New cases New deaths \
0 Afghanistan 36263 1269 25198 9796 106 10
1 Albania 4880 144 2745 1991 117 6
2 Algeria 27973 1163 18837 7973 616 8
3 Andorra 907 52 803 52 10 0
4 Angola 950 41 242 667 18 1
New recovered Deaths / 100 Cases Recovered / 100 Cases \
0 18 3.50 69.49
1 63 2.95 56.25
2 749 4.16 67.34
3 0 5.73 88.53
4 0 4.32 25.47
Deaths / 100 Recovered Confirmed last week 1 week change \
0 5.04 35526 737
1 5.25 4171 709
2 6.17 23691 4282
3 6.48 884 23
4 16.94 749 201
1 week % increase WHO Region
0 2.07 Eastern Mediterranean
1 17.00 Europe
2 18.07 Africa
3 2.60 Europe
4 26.84 Africa
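Before cleaning, it can help to confirm that the file loaded with the columns the later steps rely on. The sketch below is a minimal check under stated assumptions: the helper name `missing_columns`, the `REQUIRED_COLUMNS` list, and the tiny inline CSV are illustrative and not part of the original notebook. In practice you would pass the `df` loaded above instead of `sample_df`.

```python
import io

import pandas as pd

# Columns that the later cleaning and plotting steps reference
REQUIRED_COLUMNS = [
    "Country/Region", "Confirmed", "Deaths", "Recovered",
    "Active", "New cases", "WHO Region",
]


def missing_columns(frame, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the frame."""
    return [col for col in required if col not in frame.columns]


# Tiny stand-in for country_wise_latest.csv (two rows copied from the preview above)
sample_csv = io.StringIO(
    "Country/Region,Confirmed,Deaths,Recovered,Active,New cases,WHO Region\n"
    "Afghanistan,36263,1269,25198,9796,106,Eastern Mediterranean\n"
    "Albania,4880,144,2745,1991,117,Europe\n"
)
sample_df = pd.read_csv(sample_csv)

print("Missing columns:", missing_columns(sample_df))
print("Missing after dropping 'Active':",
      missing_columns(sample_df.drop(columns=["Active"])))
```

An empty list means every expected column is present, so the cleaning step can proceed without a `KeyError`.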
Before visualizing COVID-19 data, it’s crucial to clean and prepare the dataset. This includes handling missing values, calculating useful metrics like death and recovery rates, and formatting country names for mapping and analysis.
Here is the code:
# Step 1: Check for missing values
print("Missing values in each column:")
print(df.isnull().sum())
# Step 2: Handle missing values and correct data types
# Replace any missing values in key numerical columns with 0
numerical_cols = ['Confirmed', 'Deaths', 'Recovered', 'Active', 'New cases',
                  'New deaths', 'New recovered']
for col in numerical_cols:
    df[col] = df[col].fillna(0)
# Step 3: Create additional calculated columns for deeper analysis
# Calculate death rate as a percentage of confirmed cases
df['Death_Rate'] = (df['Deaths'] / df['Confirmed']) * 100
# Calculate recovery rate as a percentage of confirmed cases
df['Recovery_Rate'] = (df['Recovered'] / df['Confirmed']) * 100
# Calculate active case rate as a percentage of confirmed cases
df['Active_Rate'] = (df['Active'] / df['Confirmed']) * 100
# Step 4: Clean country names for mapping compatibility
# Remove asterisks (if any) from country names
df['Country/Region'] = df['Country/Region'].str.replace('*', '', regex=False)
# Display a preview of the cleaned and enriched dataset
print("Cleaned Dataset Info:")
print(df[['Country/Region', 'Confirmed', 'Deaths', 'Recovered', 'Death_Rate', 'Recovery_Rate']].head())
Output:
Column Name | Missing Values |
Country/Region | 0 |
Confirmed | 0 |
Deaths | 0 |
Recovered | 0 |
Active | 0 |
New cases | 0 |
New deaths | 0 |
New recovered | 0 |
Deaths / 100 Cases | 0 |
Recovered / 100 Cases | 0 |
Deaths / 100 Recovered | 0 |
Confirmed last week | 0 |
1 week change | 0 |
1 week % increase | 0 |
WHO Region | 0 |
Cleaned Data Preview :
Country/Region | Confirmed | Deaths | Recovered | Death_Rate (%) | Recovery_Rate (%) |
Afghanistan | 36,263 | 1,269 | 25,198 | 3.50 | 69.49 |
Albania | 4,880 | 144 | 2,745 | 2.95 | 56.25 |
Algeria | 27,973 | 1,163 | 18,837 | 4.16 | 67.34 |
Andorra | 907 | 52 | 803 | 5.73 | 88.53 |
Angola | 950 | 41 | 242 | 4.32 | 25.47 |
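The rate columns divide by `Confirmed`, which works here because every country in this dataset has at least one confirmed case. If you reuse the code on data where a row can have zero confirmed cases, the division needs a guard. The following is a minimal sketch, assuming a hypothetical helper named `safe_rate` and a small sample frame that includes an invented zero-case row for illustration:

```python
import numpy as np
import pandas as pd


def safe_rate(numerator, denominator):
    """Percentage of numerator over denominator; NaN where the denominator is 0."""
    denominator = denominator.replace(0, np.nan)
    return (numerator / denominator) * 100


# Two rows from the preview above plus one invented zero-case row
sample = pd.DataFrame({
    "Country/Region": ["Afghanistan", "Zero-case example", "Andorra"],
    "Confirmed": [36263, 0, 907],
    "Deaths": [1269, 0, 52],
})
sample["Death_Rate"] = safe_rate(sample["Deaths"], sample["Confirmed"]).round(2)
print(sample)
```

Rows with no confirmed cases end up as `NaN` rather than raising or producing infinite values, and `dropna()` in the later plots then skips them.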
Now that our real-world health data is clean, the first task is exploratory data analysis (EDA).
In this step, we’ll perform a comprehensive visual analysis of the COVID-19 dataset to uncover global trends. We’ll examine the countries most affected, distribution of recovery rates, and the relationship between confirmed cases and death rate.
Here is the code:
# Create comprehensive exploratory analysis using visualizations
# Set the figure size
plt.figure(figsize=(15, 8))
# 1. Top 10 countries by confirmed COVID-19 cases
top_10_confirmed = df.nlargest(10, 'Confirmed')
plt.subplot(2, 2, 1)
plt.barh(top_10_confirmed['Country/Region'], top_10_confirmed['Confirmed'], color='red', alpha=0.7)
plt.title('Top 10 Countries by Confirmed Cases')
plt.xlabel('Confirmed Cases')
# 2. Top 10 countries by COVID-19 deaths
top_10_deaths = df.nlargest(10, 'Deaths')
plt.subplot(2, 2, 2)
plt.barh(top_10_deaths['Country/Region'], top_10_deaths['Deaths'], color='black', alpha=0.7)
plt.title('Top 10 Countries by Deaths')
plt.xlabel('Deaths')
# 3. Histogram showing the distribution of recovery rates across countries
plt.subplot(2, 2, 3)
plt.hist(df['Recovery_Rate'].dropna(), bins=30, color='green', alpha=0.7, edgecolor='black')
plt.title('Distribution of Recovery Rates')
plt.xlabel('Recovery Rate (%)')
plt.ylabel('Frequency')
# 4. Scatter plot comparing confirmed cases and death rate
plt.subplot(2, 2, 4)
plt.scatter(df['Confirmed'], df['Death_Rate'], alpha=0.6, color='purple')
plt.title('Death Rate vs Confirmed Cases')
plt.xlabel('Confirmed Cases')
plt.ylabel('Death Rate (%)')
plt.xscale('log') # Log scale to handle wide range of confirmed cases
# Adjust layout to prevent overlap
plt.tight_layout()
plt.show()
# Print key global statistics
print("Global COVID-19 Statistics:")
print(f"Total Confirmed Cases: {df['Confirmed'].sum():,}")
print(f"Total Deaths: {df['Deaths'].sum():,}")
print(f"Total Recovered: {df['Recovered'].sum():,}")
print(f"Global Death Rate: {(df['Deaths'].sum() / df['Confirmed'].sum() * 100):.2f}%")
print(f"Global Recovery Rate: {(df['Recovered'].sum() / df['Confirmed'].sum() * 100):.2f}%")
Output:
Global COVID-19 Statistics:
Total Confirmed Cases: 16,480,485
Total Deaths: 654,036
Total Recovered: 9,468,087
Global Death Rate: 3.97%
Global Recovery Rate: 57.45%
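Note that the global death rate is computed from summed totals, not by averaging the per-country `Death_Rate` column. The two can differ noticeably because an average gives a small country the same weight as a large one. The sketch below illustrates the difference using only the five rows shown in the earlier preview, so the numbers it prints are for that subset and not for the full dataset:

```python
import pandas as pd

# Five rows copied from the dataset preview (not the full dataset)
sample = pd.DataFrame({
    "Country/Region": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "Confirmed": [36263, 4880, 27973, 907, 950],
    "Deaths": [1269, 144, 1163, 52, 41],
})

# Pooled rate: total deaths over total confirmed cases (weights by case count)
pooled_rate = sample["Deaths"].sum() / sample["Confirmed"].sum() * 100

# Unweighted mean of per-country rates (every country counts equally)
mean_of_rates = (sample["Deaths"] / sample["Confirmed"] * 100).mean()

print(f"Pooled death rate:          {pooled_rate:.2f}%")
print(f"Mean of per-country rates:  {mean_of_rates:.2f}%")
```

For a headline global figure the pooled rate is usually the one you want; the mean of rates answers a different question, namely what a typical country looks like.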
In this section, we’ll create interactive visualizations using Plotly to explore COVID-19 data across countries and WHO regions. These dynamic charts help reveal deeper insights such as country-wise confirmed cases and how recovery and death rates vary globally.
Here is the code:
# Create interactive visualizations for dynamic COVID-19 data insights
# 1. Interactive bar chart for the top 20 countries with highest confirmed cases
top_20_countries = df.nlargest(20, 'Confirmed') # Get top 20 rows with highest confirmed cases
fig1 = px.bar(
top_20_countries, # Data to plot
x='Country/Region', # Countries on X-axis
y='Confirmed', # Confirmed cases on Y-axis
color='Deaths', # Color bar by number of deaths
title='Top 20 Countries by Confirmed COVID-19 Cases',
hover_data=['Deaths', 'Recovered', 'Active'], # Extra info when hovering over bars
color_continuous_scale='Reds' # Red color gradient for deaths
)
# Customize layout for better readability
fig1.update_layout(
xaxis_tickangle=-45, # Rotate country names on X-axis
height=600,
xaxis_title="Country",
yaxis_title="Confirmed Cases"
)
# Show the interactive bar chart
fig1.show()
# 2. Interactive scatter plot comparing Recovery Rate vs Death Rate
fig2 = px.scatter(
df, # Full dataset
x='Recovery_Rate', # Recovery rate on X-axis
y='Death_Rate', # Death rate on Y-axis
size='Confirmed', # Bubble size indicates total confirmed cases
color='WHO Region', # Color based on WHO region
hover_name='Country/Region', # Show country name on hover
hover_data=['Confirmed', 'Deaths', 'Recovered'], # Extra info on hover
title='COVID-19: Death Rate vs Recovery Rate by WHO Region',
labels={
'Recovery_Rate': 'Recovery Rate (%)',
'Death_Rate': 'Death Rate (%)'
}
)
# Adjust chart height for better display
fig2.update_layout(height=600)
# Show the interactive scatter plot
fig2.show()
Output:
Note- The charts above are originally interactive and dynamic. However, they are shown here as static images for display purposes. In a real project or dashboard, you can hover, zoom, and filter data directly on these plots for deeper exploration.
In this section, we use Plotly to build interactive visualizations for deeper insights into COVID-19 trends by WHO region and global distribution. These charts help users visually compare confirmed cases, deaths, and recoveries across regions and countries.
Here is the Code:
# Create interactive multi-metric visualization
# Using the WHO region data for comparative analysis
# 3. Regional analysis - Aggregate key metrics by WHO Region
# We'll sum up total Confirmed, Deaths, Recovered, and Active cases for each region
regional_data = df.groupby('WHO Region').agg({
'Confirmed': 'sum',
'Deaths': 'sum',
'Recovered': 'sum',
'Active': 'sum'
}).reset_index()
# Create a multi-bar chart using Plotly's go.Figure
fig3 = go.Figure()
# Add bar trace for confirmed cases
fig3.add_trace(go.Bar(
name='Confirmed',
x=regional_data['WHO Region'],
y=regional_data['Confirmed'],
marker_color='blue'
))
# Add bar trace for deaths
fig3.add_trace(go.Bar(
name='Deaths',
x=regional_data['WHO Region'],
y=regional_data['Deaths'],
marker_color='red'
))
# Add bar trace for recovered
fig3.add_trace(go.Bar(
name='Recovered',
x=regional_data['WHO Region'],
y=regional_data['Recovered'],
marker_color='green'
))
# Customize the layout
fig3.update_layout(
title='COVID-19 Cases by WHO Region',
xaxis_title='WHO Region',
yaxis_title='Number of Cases',
barmode='group', # Group bars next to each other
height=600,
hovermode='x unified' # Unified tooltip on hover for better comparison
)
# Show the interactive chart
fig3.show()
# 4. Interactive pie chart for global distribution of confirmed cases (Top 10 countries)
# We use top_10_confirmed (already defined earlier using df.nlargest(10, 'Confirmed'))
fig4 = px.pie(
top_10_confirmed,
values='Confirmed',
names='Country/Region',
title='Global COVID-19 Cases Distribution (Top 10 Countries)',
hover_data=['Deaths', 'Recovered'] # Show more info when hovered
)
# Customize labels and layout
fig4.update_traces(
textposition='inside',
textinfo='percent+label' # Show both percent and country name
)
# Display the interactive pie chart
fig4.show()
Output:
Note- The charts above are originally interactive and dynamic. However, they are shown here as static images for display purposes. In a real project or dashboard, you can hover, zoom, and filter data directly on these plots for deeper exploration.
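The regional totals can also carry a derived rate, which makes regions of very different sizes easier to compare than raw counts alone. The following is a minimal sketch of that idea using pandas named aggregation; it runs on the five preview rows only, and the `regional` frame here is a stand-in for `regional_data`, not a replacement for it:

```python
import pandas as pd

# Five rows copied from the dataset preview (not the full dataset)
sample = pd.DataFrame({
    "Country/Region": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "WHO Region": ["Eastern Mediterranean", "Europe", "Africa", "Europe", "Africa"],
    "Confirmed": [36263, 4880, 27973, 907, 950],
    "Deaths": [1269, 144, 1163, 52, 41],
})

# Sum counts per region, then derive the rate from the regional totals
regional = (
    sample.groupby("WHO Region")
    .agg(Confirmed=("Confirmed", "sum"), Deaths=("Deaths", "sum"))
    .reset_index()
)
regional["Death_Rate"] = (regional["Deaths"] / regional["Confirmed"] * 100).round(2)

print(regional.sort_values("Confirmed", ascending=False))
```

Computing the rate after aggregation keeps it consistent with the summed counts shown in the grouped bar chart.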
In this final step, we create an interactive dashboard that combines multiple visualizations into a single layout. Using Plotly subplots, this dashboard helps analyze the pandemic's impact by region, country, and rate metrics—all at once.
Here is the Code:
# Create a comprehensive dashboard with multiple subplots
# This combines bar, scatter, and pie charts into a single interactive layout
from plotly.subplots import make_subplots
import plotly.graph_objects as go
# Initialize subplot layout with 2 rows × 2 columns
# Each cell will host a different type of chart
fig5 = make_subplots(
rows=2, cols=2,
subplot_titles=(
'Cases by Region',
'Top 5 Countries by Confirmed Cases',
'Death vs Recovery Rate',
'Case Distribution (Top 5 Countries)'
),
specs=[[{"type": "bar"}, {"type": "bar"}], # Row 1: bar charts
[{"type": "scatter"}, {"type": "pie"}]] # Row 2: scatter and pie
)
# Subplot 1: Bar chart - Total confirmed cases by WHO Region
fig5.add_trace(
go.Bar(
x=regional_data['WHO Region'],
y=regional_data['Confirmed'],
name='Confirmed by Region'
),
row=1, col=1
)
# Subplot 2: Bar chart - Top 5 countries with most confirmed cases
fig5.add_trace(
go.Bar(
x=top_10_confirmed['Country/Region'][:5],
y=top_10_confirmed['Confirmed'][:5],
name='Top 5 Countries'
),
row=1, col=2
)
# Subplot 3: Scatter plot - Death Rate vs Recovery Rate
fig5.add_trace(
go.Scatter(
x=df['Recovery_Rate'],
y=df['Death_Rate'],
mode='markers',
text=df['Country/Region'], # Country name as hover text
marker=dict(
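size=(df['Confirmed'] / 10000).clip(lower=4, upper=60), # Bubble size from confirmed cases, clamped so small countries stay visible and large ones do not fill the panel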
color='red',
opacity=0.6
),
name='Death vs Recovery'
),
row=2, col=1
)
# Subplot 4: Pie chart - Distribution of confirmed cases (Top 5 countries)
fig5.add_trace(
go.Pie(
labels=top_10_confirmed['Country/Region'][:5],
values=top_10_confirmed['Confirmed'][:5],
name='Distribution'
),
row=2, col=2
)
# Final layout adjustments
fig5.update_layout(
height=800,
showlegend=False, # Hide legend for cleaner look
title_text="COVID-19 Comprehensive Dashboard"
)
# Display the dashboard
fig5.show()
Output:
Note- This interactive dashboard allows dynamic zoom, pan, and hover interactions. It’s ideal for use in Jupyter/Colab notebooks, Dash apps, or Streamlit.
Here you're seeing a static version; it may not reflect the true interactivity.
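Bubble sizes in the scatter subplot come straight from case counts, which span several orders of magnitude. One common way to keep them readable is to scale marker area instead of diameter, using the `sizemode='area'` and `sizeref` marker settings that Plotly documents for bubble charts. The sketch below only builds the marker dictionary; the helper name `bubble_marker`, the 40-pixel target, and the three sample values are assumptions for illustration:

```python
def bubble_marker(sizes, max_px=40):
    """Build a Plotly-style marker dict whose bubble area scales with the values.

    Uses the sizeref formula from Plotly's bubble chart guidance:
    sizeref = 2 * max(size) / (desired_max_px ** 2), with sizemode='area'.
    """
    largest = max(sizes)
    sizeref = 2.0 * largest / (max_px ** 2)
    return dict(
        size=list(sizes),
        sizemode="area",   # Scale bubble area, not diameter
        sizeref=sizeref,   # Largest value maps to roughly max_px across
        sizemin=3,         # Keep the smallest bubbles visible
    )


# Hypothetical confirmed-case counts spanning several orders of magnitude
marker = bubble_marker([1000, 50000, 4000000], max_px=40)
print(marker["sizeref"])
```

The resulting dictionary could be passed as `marker=...` to `go.Scatter`, for example `go.Scatter(..., marker=bubble_marker(df['Confirmed']))`.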
To visualize the global spread of COVID-19, we use choropleth maps that color countries based on the number of confirmed cases and death rates. For this, we manually map country names to their ISO-3 codes, which Plotly uses to identify countries. Countries missing from the mapping, or spelled differently in the dataset, get no code and are left blank on the map.
Here is the code:
# Create interactive world map showing COVID-19 spread
# Choropleth maps require ISO 3-letter country codes
# Mapping country names to their ISO codes (for visualization)
country_codes = {
'US': 'USA', 'Brazil': 'BRA', 'India': 'IND', 'Russia': 'RUS', 'Peru': 'PER',
'Chile': 'CHL', 'United Kingdom': 'GBR', 'Iran': 'IRN', 'Germany': 'DEU', 'Turkey': 'TUR',
'Bangladesh': 'BGD', 'France': 'FRA', 'Saudi Arabia': 'SAU', 'Italy': 'ITA', 'Pakistan': 'PAK',
'Spain': 'ESP', 'Mexico': 'MEX', 'South Africa': 'ZAF', 'Canada': 'CAN', 'Qatar': 'QAT',
'China': 'CHN', 'Egypt': 'EGY', 'Sweden': 'SWE', 'Belarus': 'BLR', 'Belgium': 'BEL',
'Ecuador': 'ECU', 'Kazakhstan': 'KAZ', 'Indonesia': 'IDN', 'UAE': 'ARE', 'Portugal': 'PRT',
'Netherlands': 'NLD', 'Singapore': 'SGP', 'Kuwait': 'KWT', 'Ukraine': 'UKR', 'Philippines': 'PHL',
'Argentina': 'ARG', 'Afghanistan': 'AFG', 'Japan': 'JPN', 'Poland': 'POL', 'Romania': 'ROU',
'Israel': 'ISR', 'Switzerland': 'CHE', 'Thailand': 'THA', 'Armenia': 'ARM', 'Nigeria': 'NGA',
'Bahrain': 'BHR', 'Iraq': 'IRQ', 'Azerbaijan': 'AZE', 'Dominican Republic': 'DOM', 'Panama': 'PAN',
'Bolivia': 'BOL', 'Ireland': 'IRL', 'South Korea': 'KOR', 'Austria': 'AUT', 'Serbia': 'SRB',
'Oman': 'OMN', 'Czech Republic': 'CZE', 'Moldova': 'MDA', 'Denmark': 'DNK', 'Guatemala': 'GTM'
}
# Add ISO-3 codes as a new column for mapping
df['iso_code'] = df['Country/Region'].map(country_codes)
# ---------------------------------------------
# Choropleth Map 1: Confirmed COVID-19 Cases
# ---------------------------------------------
fig6 = px.choropleth(
df,
locations='iso_code', # ISO-3 country codes
color='Confirmed', # Color scale based on confirmed cases
hover_name='Country/Region', # Hover label
hover_data=['Deaths', 'Recovered', 'Death_Rate'], # Extra info on hover
color_continuous_scale='Reds',
title='Global COVID-19 Confirmed Cases Distribution'
)
# Update map layout
fig6.update_layout(
height=600,
geo=dict(
showframe=False,
showcoastlines=True,
projection_type='natural earth' # Natural Earth projection
)
)
fig6.show()
# ---------------------------------------------
# Choropleth Map 2: COVID-19 Death Rate (%)
# ---------------------------------------------
fig7 = px.choropleth(
df,
locations='iso_code',
color='Death_Rate', # Color scale based on death rate %
hover_name='Country/Region',
hover_data=['Confirmed', 'Deaths', 'Recovered'],
color_continuous_scale='Oranges',
title='Global COVID-19 Death Rate Distribution (%)'
)
# Update map layout
fig7.update_layout(
height=600,
geo=dict(
showframe=False,
showcoastlines=True,
projection_type='natural earth'
)
)
fig7.show()
Output:
Note- These choropleth maps are fully interactive, allowing zoom, pan, and hover. Here they are displayed as static images. To experience their full functionality, run them in a Jupyter Notebook, Google Colab, or Streamlit dashboard.
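Because the ISO-3 mapping is written by hand, any country that is absent from the dictionary or spelled differently in the dataset silently drops off the map. A quick coverage check makes those gaps visible. This is a minimal sketch, assuming a three-entry excerpt of the dictionary and a placeholder frame; the name 'Atlantis' and the case numbers are made up purely to show an unmapped row:

```python
import pandas as pd

# Excerpt of the hand-written mapping (illustrative subset)
country_codes_sample = {"US": "USA", "Brazil": "BRA", "India": "IND"}

# Placeholder data: 'Atlantis' is a made-up name with no ISO code
sample = pd.DataFrame({
    "Country/Region": ["US", "Brazil", "India", "Atlantis"],
    "Confirmed": [4, 3, 2, 1],  # Placeholder numbers, not real counts
})

# Names missing from the dictionary map to NaN
sample["iso_code"] = sample["Country/Region"].map(country_codes_sample)

unmapped = sample.loc[sample["iso_code"].isna(), "Country/Region"].tolist()
coverage = sample["iso_code"].notna().mean() * 100

print(f"Mapped {coverage:.0f}% of rows; unmapped: {unmapped}")
```

Running the same two lines against the real `df` after the `map` call lists exactly which country names still need an entry in `country_codes`.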
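This interactive map uses Folium, a Python mapping library, to visualize COVID-19's global impact. Countries are represented using circle markers, where the marker size scales with confirmed cases (clamped to a radius between 5 and 50) and the fill color reflects the death rate: green below 2%, orange from 2% up to 5%, and red at 5% or above.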
Here is the Code:
# Install folium if not already installed
# !pip install folium
import folium
from folium import plugins
# Create a base world map centered at lat=20, lon=0
world_map = folium.Map(location=[20, 0], zoom_start=2, tiles='OpenStreetMap')
# Coordinates for major countries (used if latitude/longitude not in dataset)
coordinates = {
'US': [39.8283, -98.5795], 'Brazil': [-14.2350, -51.9253], 'India': [20.5937, 78.9629],
'Russia': [61.5240, 105.3188], 'Peru': [-9.1900, -75.0152], 'Chile': [-35.6751, -71.5430],
'United Kingdom': [55.3781, -3.4360], 'Iran': [32.4279, 53.6880], 'Germany': [51.1657, 10.4515],
'Turkey': [38.9637, 35.2433], 'Bangladesh': [23.6850, 90.3563], 'France': [46.6034, 1.8883],
'Saudi Arabia': [23.8859, 45.0792], 'Italy': [41.8719, 12.5674], 'Pakistan': [30.3753, 69.3451],
'Spain': [40.4637, -3.7492], 'Mexico': [23.6345, -102.5528], 'South Africa': [-30.5595, 22.9375],
'Canada': [56.1304, -106.3468], 'China': [35.8617, 104.1954]
}
# Add latitude and longitude to the DataFrame
df['lat'] = df['Country/Region'].map(lambda x: coordinates.get(x, [None, None])[0])
df['lon'] = df['Country/Region'].map(lambda x: coordinates.get(x, [None, None])[1])
# Drop countries without coordinate data
map_data = df.dropna(subset=['lat', 'lon'])
# Plot each country's data as a circle marker
for idx, row in map_data.iterrows():
    # Scale marker size by confirmed cases
    marker_size = min(max(row['Confirmed'] / 10000, 5), 50)  # Keep between 5–50
    # Choose marker color based on death rate
    if row['Death_Rate'] < 2:
        color = 'green'
    elif row['Death_Rate'] < 5:
        color = 'orange'
    else:
        color = 'red'
    # Info popup for each country
    popup_text = f"""
    <b>{row['Country/Region']}</b><br>
    Confirmed: {row['Confirmed']:,}<br>
    Deaths: {row['Deaths']:,}<br>
    Recovered: {row['Recovered']:,}<br>
    Death Rate: {row['Death_Rate']:.2f}%<br>
    Recovery Rate: {row['Recovery_Rate']:.2f}%<br>
    WHO Region: {row['WHO Region']}
    """
    # Add marker to map
    folium.CircleMarker(
        location=[row['lat'], row['lon']],
        radius=marker_size,
        popup=folium.Popup(popup_text, max_width=300),
        color='black',
        fill=True,  # Enable the fill explicitly
        fill_color=color,
        fill_opacity=0.7,
        weight=2
    ).add_to(world_map)
# Custom legend HTML for the map
legend_html = '''
<div style="position: fixed;
bottom: 50px; left: 50px; width: 150px; height: 90px;
background-color: white; border:2px solid grey; z-index:9999;
font-size:14px; padding: 10px">
<p><b>COVID-19 Death Rate</b></p>
<p><i class="fa fa-circle" style="color:green"></i> < 2%</p>
<p><i class="fa fa-circle" style="color:orange"></i> 2% - 5%</p>
<p><i class="fa fa-circle" style="color:red"></i> > 5%</p>
</div>
'''
# Add legend to the map
world_map.get_root().html.add_child(folium.Element(legend_html))
# Save interactive map as an HTML file
world_map.save('covid19_world_map.html')
print("Interactive world map saved as 'covid19_world_map.html'")
# Display map inline (works in Jupyter/Colab)
world_map
Output:
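The marker sizing and coloring rules used in the loop are easy to get subtly wrong at the boundaries, so it can be worth pulling them into small functions and checking a few values. The sketch below mirrors the thresholds from the map code; the function names `marker_radius` and `death_rate_color` are illustrative and do not appear in the original notebook:

```python
def marker_radius(confirmed, scale=10_000, lo=5, hi=50):
    """Radius proportional to confirmed cases, clamped to the range [lo, hi]."""
    return min(max(confirmed / scale, lo), hi)


def death_rate_color(rate):
    """Traffic-light color bucket for a death rate given in percent."""
    if rate < 2:
        return "green"
    if rate < 5:
        return "orange"
    return "red"


# Spot-check a few values, including ones from the dataset preview
for confirmed, rate in [(907, 5.73), (36263, 3.50), (200_000, 1.0), (4_000_000, 5.0)]:
    print(confirmed, marker_radius(confirmed), death_rate_color(rate))
```

With these rules, any country under 50,000 confirmed cases is drawn at the minimum radius of 5, and a death rate of exactly 5% falls in the red bucket.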
This final visualization combines six key subplots into a single interactive dashboard using Plotly’s make_subplots. It gives a holistic view of the global COVID-19 situation by showing confirmed cases, regional trends, recovery vs. death rates, new case distribution, recovery rates by region, and an overall status breakdown.
Here is the code:
# Save cleaned and enriched dataset for future analysis or sharing
df.to_csv('covid19_processed_data.csv', index=False)
print("Processed data saved to 'covid19_processed_data.csv'")
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import pandas as pd
# Create a 3x2 dashboard layout
final_dashboard = make_subplots(
rows=3, cols=2,
subplot_titles=(
'Top 10 Countries - Confirmed Cases', 'Regional Distribution',
'Death Rate vs Recovery Rate', 'New Cases Distribution',
'Recovery Rate by Region', 'Case Status Distribution'
),
specs=[[{"type": "bar"}, {"type": "bar"}],
[{"type": "scatter"}, {"type": "histogram"}],
[{"type": "box"}, {"type": "pie"}]],
vertical_spacing=0.12,
horizontal_spacing=0.1
)
# --- Subplot 1: Top 10 Countries by Confirmed Cases ---
final_dashboard.add_trace(
go.Bar(
x=top_10_confirmed['Country/Region'],
y=top_10_confirmed['Confirmed'],
name='Confirmed Cases',
marker_color='red'
),
row=1, col=1
)
# --- Subplot 2: Regional Distribution ---
final_dashboard.add_trace(
go.Bar(
x=regional_data['WHO Region'],
y=regional_data['Confirmed'],
name='Regional Cases',
marker_color='blue'
),
row=1, col=2
)
# --- Subplot 3: Recovery Rate vs Death Rate Scatter Plot ---
final_dashboard.add_trace(
go.Scatter(
x=df['Recovery_Rate'],
y=df['Death_Rate'],
mode='markers',
text=df['Country/Region'],
name='Countries',
marker=dict(size=8, color='purple', opacity=0.6)
),
row=2, col=1
)
# --- Subplot 4: New Cases Histogram ---
final_dashboard.add_trace(
go.Histogram(
x=df['New cases'],
name='New Cases Distribution',
marker_color='orange',
nbinsx=30
),
row=2, col=2
)
# --- Subplot 5: Recovery Rate Box Plot by Region ---
# Prepare recovery rate data for box plot
box_data = []
regions = df['WHO Region'].unique()
for region in regions:
    region_data = df[df['WHO Region'] == region]['Recovery_Rate'].dropna()
    box_data.extend([(rate, region) for rate in region_data])
box_df = pd.DataFrame(box_data, columns=['Recovery_Rate', 'WHO Region'])
# Plot the first 3 regions in order of appearance (to keep visualization clean)
for region in regions[:3]:
    region_data = box_df[box_df['WHO Region'] == region]['Recovery_Rate']
    final_dashboard.add_trace(
        go.Box(y=region_data, name=region, boxmean=True),
        row=3, col=1
    )
# --- Subplot 6: Global Case Status Pie Chart ---
total_confirmed = df['Confirmed'].sum()
total_deaths = df['Deaths'].sum()
total_recovered = df['Recovered'].sum()
total_active = df['Active'].sum()
final_dashboard.add_trace(
go.Pie(
labels=['Active', 'Recovered', 'Deaths'],
values=[total_active, total_recovered, total_deaths],
name='Global Status'
),
row=3, col=2
)
# --- Layout Settings ---
final_dashboard.update_layout(
height=1200,
showlegend=True,
title_text="COVID-19 Comprehensive Analysis Dashboard",
title_x=0.5 # Center the title
)
# Show interactive dashboard in browser or notebook
final_dashboard.show()
# Save dashboard as standalone HTML file
final_dashboard.write_html("covid19_final_dashboard.html")
print("Final dashboard saved as 'covid19_final_dashboard.html'")
Output:
This dashboard gives a complete data-driven snapshot of COVID-19 trends across countries and regions.
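The box plot panel shows the first three regions in the order they appear in the data, which is not necessarily the three most affected. If you would rather pick regions by case count, one option is to rank them first. This is a minimal sketch on the five preview rows only; `top_regions` and `median_recovery` are illustrative names, and with the full dataset the ranking will differ:

```python
import pandas as pd

# Five rows copied from the dataset preview (not the full dataset)
sample = pd.DataFrame({
    "Country/Region": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "WHO Region": ["Eastern Mediterranean", "Europe", "Africa", "Europe", "Africa"],
    "Confirmed": [36263, 4880, 27973, 907, 950],
    "Recovery_Rate": [69.49, 56.25, 67.34, 88.53, 25.47],
})

# Rank regions by total confirmed cases and keep the top two
top_regions = (
    sample.groupby("WHO Region")["Confirmed"].sum().nlargest(2).index.tolist()
)

# Median recovery rate per selected region, as a quick numeric companion to the box plot
median_recovery = (
    sample[sample["WHO Region"].isin(top_regions)]
    .groupby("WHO Region")["Recovery_Rate"]
    .median()
)

print(top_regions)
print(median_recovery)
```

In the dashboard code, the resulting list could stand in for `regions[:3]` when looping over regions to add `go.Box` traces.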
This COVID-19 Project gave us a comprehensive, hands-on experience in working with real-world public health data. Throughout the process, we learned how to:
From a technical perspective, this project helped us gain proficiency in several powerful tools and technologies, including:
Ultimately, we transformed raw COVID-19 data into an insightful, interactive, and visually engaging dashboard demonstrating the real-world value of data science in public health decision-making.
Colab Link-
https://colab.research.google.com/drive/1ti-gc6N5zgUFZ_hQX3CAOClYg0LY7i5N?usp=sharing
Rohit Sharma is the Head of Revenue & Programs (International), with over 8 years of experience in business analytics, EdTech, and program management. He holds an M.Tech from IIT Delhi and specializes...