Disasters and Vulnerability

Basic Info

Title: Disasters and Vulnerability

Names: Barrett Lin, Hengchang Lu

Email: b.a.lin@wustl.edu, hengchang@wustl.edu

ID: 508753, 508777

Background and Motivation

Barrett: Natural disasters have always interested me because they happen with or without human involvement. While I haven't done much research into the effects they have on places, I remember seeing horrific details on the news about both the cost in human life and the disaster relief efforts. When I was in high school, a snowstorm hit where I lived in Texas, and, without the infrastructure to handle such cold, people were without clean water and electricity for weeks. The key motivation for this project is to examine how natural disasters affect locations unequally depending on their preparedness, seen through general variables like GDP and population density as well as more specific variables like relevant infrastructure. We hope the project reveals insights that can be acted on to mitigate the costs of natural disasters in the places affected most.

Project Objectives

The primary questions we are attempting to answer with our visualization are:

By answering these questions, we want to learn how natural disasters relate to the aforementioned variables, so that recovery efforts can be more targeted and the places that desperately need help both receive it and can develop the tools to prevent future disasters. Some benefits that could come from this:

Data

We have found many geospatial datasets on Kaggle that are useful for our project. The core dataset is the global earthquake-tsunami risk assessment dataset (https://www.kaggle.com/datasets/ahmeduzaki/global-earthquake-tsunami-risk-assessment-dataset), which contains geospatial data for over 100 historical earthquake events from 1900 to 2020 with a focus on tsunami generation potential. It includes key features like earthquake magnitude, focal depth, and epicenter coordinates. Data quality is especially robust for post-1960 events due to improved monitoring.

The world GDP dataset (https://www.kaggle.com/datasets/zgrcemta/world-gdpgdp-gdp-per-capita-and-annual-growths), based on World Bank data, is included as a core dataset to assess the socioeconomic vulnerability of countries to natural disasters like earthquakes and tsunamis. It provides annual metrics from 1960 onward. This dataset is crucial because GDP significantly influences a country's ability to prepare for, withstand, and recover from severe natural disasters. Lower GDP often correlates with weaker infrastructure, limited emergency resources, and higher casualty rates. We can use this to analyze how economic status shapes disaster impact and recovery speed.

The EMDAT (Emergency Events Database) dataset (https://www.emdat.be/), maintained by the Centre for Research on the Epidemiology of Disasters (CRED), provides comprehensive data on natural disasters worldwide. It includes fatalities, economic damage in USD, and disaster types by country and year. We use this data to create bar charts showing the countries with the highest GDP loss (total and percentage), as well as the countries most afflicted by disaster count.
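The aggregation behind those bar charts can be sketched as follows. This is a minimal illustration, not the project's actual code: the field names `country` and `damage_usd` are hypothetical stand-ins for whatever columns the EMDAT export uses, and missing damage values are treated as zero purely for the sake of the example.

```python
from collections import Counter, defaultdict

def top_countries(records, n=10):
    """Rank countries by total economic damage and by disaster count.

    `records` is a list of dicts with hypothetical keys "country" and
    "damage_usd" (which may be None when no damage figure was reported).
    """
    damage = defaultdict(float)
    counts = Counter()
    for record in records:
        counts[record["country"]] += 1
        # Assumption: a missing damage figure contributes 0 to the total.
        damage[record["country"]] += record.get("damage_usd") or 0.0
    by_damage = sorted(damage.items(), key=lambda kv: kv[1], reverse=True)[:n]
    by_count = counts.most_common(n)
    return by_damage, by_count

# Made-up sample rows, for illustration only.
sample = [
    {"country": "A", "damage_usd": 5.0},
    {"country": "A", "damage_usd": None},
    {"country": "B", "damage_usd": 9.0},
]
by_damage, by_count = top_countries(sample, n=2)
print(by_damage)
print(by_count)
```

Each returned list of (country, value) pairs maps directly onto the bars of one chart.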

To enhance our analysis of how earthquakes and tsunamis impact a region's key infrastructure, we incorporate additional geospatial datasets on critical assets near epicenters or coastlines. This allows us to model risks based on proximity. The datasets are:

Data Processing

We will first organize the key information in our raw data. The global earthquake-tsunami dataset includes earthquake magnitude, focal depth, epicenter coordinates, and tsunami generation potential. The world GDP dataset contains annual metrics like total national/regional GDP, GDP per capita, and growth rates. The infrastructure datasets comprise global aviation hubs, world ports, power plants, and nuclear power plants.

We need to convert all datasets to a compatible base format such as GeoJSON. We also need to unify the coordinate system so that every dataset uses the same one.
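As a rough sketch of this conversion step, the snippet below turns CSV rows into a GeoJSON FeatureCollection of points. The column names `latitude` and `longitude` are assumptions about the raw files, and rows without usable coordinates are simply skipped here. GeoJSON stores positions in longitude, latitude order, which is easy to get backwards.

```python
import csv
import io
import json

def rows_to_geojson(rows, lat_key="latitude", lon_key="longitude"):
    """Convert an iterable of dict rows into a GeoJSON FeatureCollection.

    lat_key / lon_key are hypothetical column names; adjust per dataset.
    """
    features = []
    for row in rows:
        try:
            lat = float(row[lat_key])
            lon = float(row[lon_key])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without parseable coordinates
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue  # skip coordinates outside the valid lat/lon range
        props = {k: v for k, v in row.items() if k not in (lat_key, lon_key)}
        features.append({
            "type": "Feature",
            # GeoJSON positions are [longitude, latitude]
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": props,
        })
    return {"type": "FeatureCollection", "features": features}

# Made-up sample data; the second row has no coordinates and is dropped.
sample_csv = "magnitude,depth,latitude,longitude\n7.1,35,38.3,142.4\n6.4,10,,\n"
collection = rows_to_geojson(csv.DictReader(io.StringIO(sample_csv)))
print(json.dumps(collection, indent=2))
```

Datasets that are not already in latitude/longitude degrees would need reprojecting before this step, which is outside the scope of this sketch.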

Next, we handle missing values. We need to label missing records in the earthquake-tsunami data before 1960. For the GDP data, we need to address missing annual values for small countries and regions. For missing key infrastructure attributes, we can supplement the information from official databases.
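One possible way to handle the first two cases is sketched below. The text does not say how GDP gaps should be addressed, so the linear interpolation here is an assumption chosen for illustration, as are the field names `year` and `low_confidence`.

```python
def flag_pre_1960(events, cutoff_year=1960):
    """Return copies of the events with a hypothetical `low_confidence` flag
    set for records before the cutoff year."""
    return [{**e, "low_confidence": e["year"] < cutoff_year} for e in events]

def interpolate_gaps(series):
    """Linearly fill interior gaps in a {year: value-or-None} mapping.

    Assumption: linear interpolation between the nearest known years.
    Gaps before the first or after the last known value are left as None.
    """
    years = sorted(series)
    known = [y for y in years if series[y] is not None]
    filled = dict(series)
    for left, right in zip(known, known[1:]):
        span = right - left
        for y in years:
            if left < y < right and series[y] is None:
                weight = (y - left) / span
                filled[y] = series[left] + weight * (series[right] - series[left])
    return filled

# Made-up values, for illustration only.
events = flag_pre_1960([{"year": 1950}, {"year": 1985}])
gdp = interpolate_gaps({2000: 10.0, 2001: None, 2002: 20.0, 2003: None})
print(events)
print(gdp)
```

Keeping the flag on the record, rather than dropping early events, lets the visualization decide later whether to hide or de-emphasize them.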

We also need to remove some unreasonable records from the earthquake data. For example, micro-earthquakes with magnitude < 2.0 can barely be felt and are excluded from the analysis. We should also remove extreme outliers from the GDP data. Where there are sharp annual declines or increases caused by wars or pandemics, we want to store them separately.
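A minimal sketch of both filters is shown below. The magnitude cutoff of 2.0 comes from the text, while the 20% year-over-year threshold for a "sharp" GDP change is a hypothetical value picked only to make the example concrete.

```python
def drop_micro_quakes(events, min_magnitude=2.0):
    """Keep only events at or above the minimum magnitude."""
    return [e for e in events if e["magnitude"] >= min_magnitude]

def split_gdp_shocks(gdp_by_year, max_change=0.20):
    """Split a {year: gdp} mapping into (regular, shocks).

    A year goes to `shocks` when its relative change from the previous
    year exceeds max_change (a hypothetical threshold, not from the source).
    """
    regular, shocks = {}, {}
    previous = None
    for year in sorted(gdp_by_year):
        value = gdp_by_year[year]
        if (previous is not None and previous != 0
                and abs(value - previous) / abs(previous) > max_change):
            shocks[year] = value
        else:
            regular[year] = value
        previous = value
    return regular, shocks

# Made-up values, for illustration only.
quakes = drop_micro_quakes([{"magnitude": 1.5}, {"magnitude": 2.0}, {"magnitude": 6.1}])
regular, shocks = split_gdp_shocks({2018: 100.0, 2019: 104.0, 2020: 70.0, 2021: 73.0})
print(len(quakes), regular, shocks)
```

Storing the shock years in their own mapping keeps them available for inspection without letting them distort the main series.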

Then we link the scattered datasets along the dimensions of geography and time to form a comprehensive dataset usable for visualization. There are two core linking steps. The first is spatial correlation: based on latitude and longitude, we calculate the straight-line distance between epicenters and airports, ports, and power plants to determine whether those facilities fall within disaster impact zones. The second is temporal correlation: we bind the time at which a disaster takes place to the socioeconomic situation of the affected region at that same time.
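The spatial step can be illustrated with a great-circle distance check. This sketch assumes that "straight-line distance" on latitude/longitude pairs means the haversine distance over a spherical Earth, and the 100 km impact radius and the `lat`/`lon` field names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def facilities_in_zone(epicenter, facilities, radius_km=100.0):
    """Return facilities within radius_km of the epicenter.

    radius_km is a hypothetical impact radius; a real one would likely
    depend on magnitude and depth.
    """
    return [
        f for f in facilities
        if haversine_km(epicenter["lat"], epicenter["lon"],
                        f["lat"], f["lon"]) <= radius_km
    ]

# Made-up coordinates, for illustration only.
epicenter = {"lat": 0.0, "lon": 0.0}
facilities = [
    {"name": "near port", "lat": 0.0, "lon": 0.5},
    {"name": "far airport", "lat": 0.0, "lon": 2.0},
]
print(facilities_in_zone(epicenter, facilities))
```

With a few hundred events and a few thousand facilities, checking every pair this way is likely fast enough; a spatial index would only matter at much larger scales.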

In the end, we need two fused datasets. One is the disaster-economy table, where each row includes year, country, earthquake magnitude, tsunami potential, total GDP, and GDP per capita. The other is the disaster-infrastructure table, where each row includes year, country, disaster type, affected infrastructure type, and number of facilities.
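A minimal sketch of how the disaster-economy table could be assembled is given below, joining events to GDP rows on country and year. All key names (`country`, `year`, `magnitude`, `tsunami`, `gdp_total`, `gdp_per_capita`) are hypothetical, and events with no matching GDP row are kept with empty GDP fields rather than dropped.

```python
def build_disaster_economy(events, gdp_rows):
    """Join earthquake events to GDP rows on (country, year).

    Both inputs are lists of dicts with hypothetical keys. Events without
    a matching GDP row keep None in the GDP columns.
    """
    gdp_index = {(g["country"], g["year"]): g for g in gdp_rows}
    table = []
    for e in events:
        g = gdp_index.get((e["country"], e["year"]))
        table.append({
            "year": e["year"],
            "country": e["country"],
            "magnitude": e["magnitude"],
            "tsunami": e["tsunami"],
            "gdp_total": g["gdp_total"] if g else None,
            "gdp_per_capita": g["gdp_per_capita"] if g else None,
        })
    return table

# Made-up rows, for illustration only.
events = [
    {"country": "A", "year": 2011, "magnitude": 9.0, "tsunami": True},
    {"country": "B", "year": 1955, "magnitude": 6.2, "tsunami": False},
]
gdp_rows = [
    {"country": "A", "year": 2011, "gdp_total": 6.2e12, "gdp_per_capita": 48000.0},
]
for row in build_disaster_economy(events, gdp_rows):
    print(row)
```

The disaster-infrastructure table could be built the same way, grouping affected facilities by year, country, disaster type, and infrastructure type and counting each group.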

Visualization Design

Visualization Design 1

Figure 1

Visualization Design 2

Figure 2

Visualization Design 3

Figure 3

Visualization Design 4

Figure 4

Visualization Design 5

Figure 5

Milestone 1

By Milestone 1, we had completed the initial data processing and developed a basic interactive visualization framework that showed only earthquake and tsunami data on a map.

Milestone 2

As of Milestone 2, we have finalized all must-have features from Milestone 1 and made progress on additional features and optimizations.

Completed Features

Final Product

The final visualization is a comprehensive interactive dashboard that successfully integrates multiple datasets to explore the relationship between natural disasters, economic factors, and infrastructure. The visualization provides users with powerful tools to analyze disaster impacts across time and geography.

Key Features

Technical Implementation

Insights Enabled

The final visualization enables users to explore critical questions about disaster vulnerability:

Final Product Screenshot

Final Product - Main Dashboard

Evaluation

While visualizing our initial data, it became clear that it would not be sufficient to answer our questions. GDP figures are too large relative to the sometimes low economic impact of individual natural disasters, so using EMDAT data was necessary in order to show their disproportionate impact. Splitting up the overall data in the interactive map still shows all the aspects we want to show, and the EMDAT data allows us to better show the impact of natural disasters on a country's economy. Better infrastructure data (exact dates of creation and destruction) would show the impact of natural disasters more clearly, but we were not able to find and integrate such data. Future improvements would include adding more specific data within each country (both infrastructure and wealth) to show more exact and specific impacts.

Project Schedule

10/26-11/1 (Proposal Due 10/27)
11/2-11/8
11/9-11/15 (Milestone 1 Due 11/10)
11/16-11/22
11/23-11/29 (Milestone 2 Due 11/24)
11/30-12/8 (Final Project Presentations Due 11/30, Final Project Due 12/8)