CSE 457A Final Project Process Book

Group Members

Eric Wei, eric.wei@wustl.edu, 498209

Edward Wu, edward.w@wustl.edu, 498050

Overview and Motivation

Many of our friends suffer from sleep problems, and a lack of good sleep can significantly affect a person's health. We believe this project can give important insights into how sleep relates to several measures of health.

Related Work

https://www.nhlbi.nih.gov/health/sleep-deprivation

https://www.cdc.gov/sleep/data-research/facts-stats/adults-sleep-facts-and-stats.html

https://www.sleepfoundation.org/how-sleep-works/sleep-facts-statistics

Questions

What are the prevalence levels of different health problems across the USA?

What are the health dimensions that positively correlate with insufficient sleep?

What are the top sleep-associated health problems in each state?

For a specific sleep-associated health problem, which states have high prevalence?

Data

Health Data (Our main data)

We collected the most recent (2024) health data from the official website of the Centers for Disease Control and Prevention (CDC). The data contains various health measures (including short sleep duration, i.e. insufficient sleep) together with geographic information in the format Point(Longitude Latitude). It also contains each county's FIPS code, which we use to match each county in this dataset to the county shape dataset.

https://data.cdc.gov/500-Cities-Places/PLACES-County-Data-GIS-Friendly-Format-2024-releas/i46a-9kgh/about_data
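As a rough illustration of handling these two fields, the sketch below parses a point string and normalizes a FIPS code. It is written in Python for brevity and is not the project's own code. The function names `parse_point` and `normalize_fips` are hypothetical, and the exact casing and spacing of the point string in the CDC export is an assumption, so the pattern is deliberately lenient. The coordinates are returned latitude-first because that is the order Leaflet expects.

```python
import re

# Lenient pattern for strings such as "POINT (-86.64 32.53)" or "Point(-86.64 32.53)".
# The exact formatting in the export is an assumption.
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def parse_point(text):
    """Return (latitude, longitude) from a 'Point(Longitude Latitude)' string."""
    match = _POINT_RE.match(text)
    if match is None:
        raise ValueError(f"not a point string: {text!r}")
    longitude = float(match.group(1))
    latitude = float(match.group(2))
    return latitude, longitude


def normalize_fips(code):
    """Left-pad a county FIPS code to five digits, e.g. 1001 -> '01001'."""
    return str(code).strip().zfill(5)
```

Padding matters because a FIPS code read as a number loses its leading zero, which would break the match against the shape dataset for states such as Alabama (01) or Connecticut (09).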

Map Data (For creating our map)

We used GeoJSON data containing the border shape of each county to draw the map. This dataset also contains the county FIPS codes, so we could match it to our health dataset. Another helpful field is the full name: since not every county-equivalent is called a "county", we need the full name to differentiate between places like St. Louis County and St. Louis City. We also had to make sure the data is from 2022 or newer, because Connecticut changed to county-equivalents in 2022 and our health data is based on this new setup.

https://public.opendatasoft.com/explore/dataset/georef-united-states-of-america-county/information/?disjunctive.ste_code&disjunctive.ste_name&disjunctive.coty_code&disjunctive.coty_name&sort=year
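The matching step described above can be sketched as a simple join on the FIPS code. This is a minimal Python sketch under stated assumptions, not the project's implementation: `attach_health` is a hypothetical helper, and the property and column names (`coty_code`, `fips`, `sleep_pct`) are placeholders that may differ from the real files. Features with no health row are reported separately, which is one way to notice a vintage mismatch such as shapes that predate the Connecticut change.

```python
def attach_health(features, health_rows, fips_key="coty_code", value_key="sleep_pct"):
    """Copy a health value onto each GeoJSON feature that shares a FIPS code.

    Returns (matched_features, unmatched_fips_codes). Key names are assumptions.
    """
    by_fips = {str(row["fips"]).zfill(5): row for row in health_rows}
    matched, unmatched = [], []
    for feature in features:
        raw = feature["properties"][fips_key]
        # Some exports wrap single values in a list; unwrap defensively.
        if isinstance(raw, (list, tuple)):
            raw = raw[0]
        fips = str(raw).zfill(5)
        row = by_fips.get(fips)
        if row is None:
            unmatched.append(fips)
            continue
        merged = dict(feature)  # shallow copy so the input is not mutated
        merged["properties"] = {**feature["properties"], value_key: row[value_key]}
        matched.append(merged)
    return matched, unmatched
```

With illustrative values, St. Louis County (FIPS 29189) and St. Louis City (FIPS 29510) stay distinct because the join uses the code rather than the name:

```python
features = [
    {"type": "Feature", "properties": {"coty_code": "29189", "name": "St. Louis County"}},
    {"type": "Feature", "properties": {"coty_code": ["29510"], "name": "St. Louis City"}},
]
rows = [{"fips": 29189, "sleep_pct": 33.0}, {"fips": 29510, "sleep_pct": 38.5}]  # made-up numbers
matched, unmatched = attach_health(features, rows)
```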

We looked into Leaflet's official tutorials and found a source we could include in our code to access the state border shapes.

https://leafletjs.com/examples/choropleth/us-states.js

Exploratory Data Analysis

CDC's official website offered correlation views for data points across America. We looked at them to help us select a few health metrics we were interested in.

Design Evolution

Brainstorm:

brainstorm

Three design sketches we came up with:

design 1 design 2 design 3

Our final preliminary design (showing both Night Mode & Day Mode):

design final night design final day

The final design combines elements from Designs 1 and 2, with Design 3 set aside for practicality. The page layout from Design 1 is used, emphasizing a larger map display to highlight the Day/Night Mode feature. Design 2’s approach of showing markers based on prevalence levels is also incorporated. Different chart types are used for each mode: Night Mode features a scatterplot comparing health and sleep measures, while Day Mode uses a bar chart to show prevalence levels clearly. For added social benefit, health tips for better sleep are included, providing users with insights and practical advice.

Our Milestone 1 design (showing both Night Mode & Day Mode):

milestone 1 night milestone 1 day

We tried many different color palettes to better fit our theme. We also spent time tweaking the layout of our elements to improve the aesthetics. Although we haven't made the graphs for Day Mode yet, we plan to have two: one that lets the user select a health problem and shows the states with the highest prevalence, and one that lets the user click into a state to see the top health problems within that state.

Our Milestone 2 design (showing both Night Mode & Day Mode):

milestone 2 night milestone 2 day

Based on our Milestone 1 feedback, we changed the display of locations from markers to actual geographic borders. This change improved the aesthetics and simplicity of the map and made it more informative. Another crucial quality-of-life change was the search bar, where the user can search for a county they are interested in and the map's view automatically jumps to that county. To present meaningful data at the state level, we calculated the state average of each metric and color-coded the states. This way, the user can quickly gain insight into the prevalence levels across the USA when they first load our website. We also created the Day Mode visualizations as planned.
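One way to compute such state averages is sketched below in Python. The text does not say whether the average is weighted, so this sketch assumes a plain unweighted mean over a state's counties; a population-weighted mean would be a reasonable alternative. The helper name `state_averages` and the keys `state` and `sleep_pct` are hypothetical.

```python
from collections import defaultdict


def state_averages(county_rows, value_key="sleep_pct"):
    """Return {state: mean county value}, skipping counties with no value.

    Assumes an unweighted mean; the key names are placeholders.
    """
    totals = defaultdict(lambda: [0.0, 0])  # state -> [running sum, count]
    for row in county_rows:
        value = row.get(value_key)
        if value is None:
            continue
        acc = totals[row["state"]]
        acc[0] += value
        acc[1] += 1
    return {state: total / count for state, (total, count) in totals.items()}
```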

Implementation

map

The US map with each state being clickable, color-coded by the prevalence.

search

Search bar for searching and jumping to a specified county.
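The lookup behind such a search bar might look like the following Python sketch. The matching rule is an assumption (case-insensitive, exact match preferred over prefix match), as are the helper name `find_county` and the `name` key; the returned record would then supply the location the map jumps to.

```python
def find_county(query, counties):
    """Return the first county whose full name matches the query, or None.

    An exact (case-insensitive) match wins over a prefix match.
    """
    needle = query.strip().casefold()
    if not needle:
        return None
    for county in counties:
        if county["name"].casefold() == needle:
            return county
    for county in counties:
        if county["name"].casefold().startswith(needle):
            return county
    return None
```

Matching on the full name is what lets a query for "St. Louis City" land on the city rather than on St. Louis County.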

map closeup

Once the user clicks into a state, the map automatically zooms in and shows each of its counties as a circle on the close-up map. The color of each county indicates its prevalence level (the darker the color, the higher the prevalence). The color scale is explained in the legend. Clicking on a county shows details about it (name and prevalence). Note that at the maximum zoom level, the zoom-in button is greyed out to signal to the user that they cannot zoom in further.
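A stepped color scale of this kind, together with the labels for its legend, can be sketched in Python as follows. The thresholds and hex colors are hypothetical placeholders, not the values used in the project; the only property carried over from the text is that darker colors mean higher prevalence.

```python
from bisect import bisect_right

# Hypothetical class breaks (percent) and a light-to-dark palette with one
# more color than there are breaks.
BREAKS = [25.0, 30.0, 35.0, 40.0]
COLORS = ["#f1eef6", "#bdc9e1", "#74a9cf", "#2b8cbe", "#045a8d"]


def color_for(prevalence):
    """Return the palette color for a prevalence value; darker means higher."""
    return COLORS[bisect_right(BREAKS, prevalence)]


def legend_labels():
    """Return one text label per color, in the same light-to-dark order."""
    edges = [None] + BREAKS + [None]
    labels = []
    for low, high in zip(edges, edges[1:]):
        if low is None:
            labels.append(f"< {high:g}%")
        elif high is None:
            labels.append(f">= {low:g}%")
        else:
            labels.append(f"{low:g}-{high:g}%")
    return labels
```

Deriving the legend from the same `BREAKS` list keeps the legend and the map colors from drifting apart when the thresholds change.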

switch button

A button on the map to switch between the Night Mode (Sleep Map) and the Day Mode (Health Map). The Night Mode focuses on insufficient sleep, while the Day Mode allows exploring other metrics.

state view button

A button on the map that is rendered only when the user enters the county-level view (after either zooming in or clicking on a state). It lets the user return to the state-level view of the map.

prevalence filter

A multi-select filter that hides states whose prevalence does not fall within the chosen ranges. Only the matching states are displayed on the map; the others are grayed out.
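The filter logic can be sketched in Python as a check against any of the selected ranges. Two details are assumptions rather than facts from the project: ranges are treated as half-open intervals `[low, high)`, and an empty selection shows every state. The helper names are hypothetical.

```python
def matches_any(value, selected_ranges):
    """True if value falls in at least one half-open range [low, high)."""
    return any(low <= value < high for low, high in selected_ranges)


def split_by_filter(state_values, selected_ranges):
    """Return (shown_states, grayed_out_states) as two sets.

    With no ranges selected, every state is shown (an assumed default).
    """
    if not selected_ranges:
        return set(state_values), set()
    shown = {
        state
        for state, value in state_values.items()
        if matches_any(value, selected_ranges)
    }
    return shown, set(state_values) - shown
```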

scatter plot

(Night Mode only) By default, the scatter plot shows the correlation between the prevalence of insufficient sleep and the prevalence of frequent mental distress across the USA. The radius of each circle represents the county's population. A line and a numerical coefficient help the user better understand the correlation, and the user can click the info icon to see an explanation of "correlation coefficient". Users can choose a different Y-axis metric (such as depression prevalence or housing insecurity prevalence). Once states are selected, the scatter plot shows only the counties in the selected state(s). The user can select at most 6 states and can deselect a state by clicking the X next to its name. A Reset button (rendered only if a selection has been made) clears all selections.
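The text does not say which coefficient or line is used. The Python sketch below assumes the Pearson correlation coefficient and an ordinary least-squares line, which are a common pairing for this kind of plot; `pearson_and_fit` is a hypothetical helper name.

```python
from math import sqrt


def pearson_and_fit(xs, ys):
    """Return (r, slope, intercept) for paired samples xs and ys.

    r is the Pearson correlation coefficient; slope and intercept describe
    the least-squares line y = slope * x + intercept.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept
```

A value of r near 1 means the Y-axis metric tends to rise with insufficient sleep across counties, a value near -1 means it tends to fall, and a value near 0 means there is little linear relationship. When the plot is filtered to selected states, the same function could be rerun on just those counties.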

top states

(Day Mode only) This section has a filter for the Health Metric; only the data for the selected metric is displayed on the map. Along with the filter, it shows the 5 states with the highest and the 5 states with the lowest prevalence (%), which change according to the selected metric.
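Picking the highest and lowest states for the selected metric amounts to two sorts, as in this minimal Python sketch. The helper name `top_and_bottom` is hypothetical, and how ties are broken in the project is not stated; here ties keep their input order.

```python
def top_and_bottom(state_values, k=5):
    """Return (highest_k, lowest_k) as lists of (state, value) pairs.

    highest_k is sorted from high to low, lowest_k from low to high.
    """
    items = list(state_values.items())
    highest = sorted(items, key=lambda item: item[1], reverse=True)[:k]
    lowest = sorted(items, key=lambda item: item[1])[:k]
    return highest, lowest
```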

bar

(Day Mode only) Displays the prevalence levels of the metrics in a horizontal bar chart, sorted from high to low prevalence. State selection works the same way as in the scatter plot. By default, the chart displays the data for the whole USA.

Evaluation

We want to use specific questions to test whether our user can gain insights on the original questions we proposed: