Map

Explore delays by date and time for specific routes. First, select a departure airport. Then select a destination airport on the map. You can then select a date and view the number of delayed, canceled, and on-time flights on that route for that date.


Departure Airport:

Delay/Cancellation Rate

Hover over a day to see the number of delayed, canceled, and on-time flights for this route. Note that there is no route data for October. Some smaller airports also lack route data for some days.

Charts

Explore flight delays by date range and airline. First, select a date range with the brush. Then click an airline in the donut chart. The radar chart then shows the delay information for that airline within that date range.

Airlines with Least Delays/Cancels in Time Span

Process Book

Final Project - The Visualization of Flight Delays in the US

Presentation Video Demo

Team members

Zhenghan Li, li.zhenghan@wustl.edu, 451907
Peiyun Xie, peiyun.xie@wustl.edu, 451179
Thomas Clifford, t.clifford@wustl.edu, 444248

Team Expectation Agreement

We will meet weekly during class time. We will also meet at other times as the project requires. We will communicate over a text message group and email. We will use the GitHub repository as well as email to collaborate on the code. We will have regular check-ins to make sure everyone is contributing a fair amount to the project. We will divide the work equally among ourselves, but Zhenghan will have a special focus on the data acquisition and the calendar, Peiyun will have a special focus on the map and the brush, and Thomas will have a special focus on the other charts.

Code

In GitHub Repo

Overview and Motivation

We chose this dataset because we all love travelling and delayed flights can disrupt people’s travel plans. Additionally, flight delays can present a major problem for businesses that have important meetings. To help travelers avoid flight cancellations and delays as much as possible, we want to visualize the flight delays and cancellations of major airlines and airports. This will give travelers and businesses a better idea of which airlines and airports to choose for their trips. Additionally, it will give users an idea of what sort of delays to expect on a certain date.

Related Work


Data Visualization of Flight Delays with Tableau

The author of this article makes several visualizations related to flight delays. His visualizations show the states with the most delays as well as the delays’ causes. He also has a line graph showing delays by month. These are interesting trends, but they are different from the ones that we are looking at.

Flight View
This tool allows you to search for your flight and see if it had a delay. It doesn’t provide a good visualization though and it doesn’t really satisfy the objectives of our project.

Average Flight Arrival Delay
This tool shows the average delay for each airport in the US on a map. It lets you filter by a specific airport. It also has other charts, including a chart showing average delay by month. This visualization has more of the components that we want in our visualization than the other two. However, it doesn’t deal with specific time or airline information as we want to.

Questions

The questions we want our visualization to answer are the following. Which airline should a person take given a date range to minimize delay?

Another question we will consider is, what delay time can a person expect given the airports and a date?

One question we decided to deemphasize is, which airport should a person choose to minimize delay? We decided that maybe we will consider this if we have extra time, but normally a person knows their origin airport. This would only help in the case where there were multiple origin airports to choose from within reasonable distance.

Data

We used data from this Kaggle dataset. The files were .csv files, so we were able to use the built-in csv function in d3 to extract the data. For the actual flight data, we split it up into 365 files, one file for each day of the year. We were able to do this because the original dataset has a month and day column. This was helpful to our project because we only have to process smaller files depending on the date selected rather than processing an enormous file of all flight data. The Java file used to generate these 365 files from the original flight data is in our GitHub repository. Our dataset was pretty clean and did not require cleanup to be used in our visualization.
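The per-day split described above can be sketched as follows. This is a minimal Python illustration rather than the Java program in the repository. The column names `MONTH` and `DAY` and the `MM-DD.csv` file naming are assumptions made for the example, not details taken from the dataset.

```python
import csv
from pathlib import Path


def split_by_day(source_csv, out_dir, month_col="MONTH", day_col="DAY"):
    """Stream source_csv and write one CSV per (month, day); return the paths.

    Column names and the MM-DD.csv naming scheme are hypothetical.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    seen = {}            # (month, day) -> path of the file already started
    current_key = None   # key of the file that is currently open
    dst = None
    writer = None
    with open(source_csv, newline="") as src:
        reader = csv.DictReader(src)
        try:
            for row in reader:
                key = (int(row[month_col]), int(row[day_col]))
                if key != current_key:
                    # Only one output file is open at a time.
                    if dst is not None:
                        dst.close()
                    is_new = key not in seen
                    path = out_dir / f"{key[0]:02d}-{key[1]:02d}.csv"
                    # Append if this day was already started earlier.
                    dst = open(path, "w" if is_new else "a", newline="")
                    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
                    if is_new:
                        writer.writeheader()
                        seen[key] = path
                    current_key = key
                writer.writerow(row)
        finally:
            if dst is not None:
                dst.close()
    return [seen[key] for key in sorted(seen)]
```

Streaming row by row keeps memory use low for a large source file, and reopening a day's file in append mode means the input does not have to be sorted by date.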

Exploratory Data Analysis

Initially we explored the different airport locations by plotting them on a map of the US. We were also able to explore through the console to see how the delays varied across different locations. We plan to further explore this as our project continues and implement the planned visuals to see how delay varies across different locations and airlines. We did realize when we were plotting these locations that airport location is a lot less flexible than time or airline, so we decided that it made the most sense to treat airport location as static to a certain extent when thinking about which visualizations to display.

Design Evolution

Below are the original designs. We originally decided to focus on designs A, B, C, F, G, and H because these seemed the most important for achieving our objectives of letting travelers choose airports and airlines and know what length of delays to expect. We have since decided to implement only B. In addition to B, we have implemented a calendar chart, a brush, a donut chart, and a radar chart. Clearly, plans can change!

The below screenshot is our first implementation where we displayed a visualization to geographically represent flights on a US continental map. We found this source helpful in making this map. The visualization shows all airports in our database as black points on the map. This visualization allows you to choose a specific date as well as a specific departure and arrival city. This is now outdated as we have changed our design since making this. Our new implementations are shown in the Implementation section.

Below is the next iteration of the project, which is still not the final version. Notice the pie chart and time chart, which no longer exist.

Implementation

Besides the process book tab, we currently have two other tabs. The first tab, Map, contains the following charts:

First the user selects the departure airport. Then they see on the map all the possible airports that they can fly to. Hovering over a dot reveals the name of its associated airport. Now the user can select an airport to fly to. Now they can see a calendar view for the route that reveals flight data on hover in a tooltip.
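The data lookups behind this interaction can be sketched in Python as below. This is an illustration under stated assumptions, not the project's d3 code: the field names (`ORIGIN_AIRPORT`, `DESTINATION_AIRPORT`, `CANCELLED`, `ARRIVAL_DELAY`) are hypothetical, and the 15-minute threshold for counting a flight as delayed is an assumption, since the text does not define "delayed".

```python
DELAY_THRESHOLD_MIN = 15  # assumption: the source does not define "delayed"


def destinations_from(flights, origin):
    """Airports that have at least one flight departing from `origin`."""
    return sorted({f["DESTINATION_AIRPORT"] for f in flights
                   if f["ORIGIN_AIRPORT"] == origin})


def flight_status(flight):
    """Classify one flight row (string values, as read from a CSV)."""
    if flight["CANCELLED"] == "1":
        return "canceled"
    # A blank delay field is treated as zero minutes.
    delay = float(flight["ARRIVAL_DELAY"] or 0)
    return "delayed" if delay > DELAY_THRESHOLD_MIN else "on-time"


def route_counts(flights, origin, destination):
    """Counts for one route on one day, as a calendar tooltip might show."""
    counts = {"delayed": 0, "canceled": 0, "on-time": 0}
    for f in flights:
        if (f["ORIGIN_AIRPORT"] == origin
                and f["DESTINATION_AIRPORT"] == destination):
            counts[flight_status(f)] += 1
    return counts
```

Here `flights` stands for the rows of a single day's file, so `destinations_from` would drive the dots on the map and `route_counts` the tooltip for one calendar cell.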

The second tab, Charts, contains the following charts:

First the user selects a date range with the brush. Then a user can select an airline for display in the radar chart by clicking on it in the donut chart. The main graph shown is the total number of flights per day (including all airlines), the total delayed, the total canceled, and the total on-time. The donut chart shows the proportion of the flights in the selected date range that are from each airline. The radar chart shows the percent of delayed, canceled, and on-time flights for the selected airline in the brushed date range. A table also displays the airlines in the brushed date range with the highest on-time rate.
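The percentages in the radar chart and the ranking in the table can be sketched in Python as below. This is a simplified illustration, not the project's d3 code. It assumes each flight has already been reduced to a hypothetical `(flight_date, airline, status)` record, where `status` is one of `"delayed"`, `"canceled"`, or `"on-time"`.

```python
from collections import defaultdict
from datetime import date


def airline_rates(records, start, end):
    """Percent of delayed, canceled, and on-time flights per airline.

    `records` is an iterable of (flight_date, airline, status) tuples
    (a hypothetical shape); only dates in [start, end] are counted.
    """
    counts = defaultdict(lambda: {"delayed": 0, "canceled": 0, "on-time": 0})
    for flight_date, airline, status in records:
        if start <= flight_date <= end:
            counts[airline][status] += 1
    rates = {}
    for airline, by_status in counts.items():
        total = sum(by_status.values())  # at least 1 for any airline present
        rates[airline] = {s: 100.0 * n / total for s, n in by_status.items()}
    return rates


def rank_by_on_time(rates):
    """Airline codes ordered from highest to lowest on-time rate."""
    return sorted(rates, key=lambda a: rates[a]["on-time"], reverse=True)
```

For example, `rank_by_on_time(airline_rates(records, date(2015, 1, 1), date(2015, 1, 7)))` would list the airlines for one brushed week. The dates in that call are purely illustrative.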

Issues

Our dataset doesn't have routes for October, so the calendar does not display this information.

Evaluation

We have learned a lot about how delays can occur more often depending on the season (winter has a lot of delays) or during busy holiday weekends. We also learned that holidays themselves have fewer delays than the surrounding weekend. We also found out that there are fewer flights during the weekend than during the week. This project works very well for exploring delays over different date ranges and with different airlines. It is also good for exploring delay rates over different routes. Future work may include adding more charts and offering information about average delay length and not just delay counts.

Sources