Student Projects

The following is a listing of demos of student projects from the Fall 2021 Principles of Computing course. The students chose their own data sources from resources online and decided how they wanted to examine, analyze, and visualize the data.

Project Number    Project Title
 1    The Study of the Freshman 15
 2    Economic Success vs. World Cup Success
 3    Child Maltreatment in Indiana
 4    CO2 Emissions
 5    College Football Defenses
 6    College Data Analysis
 7    Effects of the COVID-19 Pandemic: Comparing the effects of the COVID-19 pandemic in New York City and Chicago
 8    Happiness in Europe
 9    Sports Injuries
10    Global CO2 Emissions
11    Chicago Crime
12    NFL Penalties
13    Income data stratified by social media presence
14    Global Education Systems
15    Is Cereal Actually Healthy?!
16    GDP's Effect on Countries Around the World
17    Movie Analysis
  • 01 - The Study of the Freshman 15
    Tucker Lawrence, Obed Antwi-Baidoo
    social, education

    Our initial project data was a CSV file with 4 columns. We then used the BMI equation BMI = weight / height^2 to solve for the initial and final height of each participant in the study. With this, we could use a for loop to compute each person's height change and weight change, as well as the ratio of weight change to height change. At the end, we had 12 columns in our data frame from which we could extract information and make our visualizations.
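
    A rough sketch of that calculation in pandas, with hypothetical file and column names (the loop is shown vectorized here):

      import numpy as np
      import pandas as pd

      df = pd.read_csv("freshman15.csv")  # placeholder name for the 4-column CSV

      # BMI = weight / height^2, so height = sqrt(weight / BMI)
      df["height_initial"] = np.sqrt(df["weight_initial"] / df["bmi_initial"])
      df["height_final"] = np.sqrt(df["weight_final"] / df["bmi_final"])

      # Per-person changes and the ratio of weight change to height change
      df["weight_change"] = df["weight_final"] - df["weight_initial"]
      df["height_change"] = df["height_final"] - df["height_initial"]
      df["change_ratio"] = df["weight_change"] / df["height_change"]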

  • 02 - Economic Success vs. World Cup Success
    Jack Bailey, Jake Zywiec
    financial, sports, international

    We started by gathering CSV files that contained the data we would need. Then we extracted the specific data and converted it into individual lists, which we in turn converted into dataframes for our visualizations. We took the HTML files of those visualizations from Google Colab and added them to our website. For example, one CSV file contained the outcomes of every game from the 2018 World Cup. We manually made CSV files for unemployment, GDP per capita, and number of wins. We made a dataframe using the unemployment and number-of-wins data, then used that dataframe to make a visualization.
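
    A minimal sketch of the unemployment-vs-wins step, assuming hypothetical file and column names:

      import pandas as pd
      import plotly.express as px

      unemployment = pd.read_csv("unemployment.csv")   # columns: country, unemployment_rate
      wins = pd.read_csv("world_cup_wins.csv")         # columns: country, wins

      # Combine the two hand-made tables into one dataframe
      df = unemployment.merge(wins, on="country")

      fig = px.scatter(df, x="unemployment_rate", y="wins", hover_name="country",
                       title="Unemployment rate vs. 2018 World Cup wins")
      fig.write_html("unemployment_vs_wins.html")  # HTML export for the website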

  • 03 - Child Maltreatment in Indiana
    Kenna Bonde, Pamela Greer
    social, justice

    The data (which consisted of county name, year, boy maltreatment, girl maltreatment, the county population in total, child maltreatment rate, boy maltreatment rate /1000, and girl maltreatment rate /1000) was already collected and recorded in a table for every county of Indiana from the year 2013 to 2017. Due to this, we imported the csv file and created a table to make the data more organized. After this, we decided to create different types of visualizations to analyze the data and gain insights from each visualization.

  • 04 - CO2 Emissions
    Maxwell Feldmann, Colter Niezgodzki
    international

    We organized data on CO2 emissions by country alongside other covariates to find correlations. We downloaded the CSV data, aggregated it into one Google spreadsheet, downloaded that spreadsheet (and a limited version of it), converted everything to lists, processed those lists to create visualizations, drew insights from the visualizations, and then finally presented everything in a sleek, modern style using bootstrapr.io.
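
    A sketch of the aggregation and correlation step, assuming one CSV per covariate keyed by country (file and column names are placeholders):

      import glob
      import pandas as pd

      # Read every downloaded CSV and merge them on the shared country column
      frames = [pd.read_csv(path) for path in glob.glob("data/*.csv")]
      merged = frames[0]
      for frame in frames[1:]:
          merged = merged.merge(frame, on="country")

      # Correlation of each covariate with per-capita emissions
      print(merged.drop(columns=["country"]).corr()["co2_per_capita"])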

  • 05 - College Football Defenses
    Makenna Broyles, Maria Go
    sports

    We downloaded multiple CSV files from a college football database. Then, in Python, we converted the CSV data into succinct insights on multiple conferences, teams, and players, using full lists and then smaller subsets of lists for each football conference. We divided our insights into data categories including interceptions, sacks, and tackles, then divided those categories into conferences. Within each data insight, the x-axis shows the team, and the bars are stacked by player. We were able to take a lot of data that did not have much meaning, clean it, and turn it into multiple different meaningful visualizations.
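
    A sketch of one stacked-bar insight, with hypothetical column names:

      import pandas as pd
      import plotly.express as px

      defense = pd.read_csv("defense_stats.csv")  # columns: conference, team, player, sacks

      # One chart per conference: teams on the x-axis, bars stacked by player
      acc = defense[defense["conference"] == "ACC"]
      fig = px.bar(acc, x="team", y="sacks", color="player",
                   title="ACC sacks by team, stacked by player")
      fig.show()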

  • 06 - College Data Analysis
    Paul Buellesbach, Sean O'Gara
    education

    We began with a large amount of data containing a list of colleges as well as statistics associated with each college. From there, we had to eliminate the incomplete records and then clean the data to make it usable. After that, we were able to make our graphs and visualizations. We made bar charts, pie charts, and scatterplots covering a wide range of topics including admissions, diversity, and financial aid.
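
    A minimal sketch of the cleaning and plotting steps, with hypothetical file and column names:

      import pandas as pd
      import plotly.express as px

      colleges = pd.read_csv("colleges.csv")  # placeholder file name

      # Drop rows missing any of the fields used in the plot
      clean = colleges.dropna(subset=["admission_rate", "enrollment", "pct_receiving_aid"])

      fig = px.scatter(clean, x="admission_rate", y="pct_receiving_aid",
                       size="enrollment", hover_name="name",
                       title="Admission rate vs. share of students receiving financial aid")
      fig.show()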

  • 07 - Effects of the COVID-19 Pandemic: Comparing the effects of the COVID-19 pandemic in New York City and Chicago
    Andrew Oelschlager, Colin Chalk
    public health

    The cities of New York and Chicago provide easy-to-access data on the COVID-19 pandemic that is updated daily. Once we found our data sources, we picked which metrics (i.e., daily cases, estimated number of cases, 7-day average of cases, hospitalizations, deaths, hospitalizations by race, etc.) would be most useful. We then cleaned the data into an easy-to-work-with form and created plots that help visualize the overwhelming amount of information.
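
    A sketch of the 7-day-average metric, assuming each city's data is a dated CSV of daily case counts (column names are placeholders):

      import pandas as pd

      nyc = pd.read_csv("nyc_covid.csv", parse_dates=["date"]).sort_values("date")
      chicago = pd.read_csv("chicago_covid.csv", parse_dates=["date"]).sort_values("date")

      # 7-day moving average smooths out day-of-week reporting effects
      nyc["cases_7day_avg"] = nyc["daily_cases"].rolling(7).mean()
      chicago["cases_7day_avg"] = chicago["daily_cases"].rolling(7).mean()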

  • 08 - Happiness in Europe
    Matt Prame, Weizhen Yuan
    social

    Matt and Weizhen sought to determine how different factors (including average income, geographic region, and average time spent doing different activities) correlated with levels of happiness in Europe, using quantitative and survey data to analyze these correlations. They plotted happiness by European geographic region and happiness by GDP level, and they showed which activities the very happiest and saddest countries spend the most time doing.

  • 09 - Sports Injuries
    Daniel Durkin, Josue Rocha
    sports

    During our initial search for potential data to use, we looked through many different injury reports from different sources. Most of the injury reports were very complex and specific in terms of the injury description and recovery process, which we knew would be too complicated for comparing between leagues. However, we were able to find injury reports that were broader in terms of type of injury, which catered more to our goal of comparing trends between leagues. We put the 3 injury reports into an Excel sheet, downloaded them as CSV files, and then loaded those files into Colab. We decided our main goals would be comparing the number of injuries in general, the number of injuries per position, and the number of different types of injuries between and within leagues. Therefore, we separated the data into a player list, a position list, and a type list. We then further separated the position list into lists for each position, and the type list into lists for each body part injured. After this, we had the basis of our information ready for visualizations.
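
    A sketch of the list-splitting step, with hypothetical column names:

      import pandas as pd

      injuries = pd.read_csv("injury_reports.csv")  # columns: player, position, type, league

      # Parallel lists, as described above
      players = injuries["player"].tolist()
      positions = injuries["position"].tolist()
      types = injuries["type"].tolist()

      # Further split into one list per position and one per injury type
      by_position = {pos: grp["player"].tolist() for pos, grp in injuries.groupby("position")}
      by_type = {typ: grp["player"].tolist() for typ, grp in injuries.groupby("type")}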

  • 10 - Global CO2 Emissions
    Dagny Brand, Angela Santos
    international, environmental

    To transform the data to information, we downloaded the CSV file and read its contents to create two data frames, one of data over years and one of data per country. We used these data frames to create different visualizations, including a world map with colors changing depending on emissions per capita, a timeline for the user's chosen country of total fuel used, and a bar graph for different types of fuel use per country. These visualizations changed the data into information showing how bigger countries tend to have higher CO2 emissions per capita and how many countries have been using more fuel as time has gone on.
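
    A minimal sketch of the world-map visualization, with hypothetical file and column names:

      import pandas as pd
      import plotly.express as px

      co2 = pd.read_csv("co2_emissions.csv")  # columns: country, year, co2_per_capita
      latest = co2[co2["year"] == co2["year"].max()]

      # Countries shaded by per-capita emissions
      fig = px.choropleth(latest, locations="country", locationmode="country names",
                          color="co2_per_capita",
                          title="CO2 emissions per capita by country")
      fig.show()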

  • 11 - Chicago Crime
    Demetrios Fotopoulos, Daymine Snow
    crime

    Using csv files and API information that we accessed on the internet, we were able to convert content about crime statistics in Chicago into organized data frames and dictionaries in the Python coding language. From there, we took this information and put it into beautiful graphics, which we then put onto an HTML file that we created to display our graphics. It was a lot of fun for us and we felt like we learned a lot.
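
    A sketch of the API-to-dataframe step; the URL and field name below are placeholders, not the project's actual endpoint:

      import requests
      import pandas as pd

      # Placeholder endpoint returning one JSON record per reported crime
      records = requests.get("https://example.com/chicago-crimes.json").json()
      crimes = pd.DataFrame(records)

      # Dictionary of counts per crime type, ready for a chart
      counts_by_type = crimes["primary_type"].value_counts().to_dict()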

  • 12 - NFL Penalties
    Matthew O'Donnell, Emanuel Telles Chaves
    sports, entertainment

    After watching the Dallas Cowboys versus Las Vegas Raiders game this Thanksgiving, in which the Raiders narrowly defeated the Cowboys in a game plagued by players on both teams committing penalty after penalty (a few of which were later deemed fineable by the National Football League), we wanted to examine whether we could find any patterns in fineable penalties, both to help NFL teams and to intrigue NFL aficionados. After finding our data sources and cleaning the data, we examined several different factors, including a player's age and position, potential trends among teams by conference and division, and prospective patterns among repeat offenders by fines and days suspended. To this end, we attempted to answer three questions through our data inspection: (1) Does a player's age and/or position impact their likelihood to commit penalties? (2) Do certain conferences/divisions tend to commit more penalties than others? (3) Does the NFL do a good job of upholding its policy of dissuading "repeat offenders"? Upon finding answers to these questions in our data, we created 7 visualizations that expose trends in the NFL's penalties.

  • 13 - Income data stratified by social media presence
    Christian Trzeciak, John McDonough
    social, financial, entertainment

    We collected data from multiple sources around the internet to represent the incomes of the world's highest-paid musical artists, their different revenue streams, and their number of social media followers. We converted that data to data frames and graphs using Python code, then organized those visualizations in an HTML webpage.
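
    A minimal sketch of the dataframe-to-webpage step, with hypothetical file and column names:

      import pandas as pd
      import plotly.express as px

      artists = pd.read_csv("artist_income.csv")  # columns: artist, income_millions, followers_millions

      fig = px.scatter(artists, x="followers_millions", y="income_millions",
                       hover_name="artist",
                       title="Social media followers vs. income for top-earning artists")

      # Standalone HTML file that can be embedded in the project webpage
      fig.write_html("followers_vs_income.html")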

  • 14 - Global Education Systems
    Arden Jennings, Marcus Espeland
    international, education

    Our project is based on data from the UN, specifically their collection of data on the number of students, number of teachers, and total education budget as a percent of GDP for various countries around the world. We started with very lengthy spreadsheets full of data and spent a lot of time cutting it down to only the countries and years that were consistent across the three sets of data. Next, we wrote code using pandas and plotly.express and imported the data so that it would output information in the form of various visualizations. Lastly, we created a web page that displays this information in a way that is user friendly and streamlined to maximize the amount one could learn from our data.
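
    A sketch of the merge that keeps only consistent country/year pairs, with hypothetical file and column names:

      from functools import reduce
      import pandas as pd

      students = pd.read_csv("students.csv")        # columns: country, year, num_students
      teachers = pd.read_csv("teachers.csv")        # columns: country, year, num_teachers
      budget = pd.read_csv("education_budget.csv")  # columns: country, year, budget_pct_gdp

      # Inner merges keep only country/year pairs present in all three tables
      merged = reduce(lambda left, right: left.merge(right, on=["country", "year"]),
                      [students, teachers, budget])

      # Example derived metric for a visualization
      merged["students_per_teacher"] = merged["num_students"] / merged["num_teachers"]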


  • 15 - Is Cereal Actually Healthy?!
    Kailee David, Temitope Kassim
    health

    We focused our project on data about cereals and their nutritional values. We looked online for an API (an accessible data source) that would provide cereals' nutritional values and thus help us determine which cereals are best for human health. We converted the API's data to lists that were later converted into dataframes. By doing these steps, we were able to use the data found within the cereal API to create visual graphs that compared many important nutritional aspects of cereal. Our graphs covered sugars vs. calories vs. fats, potassium vs. fiber count, brands vs. ratings, and lastly carbs vs. proteins. Through these graphs we were able to determine the 'best' cereal for consumers and how this term can vary based on the groups looking at the graphs.
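
    A minimal sketch of the API-to-graph pipeline; the URL and field names are placeholders:

      import requests
      import pandas as pd
      import plotly.express as px

      # Placeholder endpoint returning one JSON record per cereal
      records = requests.get("https://example.com/cereals.json").json()
      cereals = pd.DataFrame(records)  # fields: name, sugars, calories, fat, rating

      fig = px.scatter(cereals, x="sugars", y="calories", size="fat",
                       hover_name="name", title="Sugar vs. calories, sized by fat content")
      fig.show()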

  • 16 - GDP's Effect on Countries Around the World
    Catherine Keele, Jaeyoun Kim
    international

    We took 72 countries' GDPs over the span of a decade, with 3 years of data (2010, 2015, 2020), along with variables such as fertility and mortality rates and life expectancy (overall, as well as for males and females separately). We graphed these variables to see whether there was any relationship or correlation between them and what might be the cause behind it.
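
    A sketch of one such correlation plot, with hypothetical file and column names:

      import pandas as pd
      import plotly.express as px

      indicators = pd.read_csv("gdp_indicators.csv")  # columns: country, year, gdp, life_expectancy

      # One point per country and sampled year
      fig = px.scatter(indicators, x="gdp", y="life_expectancy", color="year",
                       hover_name="country", log_x=True,
                       title="GDP vs. life expectancy (2010, 2015, 2020)")
      fig.show()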

  • 17 - Movie Analysis
    James Wade, Roman Sally, Derek Lee
    entertainment

    We started our data-to-information journey by finding a CSV file of movie data from the 20th century in the Project Data document. From there, we had to sort through all 1600+ entries in the file to ensure that entries were not missing important data such as actor names, subjects, etc. This was certainly a time-consuming process, as after cleaning our data, we had to import all of the data from the CSV into a DataFrame. From our DataFrame, we had many insights to draw. For example, we plotted movie runtime lengths against their popularity, determined the popularity of specific directors, and found information pertaining to specific genres of movies. Overall, while it was a time-consuming process, it allowed us to glean insights as to what topics or directors the movie industry might have favored during the 20th century.
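
    A minimal sketch of the cleaning and runtime-vs-popularity steps, with hypothetical column names:

      import pandas as pd
      import plotly.express as px

      movies = pd.read_csv("movies.csv")  # columns: title, length, popularity, actor, subject, director

      # Keep only entries that are not missing the fields used downstream
      clean = movies.dropna(subset=["actor", "subject", "length", "popularity"])

      fig = px.scatter(clean, x="length", y="popularity", hover_name="title",
                       title="Movie runtime vs. popularity")
      fig.show()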