Homework, July 2014

Import two data.frames we’ll use in the problems.

ht_wt_df <- read.csv("http://www3.nd.edu/~steve/computing_with_data/Data/heights_weights_genders.csv")
wisc_bc_df <- read.csv("http://www3.nd.edu/~steve/computing_with_data/Data/wisc_bc_data.csv")

Problems

  1. Compute the mean of the heights separately for the males and the females in ht_wt_df.

  2. Form a sub-data.frame sub_df1 of ht_wt_df that contains every third row of ht_wt_df; that is, rows 3, 6, 9, …. What are the mean and variance of the heights and of the weights of the people in sub_df1? HINT: You will need the seq function.

  3. Use the str function to identify the variables (columns) in wisc_bc_df that are numeric. (a) Form a new data.frame num_bc_df that contains only the numeric variables. (b) Compute the correlation matrix of num_bc_df. HINT: Use the cor function. Look up the help on this function to learn the correct way to use it. (c) List the variables that have a correlation greater than 0.90 with radius_mean.
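
As a warm-up, here is a minimal sketch of the functions named in the hints, run on a small made-up data.frame rather than on the homework data. The names toy_df, group, x, y and z are hypothetical and exist only for this illustration. This is one possible approach, not the required one, and the step size and threshold are picked just for the toy example.

```r
# Hypothetical toy data; not related to ht_wt_df or wisc_bc_df.
toy_df <- data.frame(
  group = c("a", "b", "a", "b", "a", "b"),
  x = c(1, 2, 3, 4, 5, 6),
  y = c(2, 4, 6, 8, 10, 12),
  z = c(6, 1, 5, 2, 4, 3),
  stringsAsFactors = FALSE
)

# Group-wise summary: tapply(values, grouping variable, function).
group_means <- tapply(toy_df$x, toy_df$group, mean)

# Regularly spaced row indices with seq, then row subsetting.
idx <- seq(from = 2, to = nrow(toy_df), by = 2)
sub_toy <- toy_df[idx, ]

# Flag the numeric columns and keep only those.
is_num <- sapply(toy_df, is.numeric)
num_toy <- toy_df[, is_num, drop = FALSE]

# Correlation matrix of the numeric columns.
cor_mat <- cor(num_toy)

# Names of the variables whose correlation with x exceeds a threshold.
# Note that x itself is included, since its correlation with itself is 1.
high_cor <- names(which(cor_mat[, "x"] > 0.9))
```

Typing ?seq, ?tapply or ?cor at the R prompt opens the help page for each function, which is worth reading before adapting any of this to the problems above.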