Computing with Data Seminar

Fall 2014 and Spring 2015

Materials for 2013-2014 Seminar

Steven Buechler
Department of Applied and Computational Mathematics and Statistics
University of Notre Dame


This is a set of notes, examples, data and projects for a short course in advanced R programming, aimed at graduate students and senior-level undergraduates. It is intended to help students develop the skills they'll need in their research or employment. In R there are many ways to accomplish a goal, and these notes make no attempt to be comprehensive. Often I'll simply describe one method to get a result and leave it to the student to explore alternatives.


Lecture Notes

  1. Introduction: The general perspective of the course, with pointers to the main tools for generating statistical reports, namely RStudio, RMarkdown and knitr.
  2. Motivating example: A study of height and weight with respect to gender that illustrates how ggplot2 can help organize even very simple analyses.
  3. Vectors, factors, lists: Setting a baseline of knowledge about the most fundamental R objects.
  4. Matrices: The most basic structure for doing linear algebra and storing tabular data.
  5. Introduction to data frames: First principles of working with R's foundational structure for storing tabular data.
  6. Data munging: An example of the data frame manipulations that are often required prior to analysis.
  7. Loops, etc.: Introduction to iterating over an index in R, flow control and conditional execution.
  8. Defining functions: How to define your own functions in R.
  9. Functions on matrices and lists: Applying a function to the rows or columns of a matrix or to the components of a list.
  10. Lists of data frames: Methodology for performing statistical tests with large numbers of variables.
  11. Introduction to ggplot2: Introducing ggplot2 as a better way to create graphics.
  12. Examples of geoms: Setting aesthetics and using the most common geoms.
  13. Scales and themes in ggplot2: Controlling the details of the visual elements that are plotted.
  14. Facets: Plots with panels ranging over subgroups.
  15. Split-apply methodology with plyr: Introduction to the plyr package for flexibly grouping data and applying a function to the pieces.

Homework

  1. Homework 1