Computing with Data Seminar
Fall 2014 and Spring 2015
Materials for 2013-2014 Seminar
Steven Buechler
Department of Applied and Computational Mathematics and Statistics
University of Notre Dame
This is a set of notes, examples, data and projects for a short course in advanced R programming, aimed at graduate students and senior-level undergraduates. It is intended to help students develop the skills they'll need in their research or employment. In R there are many ways to accomplish a goal, and these notes make no attempt to be comprehensive. Often I'll simply describe one method to get a result and leave it to the student to explore alternatives.
Lecture Notes
- Introduction: The general perspective of the course, with pointers to the main tools for generating statistical reports: RStudio, R Markdown and knitr.
- Motivating example: A study of height and weight with respect to gender, illustrating how ggplot2 can help organize even very simple analyses.
- Vectors, factors, lists: Setting a baseline of knowledge about the most fundamental R objects.
- Matrices: The most basic structure for doing linear algebra and storing tabular data.
- Introduction to data frames: First principles for working with R's foundational structure for storing tabular data.
- Data munging: An example of the data frame manipulations that are often required prior to analysis.
- Loops, etc: Introduction to iterating over an index in R, flow control and conditional execution.
- Defining functions: How to define your own functions in R.
- Functions on matrices and lists: Applying a function to the rows or columns of a matrix, or to the components of a list.
- Lists of data frames: A method for performing statistical tests on large numbers of variables.
- Introduction to ggplot2: Introducing ggplot2 as a better way to create graphics.
- Examples of geoms: Examples of setting aesthetics and using the most common geoms.
- Scales and themes in ggplot2: Controlling the details of the visual elements that are plotted.
- Facets: Plots with panels ranging over subgroups.
- Split-apply methodology with plyr: Introduction to the plyr package for flexibly grouping data and applying a function to the pieces.
Homework
- Homework 1