**Fall 2013 and Spring 2014**

Materials for 2014-2015 Seminar

Steven Buechler

Department of Applied and Computational Mathematics and Statistics

University of Notre Dame

This is a set of notes, examples, data and projects for a short course in advanced **R** programming, aimed at graduate students and senior-level undergraduates. It is intended to help students develop the skills they'll need in their research or employment. In **R** there are many ways to accomplish a goal, and these notes make no attempt to be comprehensive. Often I'll simply describe one method to get a result and leave it to the student to explore alternatives.

- Introduction The general perspective of the course, and pointers to the main tools for generating statistical reports, namely *RStudio*, *RMarkdown* and *knitr*.
- Motivating example This study of height and weight with respect to gender illustrates how *ggplot2* can help organize even very simple analyses.
- Vectors, factors, lists Setting a baseline of knowledge about the most fundamental **R** objects
- Matrices The most basic structure for doing linear algebra and storing tabular data
- Data frames First principles about working with **R**'s foundational structure for storing tabular data.
- Functions How to define your own functions
- Functions on matrices Using the `apply` function to compute on rows and columns
- Functions on lists Using `lapply` to apply a function to each component of a list
- Split-apply example An example using `lapply` and the long form of a data frame to handle a large number of possible covariates
- Introduction to ggplot2 Introducing ggplot2 as a better way to create graphics
- Examples of geoms Examples of setting aesthetics and the most common geoms.
- Scales and themes Using scales and themes to control visual aspects of the ggplots.
- Facets Plots with panels ranging over subgroups.
- Topics for next semester Possible topics to cover in the next set of lectures.
- Split-apply methodology with plyr Introduction to the `plyr` package for flexibly grouping data and applying a function to the pieces.
- Baseball example Career performance for power hitters is analyzed using `plyr`.
- Refining kNN Refinement of a machine learning application with the help of `plyr`
- Plyr Practice Some in-class practice in using `plyr`
- Manipulating strings and text The basics of manipulating and finding patterns in textual data.
- Text Mining Example An example of text mining: spam filter
- Looping and iterators, part I An introduction to the `foreach` and `iterators` packages
- Parallelization with `foreach` The parallel backend to `foreach`.
- Importing data Importing data from a variety of sources.
- Introduction to `dplyr` Manipulating data with the `dplyr` package.

- lapply practice data: `source("http://www3.nd.edu/~steve/computing_with_data/practice/lapply_practice.R")`
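
For readers who want a quick picture of what the "Functions on matrices" and "Functions on lists" topics cover, here is a minimal sketch in base **R**. It is not taken from the course notes or the practice file; the matrix `m` and the list `scores` are made-up illustrative data.

```r
# Hypothetical data, invented for this illustration only.
m <- matrix(1:6, nrow = 2)        # 2 x 3 matrix, filled column by column

# apply(X, MARGIN, FUN): MARGIN = 1 works over rows, MARGIN = 2 over columns.
row_sums  <- apply(m, 1, sum)     # one sum per row
col_means <- apply(m, 2, mean)    # one mean per column

# lapply(X, FUN) calls FUN on each component of a list and returns a list
# of the same length, keeping the component names.
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)
lens   <- lapply(scores, length)  # list of component lengths
means  <- lapply(scores, mean)    # list of component means

print(row_sums)
print(col_means)
str(lens)
str(means)
```

Because `lapply` always returns a list, its result can be passed on to another `lapply` call or flattened with `unlist` when a plain vector is more convenient.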