**Fall 2013 and Spring 2014**

Materials for 2014-2015 Seminar

Steven Buechler

Department of Applied and Computational Mathematics and Statistics

University of Notre Dame

This is a set of notes, examples, data and projects for a short course in advanced **R** programming, aimed at graduate students and senior-level undergraduates. It is intended to help students develop the skills they'll need in their research or employment. In **R** there are many ways to accomplish a goal, and these notes make no attempt to be comprehensive. Often I'll simply describe one method to get a result and leave it to the student to explore alternatives.

- Introduction The general perspective of the course, with pointers to the main tools for generating statistical reports, namely *RStudio*, *RMarkdown* and *knitr*.
- Motivating example This study of height and weight with respect to gender illustrates how *ggplot2* can help organize even very simple analyses.
- Vectors, factors, lists Setting a baseline of knowledge about the most fundamental **R** objects
- Matrices The most basic structure for doing linear algebra and storing tabular data
- Data frames First principles about working with data frames
- Functions How to define your own functions
- Functions on matrices Using the `apply` function
- Functions on lists Using `lapply` to apply a function to each component of a list
- Split-apply example An example using `lapply` and the long form of a data frame to handle a large number of possible covariates
- Introduction to ggplot2 Introducing ggplot2 as a better way to create graphics
- Examples of geoms Examples of setting aesthetics and the most common geoms.
- Scales and themes Using scales and themes to control visual aspects of the ggplots.
- Facets Plots with panels ranging over subgroups.
- Topics for next semester Possible topics to cover in the next set of lectures.
- Split-apply methodology with plyr Introduction to the `plyr` package
- Baseball example Career performance for power hitters is analyzed using `plyr`.
- Refining kNN Refinement of a machine learning application with the help of `plyr`
- Plyr Practice Some in-class practice in using `plyr`
- Manipulating strings and text The basics of manipulating and finding patterns in textual data.
- Text Mining Example An example of text mining: spam filter
- Looping and iterators, part I An introduction to the `foreach` and `iterators` packages
- Parallelization with `foreach` The parallel backend to `foreach`.
- Importing data Importing data from a variety of sources.
- Introduction to `dplyr` Manipulating data with the `dplyr` package.

- lapply practice data: source("http://www3.nd.edu/~steve/computing_with_data/practice/lapply_practice.R")
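
As a small taste of the `apply`, `lapply` and split-apply topics listed above, here is a minimal base **R** sketch. It is not taken from the course materials: the toy `heights` data frame and all variable names are invented for illustration, and this is only one of many ways to get the result.

```r
# Hypothetical toy data, invented for illustration (not course data)
heights <- data.frame(
  gender = c("F", "M", "F", "M", "F", "M"),
  height = c(64, 70, 62, 72, 66, 68)
)

# apply: run a function over the rows (MARGIN = 1) or columns (MARGIN = 2) of a matrix
m <- matrix(1:6, nrow = 2)
col_sums <- apply(m, 2, sum)

# split-apply: split one column by a grouping variable,
# then use lapply to apply a function to each component of the resulting list
by_gender <- split(heights$height, heights$gender)
group_means <- lapply(by_gender, mean)

print(col_sums)
print(group_means)
```

Here `lapply` returns a list with one component per group; `sapply` could be used instead if a plain named vector is preferred.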