Applying a function to rows or columns of a matrix

When working with a matrix of numbers, you will often want to apply a function to each row or each column, for example to get a sense of how the values are distributed. The following example shows how.

Let's read in a matrix of gene expression values.

setwd("~/Documents/Computing with Data/7_Functions_matrices/")
load("./Data/expr_mat")
dim(expr_mat)

## [1] 22283  1269

expr_mat[1:3, 1:4]

##           GSM79114 GSM79118 GSM79119 GSM79120
## 1007_s_at    9.990    9.536   10.189    9.682
## 1053_at      4.241    5.957    5.579    5.334
## 117_at       3.406    3.615    3.573    3.741

These are expression levels of "genes" as measured by Affymetrix microarray probes. Each of the 22283 rows is a probe and each of the 1269 columns is a sample from a breast cancer patient.

Compute the inter-quartile range of each gene (row)

The first method uses a for loop. It computes the values one row at a time. Before entering the for loop, define a variable to hold the results.

iqr_v2 <- numeric(nrow(expr_mat))
for (i in seq_len(nrow(expr_mat))) {
    iqr_v2[i] <- IQR(expr_mat[i, ])
}

`apply` gives a more direct way. The call is shorter and may be a little faster.

iqr_vals2 <- apply(expr_mat, 1, IQR)
iqr_vals2[1:3]

## 1007_s_at   1053_at    117_at 
##    0.8371    0.6955    0.4377

This also sets the names of the returned vector to the row names of the matrix.
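To check that the two approaches agree, here is a minimal sketch on a small stand-in matrix. The name `toy_mat` and its values are made up for illustration and are not part of the expression data.

```r
# Toy stand-in for expr_mat (hypothetical values): 3 probes x 4 samples
toy_mat <- matrix(c(1, 2, 3, 4,
                    2, 2, 2, 2,
                    5, 1, 9, 3),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("probe_a", "probe_b", "probe_c"),
                                  paste0("s", 1:4)))

# Loop version: preallocate the result, then fill it one row at a time
iqr_loop <- numeric(nrow(toy_mat))
for (i in seq_len(nrow(toy_mat))) {
    iqr_loop[i] <- IQR(toy_mat[i, ])
}

# apply version: a single call, and the row names carry over
iqr_apply <- apply(toy_mat, 1, IQR)

# Same numbers; only the names differ
all.equal(iqr_loop, unname(iqr_apply))
names(iqr_apply)
```

The loop result is an unnamed vector, so the comparison strips the names from the `apply` result first.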

Format of an apply

result <- apply(X, MARGIN, FUN)  # X: a matrix; MARGIN: 1 = rows, 2 = columns; FUN: function to apply

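The way it works is that each row of the matrix (when the margin is 1) or each column (when the margin is 2) is passed, one at a time, as a vector to the function. Remember this: the function you supply must accept a vector as its first argument.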
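A small sketch of how the margin changes what the function receives. The matrix `m` is a hypothetical example, not part of the expression data.

```r
# Hypothetical 2 x 3 matrix, used only to illustrate the margin argument
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))

row_sums <- apply(m, 1, sum)  # margin 1: one value per row
col_sums <- apply(m, 2, sum)  # margin 2: one value per column

# Arguments after the function are passed on to it for every row
row_q90 <- apply(m, 1, quantile, probs = 0.9)

row_sums
col_sums
```

The result has one entry per row when the margin is 1 and one entry per column when the margin is 2.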

Examples of `apply` usage

We can compute the mean of the columns as

mean_col <- apply(expr_mat, 2, mean)
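For means and sums specifically, base R also provides `colMeans`, `rowMeans`, `colSums` and `rowSums`, which are usually faster than the equivalent `apply` call on a large matrix. A minimal sketch on a made-up matrix (`toy_mat` is hypothetical):

```r
# Hypothetical 2 x 3 matrix; columns are (1, 2), (3, 4), (5, 6)
toy_mat <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)

mean_col_apply <- apply(toy_mat, 2, mean)  # general-purpose route
mean_col_fast  <- colMeans(toy_mat)        # dedicated base R function

# Both give the same column means
all.equal(mean_col_apply, mean_col_fast)
```

`apply` remains the general tool for functions such as `IQR` that have no dedicated row or column version in base R.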

Very often what you want to compute about a row isn't available as a pre-defined function with a single input variable. In that case you can write the function inline.

How many rows in the expression matrix have constant expression values? These are ones with max = min.

sum(apply(expr_mat, 1, function(v) {
    max(v) == min(v)
}))

## [1] 0

# None of them are constant.
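The same test can be tried on a small matrix where the answer is known in advance. This is a sketch with made-up values; `toy_mat` and `is_const` are hypothetical names.

```r
# Hypothetical matrix in which only row "b" is constant
toy_mat <- rbind(a = c(1, 2, 3),
                 b = c(4, 4, 4),
                 c = c(0, 0, 1))

# One logical value per row: TRUE when all entries in the row are equal
is_const <- apply(toy_mat, 1, function(v) max(v) == min(v))

sum(is_const)    # TRUE counts as 1, so this is the number of constant rows
which(is_const)  # which rows they are
```

If a row contains `NA`, both `max` and `min` return `NA` by default and so does the sum, so a matrix with missing values would need `na.rm = TRUE` or similar handling.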

When the function returns a logical you can use it to subset the matrix.

Restrict the matrix to the rows with IQR > 0.5

sub_mat <- expr_mat[apply(expr_mat, 1, function(x) {
    IQR(x) > 0.5
}), ]
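A sketch of the same subsetting on a made-up matrix, with the logical vector stored first so it can be inspected. The names `toy_mat`, `keep` and `sub_toy` are hypothetical.

```r
# Hypothetical matrix: row IQRs are 1.5, 0 and 3.5
toy_mat <- rbind(a = c(1, 2, 3, 4),
                 b = c(2, 2, 2, 2),
                 c = c(5, 1, 9, 3))

# Logical vector with one entry per row
keep <- apply(toy_mat, 1, function(x) IQR(x) > 0.5)

# drop = FALSE keeps the result a matrix even if only one row passes
sub_toy <- toy_mat[keep, , drop = FALSE]

dim(sub_toy)
rownames(sub_toy)
```

Without `drop = FALSE`, a subset that selects a single row is returned as a plain vector rather than a one-row matrix.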
