Probability Puzzler 6 - Averaged averages


Posted Friday, October 6 (solutions by evening of Wednesday October 11)

The puzzle: I closely follow the careers of two minor league baseball players, Derek Pedroia and Dustin Jeter. In the first half of 2011 (before the minor league all-star break), Jeter had a batting average of .250, while Pedroia batted .260. Pedroia was again better in the second half of the season: he averaged .320 to Jeter's .300.

At the end of the season, the batting title (awarded to the player with the best batting average over the whole season) was given to Jeter. How could this be, since Pedroia had a better average in both halves of the season?

Note: Batting average is calculated by dividing the number of hits a player had by the total number of chances he had to get a hit (his total number of at-bats). Since an at-bat results in either one hit or no hit, the batting average is always a number between 0 and 1.

A solution: That Jeter wins overall, having lost each half, seems paradoxical, but is readily explained. A season average is not the plain average of the two half-season averages: each half counts in proportion to its number of at-bats. If Jeter's .250 early in the season was reached in very few at-bats (say, 10 hits in 40 at-bats) and his .300 in the late season was reached in many at-bats (say, 90 hits in 300 at-bats), then his batting average for the whole season is going to be much closer to .300 than to .250 (in this case, it will be 100/340, or .294). If Pedroia's two averages were based on similar numbers of at-bats, then his overall average will be close to .290, the average of his two halves, and thus lower than Jeter's. For example, Pedroia may have gone 52 for 200 in his first half and 64 for 200 in his second, giving him an overall average of 116 for 400, or .290.
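
The arithmetic can be checked with a short Python sketch. It uses the illustrative hit and at-bat counts from the solution above (they are made-up examples, not real statistics), and the helper names batting_average and season_average are introduced here purely for illustration.

```python
from fractions import Fraction

def batting_average(hits, at_bats):
    """Hits divided by at-bats, kept as an exact fraction."""
    return Fraction(hits, at_bats)

def season_average(halves):
    """Pool hits and at-bats over both halves, then divide once."""
    total_hits = sum(h for h, _ in halves)
    total_at_bats = sum(ab for _, ab in halves)
    return batting_average(total_hits, total_at_bats)

# (hits, at-bats) for the first and second half: illustrative figures only
jeter = [(10, 40), (90, 300)]
pedroia = [(52, 200), (64, 200)]

for name, halves in (("Jeter", jeter), ("Pedroia", pedroia)):
    per_half = ", ".join(
        f"{float(batting_average(h, ab)):.3f}" for h, ab in halves
    )
    print(f"{name}: halves {per_half}; season {float(season_average(halves)):.3f}")
```

With these figures Pedroia is ahead in each half (.260 to .250, then .320 to .300), yet Jeter's season average comes out at about .294 against Pedroia's .290, because most of Jeter's at-bats fall in his stronger half.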

This is an example of Simpson's paradox, of which the Wikipedia page has a good many interesting examples (including a Major League Baseball one, involving Derek Jeter and David Justice).


Solvers: (in no particular order)

The ND football team decided the winner, as follows: after fall break, we saw how many total yards of offence ND had against USC on October 22 (309), and we took the remainder of that number on division by 6 (3). Kevin had the right remainder in parentheses after his name above, and so won.