Due For Some Mean Regression: Part 1


Allow me to give some context before the nitty-gritty statistical mumbo jumbo: the Tigers ran away with the division last year, finishing 15 games ahead of the Cleveland Indians while the two other presumed ‘Titans of the Central’ (Minnesota and Chicago) fell apart. I keep hearing that a lot of this was due to luck: good luck on the Tigers’ behalf and bad luck on behalf of their foes. This is what I’ll be looking at in this piece – were the Tigers better or just luckier in 2011, and can we expect the gap to narrow going into 2012 due to mean regression alone?

A couple of points – first, I’ll be looking at two separate components as far as luck/talent goes: was the guy able to play (injuries, etc.) and did he meet or exceed his expected level of performance. Both are – from the perspective of a fan or GM – luck, and both lend themselves to mean regression. Second, I’ll be focusing on the Minnesota Twins and the Cleveland Indians. I have argued before that the White Sox were spectacularly unlucky (though very talented) in 2011 and, if left untouched, would have an excellent chance to bounce back. However, all the signals coming from the south side suggest that White Sox management/ownership is not going to give the team that chance, and is instead dealing away any players with enough remaining value to unload. I would argue that the Kansas City Royals were lucky last year, but they came into the season with no real intention to contend and will come into 2012 with no real intention to contend. With a perfect storm of veteran rebounds and prospect development, it isn’t outside the realm of possibility that the Royals could be a force to be reckoned with in the coming season. That, however, isn’t a function of their ‘expected’ win total as of now but rather its variance – they could just as easily wind up winning 70 games with regression from Gordon & Francoeur and the utter failure of a handful of youngsters.

This will boil down to a handful of independent but related questions: Were the Tigers lucky in 2011? Were the Twins unlucky? Were the Indians unlucky? Should we expect the playing field to be leveled in 2012 or tilted back towards the Twin Cities or the Cuyahoga?

One way to look at this is a team’s ‘Pythagorean Record’, or the win-loss record you would expect it to have based on runs scored and runs allowed. Typically, teams that win an unusually large number of games outperform their Pythagorean record and teams that do badly underperform it – in other words, the gap between a 90-win team and a 70-win team is usually less than 20 games as far as actual talent is concerned. Unsurprisingly, the Tigers did ‘overachieve’ in 2011, with 89 Pythagorean wins to 95 actual wins. But the Indians ‘overachieved’ in this sense too. Based on their runs scored and runs allowed, we would expect them to have won only 75 games, not 80. The Twins were every bit as bad as their record indicated, with only 62 Pythagorean wins to their 63 actual wins.
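
For anyone who wants to see the arithmetic, here is a minimal sketch of the Pythagorean calculation. It assumes the classic form of the formula with an exponent of 2 (other versions use a slightly different exponent, and I'm not claiming any particular source uses this exact one). The function names and the 800/700 run totals are placeholders made up for illustration, not any team's real numbers.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    Assumes the classic exponent of 2; pass a different exponent to
    try other variants of the formula.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins over a season of `games` games."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


if __name__ == "__main__":
    # Placeholder run totals, not real team data.
    expected = pythagorean_wins(800, 700)
    print(f"Expected wins: {expected:.1f}")
```

Subtracting this expected total from a team's actual wins gives the kind of 'overachievement' gap discussed above.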

For the most part, I want to look at individual players’ performances, so to answer the first three questions I’ll be using some spreadsheets. A brief description of what you’re going to see: for the important batters on each of the three teams, we have expected plate appearances and wOBA from the Marcel projections calculated prior to the 2011 season, coupled with each player’s actual plate appearances and wOBA. For pitchers, we’ll have expected innings pitched and ERA coupled with actual IP and ERA. The Marcel projections are marked with an m, so mIP is Marcel innings pitched and mERA is Marcel ERA. The last bits we have are the Bill James forecasts for 2012, courtesy of FanGraphs, marked with a b – so bERA is Bill James 2012 ERA.
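
To make the column naming concrete, here is a rough sketch of how one pitcher's row could be represented and compared. The field names mIP, mERA and bERA follow the convention described above; the class name, the two gap methods and the sample numbers are hypothetical, invented purely for illustration rather than taken from the spreadsheets.

```python
from dataclasses import dataclass


@dataclass
class PitcherLine:
    name: str
    mIP: float   # Marcel projected innings pitched for 2011
    mERA: float  # Marcel projected ERA for 2011
    IP: float    # actual 2011 innings pitched
    ERA: float   # actual 2011 ERA
    bERA: float  # Bill James forecast ERA for 2012

    def ip_gap(self):
        """Actual minus projected innings; negative means less playing time than expected."""
        return self.IP - self.mIP

    def era_gap(self):
        """Actual minus projected ERA; positive means worse than expected."""
        return self.ERA - self.mERA


if __name__ == "__main__":
    # Made-up placeholder values, not a real pitcher's line.
    p = PitcherLine("Placeholder Pitcher", mIP=180.0, mERA=4.00,
                    IP=150.0, ERA=4.50, bERA=4.20)
    print(p.name, p.ip_gap(), round(p.era_gap(), 2))
```

The two gaps line up with the two components of luck mentioned earlier: whether the guy was able to play, and how he performed when he did.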

Since this thing is going to be long, I’ll break it up into three pieces.