We’re all itching for baseball to begin. It’s the time of year when a particularly forceful sneeze from Justin Verlander might set off a firestorm of news and blog posts on the subject. “Is Verlander the best sneezer in baseball?” “Will Verlander be healthy enough to begin the season after sneezing?” “What does manager Jim Leyland think about the big sneeze?” Players are on the field for the first time, they’ll soon be playing games, and we’re going to analyze every detail to death.
We’re soon going to get stats from real exhibition games, and we’re going to throw these numbers around in discussions of position battles – we do this every year – and we’ll write posts predicting “big things to come” for every player who puts up eye-popping spring numbers.
The only problem is that these statistics from spring training games don’t tell us anything we don’t already know. In fact, they probably tell us less than we already know. Consider the following Excel-generated graph, which compares the 2012 Tigers’ spring training batting lines (as summarized by OPS, minimum 30 spring training at-bats) to what they did once the regular season started.
As you can see, there is some correlation in the data, but it’s not particularly strong. We remember that Brennan Boesch (.928 OPS), Delmon Young (1.156 OPS), and Ryan Raburn (.994 OPS) all had fantastic springs before flopping during the regular season.
Now consider this second chart that compares these same Tigers’ 2011 OPS to their 2012 OPS.
That’s a much stronger correlation, you might say. This is definitive proof of nothing (we’re talking about one season and 15 guys), but it is a strong anecdote in support of the notion that spring training numbers are all but meaningless.
Of course, we know this intuitively. Even though 75 at-bats (basically what Boesch had as the team leader in ABs last spring) starts to seem like a large number, it pales in comparison to the 300-600 at-bats that most of these guys are going to get in a season. And since we know an entire season’s worth of data contains a fair bit of random variation, it stands to reason that relatively few plate appearances – many of which come against AAAA-caliber pitching – would be that much more unreliable.
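For anyone who wants to put a rough number on that intuition, here is a minimal sketch in Python. It leans on some loud assumptions that are mine, not anything from the charts: it uses batting average instead of OPS (because a hit-or-no-hit outcome is easy to model), it treats every at-bat as an independent coin flip, and it picks a hypothetical "true" .270 hitter purely for illustration. The function name is made up for this example.

```python
import math


def batting_average_standard_error(true_avg: float, at_bats: int) -> float:
    """Standard error of an observed batting average over `at_bats` at-bats.

    Assumes each at-bat is an independent trial with the same hit probability
    (a simple binomial model, which real baseball only approximates).
    """
    return math.sqrt(true_avg * (1.0 - true_avg) / at_bats)


# Hypothetical true talent level, chosen only for illustration.
TRUE_AVG = 0.270

# 75 is roughly a heavy spring workload; 300 and 600 bracket a regular season.
for at_bats in (75, 300, 600):
    se = batting_average_standard_error(TRUE_AVG, at_bats)
    low, high = TRUE_AVG - 2 * se, TRUE_AVG + 2 * se
    print(f"{at_bats:>3} AB: roughly {low:.3f} to {high:.3f} (+/- 2 standard errors)")
```

Under those assumptions, the same .270 hitter could plausibly land about 100 points of batting average in either direction over 75 at-bats, versus about 35 points over 600. And that is before accounting for the quality of the pitching, which this sketch ignores entirely.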
We’re going to over-analyze the stats because that’s what we always do – and there’s really nothing else to do – but we need to bear in mind that these numbers mean something between jack and squat when it comes to predicting the upcoming season.