Here’s where we get to the statistical tests and where things are going to get a little technical. The question I ask is this: do walks require power (or specifically home runs) in order to lead to runs? The specific statistical test is going to need a fair amount of lead-in, but what I’m looking for is whether or not an ‘interaction term’ between power and walks has a significant impact on scoring. I’ll be using the last 15 years’ worth of team-level scoring data and a technique called ‘regression analysis’, run in statistical software, to estimate the impact.
For a long while Bill James’ Basic Runs Created has been the benchmark for predicting the runs a team should score based on abstract offensive production. It is derived by simply multiplying on-base percentage times slugging percentage times the number of plate appearances the team gets over the course of a season. It overestimates scoring a bit (the best fit is to multiply basic RC by about 80%), but if we run a linear regression on scoring it does predict a bit better than OPS (on-base plus slugging rather than on-base times slugging). However, the difference isn’t all that great. The impact of plate appearances is fairly small, something that might surprise you since teams that don’t make outs get more of them. You can get pretty good results by regressing scoring on slugging and on-base percentage independently, and those results don’t really improve at all if you multiply the two or regress the natural log of runs on them (which, in effect, multiplies them anyway). Including ‘plate appearances’ as a separate variable could contaminate the estimated impact of offensive production, since better-hitting teams make fewer outs and so pile up more plate appearances. However, over the course of a season some teams will play more extra-inning games than others and will bat in the bottom of the ninth more often than others, so something has to be included to correct for that. Instead of plate appearances, I used outs, since presumably outs won’t be correlated in any positive way with hitting.
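For readers who want to see the arithmetic, here is a minimal sketch of Basic Runs Created in Python. Written in counting stats, the commonly cited form is (H + BB) × TB / (AB + BB), which is a simplified on-base rate times total bases. The function name, the `scale` argument and the team totals below are illustrative assumptions, not figures from my data set; the 0.8 default just reflects the rough "about 80%" adjustment mentioned above.

```python
def basic_runs_created(hits, walks, total_bases, at_bats, scale=1.0):
    """Basic Runs Created: (H + BB) * TB / (AB + BB), optionally rescaled.

    `scale` is a hypothetical convenience argument; pass 0.8 to apply the
    rough downward adjustment described in the text.
    """
    opportunities = at_bats + walks
    if opportunities <= 0:
        raise ValueError("at_bats + walks must be positive")
    return scale * (hits + walks) * total_bases / opportunities


# Made-up season totals, for illustration only.
raw_rc = basic_runs_created(hits=1400, walks=500, total_bases=2200, at_bats=5500)
scaled_rc = basic_runs_created(1400, 500, 2200, 5500, scale=0.8)
print(round(raw_rc, 1), round(scaled_rc, 1))
```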
Since both SLG and OBP include ‘hits’, I wanted to separate ‘hits’ from ‘walks’ and extra bases. The impact of slugging isn’t greatly changed if I divide total bases by plate appearances instead of at bats, and the impact of OBP isn’t greatly changed if I use all PA in the denominator instead of most of them, as is usual, so for simplicity’s sake that is what I elected to do. I also figured that treating all extra bases as equal was probably not correct and that getting hit by a pitch was effectively the same as a walk. So the final variables I decided to include were: hits per PA, walks+HBP per PA, extra bases on doubles and triples per PA, extra bases on home runs per PA, double plays per PA, steals per PA and times caught stealing per PA, as well as outs and the ‘interaction term’, which is walks+HBP per PA times extra bases on HR per PA.
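To make the variable list concrete, here is a rough sketch of how those regressors could be built from a team’s season totals. The class, field and key names are hypothetical, and the sample totals are invented. I am also assuming that ‘extra bases’ means bases beyond first (one for a double, two for a triple, three for a home run).

```python
from dataclasses import dataclass


@dataclass
class TeamSeason:
    """Season totals for one team (field names are illustrative)."""
    pa: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hbp: int
    gidp: int
    steals: int
    caught_stealing: int
    outs: int


def regression_features(t: TeamSeason) -> dict:
    """Build the per-PA regressors plus the walks-by-HR interaction term."""
    if t.pa <= 0:
        raise ValueError("plate appearances must be positive")
    walks_hbp = (t.walks + t.hbp) / t.pa
    # Assumption: a home run is worth three bases beyond a single.
    xb_hr = 3 * t.home_runs / t.pa
    return {
        "hits_per_pa": t.hits / t.pa,
        "walks_hbp_per_pa": walks_hbp,
        # Assumption: one extra base per double, two per triple.
        "xb_2b3b_per_pa": (t.doubles + 2 * t.triples) / t.pa,
        "xb_hr_per_pa": xb_hr,
        "dp_per_pa": t.gidp / t.pa,
        "sb_per_pa": t.steals / t.pa,
        "cs_per_pa": t.caught_stealing / t.pa,
        "outs": t.outs,  # raw count, standing in for opportunities
        "walks_x_hr": walks_hbp * xb_hr,  # the interaction term
    }


# Invented totals, for illustration only.
example = TeamSeason(pa=6000, hits=1500, doubles=300, triples=30,
                     home_runs=150, walks=550, hbp=50, gidp=120,
                     steals=90, caught_stealing=30, outs=4300)
features = regression_features(example)
```

Each team-season would contribute one such row, with runs scored as the dependent variable.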
The results are, for the most part, what you would expect to see: steals are good, getting caught is bad and grounding into a double play is devastating. That extra base you get on a double matters even more than the extra base you get by stealing (since you’re more likely to have driven someone in) but it matters a heck of a lot less than an extra single. Walks (and getting hit) are also very good, though less good than getting hits – which, again, is logical since it’s harder to drive someone in with a walk. More outs lead to more runs because they imply more innings and more chances. The full results are on the next page, if anyone is interested.
As for the interaction term, the result is significant and positive. If you’re drawing a lot of walks, it matters a great deal if you are also hitting home runs. If I try another interaction term between hits and walks+HBP, that also gives a positive impact but doesn’t pass the test of statistical significance, so we can’t say it almost certainly matters rather than being simple noise. Ironically, the third potential interaction term – between hits and home runs – has a negative impact (which, I should stress, is also statistically insignificant, so the true impact may well be zero). Perhaps this is because hits show up as more important than walks without the interaction term in part because they mean more when runners are on base. Home runs clear the bases, making future hits less important than they would otherwise be.
In a narrow sense, my test is successful: walks do seem to be less meaningful without home runs. However, the scale of the impact is really not all that great. If a team with an average walk rate but very little power were to increase its walk rate to one very near the best of the past 15 years, it would get around 15 runs less benefit out of those walks than a team with an average HR rate. That’s not a huge number, and it doesn’t do much to suggest that chasing walks is an invalid strategy for a team that can’t afford pop – the net gain from that big increase in walks would still be on the order of 100 runs over a season.
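The comparison above falls straight out of how an interaction term works. If predicted runs include b_walk × W + b_int × W × HR (with W and HR as per-PA rates), then raising the walk rate by some amount changes predicted runs by that amount times (b_walk + b_int × HR). Here is a small sketch of that arithmetic. The coefficients and rates are placeholders chosen only to show the mechanics; they are not my estimates and will not reproduce the numbers quoted above.

```python
def walk_gain(delta_walk_rate, hr_xb_rate, b_walk, b_interaction):
    """Change in predicted runs from raising the walk rate by delta_walk_rate.

    With an interaction term, the payoff per unit of walk rate is
    b_walk + b_interaction * hr_xb_rate, so it grows with power.
    """
    return delta_walk_rate * (b_walk + b_interaction * hr_xb_rate)


# Placeholder coefficients and rates, NOT the fitted values from this study.
B_WALK = 1000.0
B_INTERACTION = 5000.0
DELTA = 0.02            # hypothetical increase in (walks + HBP) per PA

gain_low_power = walk_gain(DELTA, 0.04, B_WALK, B_INTERACTION)
gain_avg_power = walk_gain(DELTA, 0.08, B_WALK, B_INTERACTION)
shortfall = gain_avg_power - gain_low_power
print(gain_low_power, gain_avg_power, shortfall)
```

The shortfall depends only on the interaction coefficient, the size of the walk increase and the gap in home run rates, which is why it can be small even when the overall gain from walking more is large.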
My secondary result [one particularly relevant to Tigers fans] might shed a little more light on why Beane’s teams have been struggling of late – despite continuing to walk. I know that this is a bit of a shocker, but bear with me: hits matter. Yes, that’s right, batting average (that silly old-fashioned tool) is actually very relevant to a team’s success. Walks are good, extra bases are good, but more hits is still better. Beane doesn’t go after guys with high batting averages, they’re all ‘overvalued’ – so the A’s usually wind up near the bottom of the AL in that department. No big surprise then that they finish near the bottom in runs. Raising your team OBP is great, but it’s about 50% better if you do it with singles than with walks. Raising your SLG is good too, but if you do it with singles you get triple the bang compared to getting extra bases off doubles or home runs. Back in 2011, Oakland struck out less than Detroit and yet had a team batting average 25 points lower – because, of course, Beane doesn’t believe in BABIP. Maybe the Tigers aren’t building an offense the wrong way…