30 May 2008

Are bullpens underused?

This post was mostly written during Spring Training, so 2007 figures are used throughout. Life prevented posting until now.

Back in the "good old days" of baseball, bullpens were nearly non-existent. Two-man rotations were common, 600-inning seasons were possible, and one man pitched every inning of an entire major league season (Wondering who? See below). Since then, we've created four-man rotations, dedicated bullpens, closers, five-man rotations, long relievers, setup men, and, coming soon to a ballpark near you, the seventh-inning specialist. And throughout all this, grumpy old men--along with grumpy young men, grumpy old women, grumpy young women, and grumpy transsexuals of all ages--have decried the changes as a weakening of the quality of starting pitching.

But is it possible that everyone is wrong? Could the problems with pitching be traced to an underuse of those 6 to 8 "guns" not considered durable enough to start a game? Let's consider some baseball axioms before we look further. I like math, so I'll use some weird symbols, but I'll try to explain them.

Axiom 1: ∂(ERA)/∂(IP) > 0

That is just to say that we expect ERA to rise for every additional inning pitched. I'll admit that this axiom might not hold for a former AA-pitcher getting his first few innings in at the major league level, or for someone coming back just off the disabled list. But for the most part, it's hard to disagree with, whether we're dealing with innings pitched within a game or innings pitched over a season.

Axiom 2: ERA(closer) < ERA(ace) ; ERA(setup man) < ERA(#2 starter) ; etc...

The ERA of your relief squad tends to be lower than that of your starting pitchers. Or, to put it another way, if you only had to put your best pitcher in for one inning of work, he would almost certainly come from the bullpen. Compare the best reliever to the best starter on almost every club and you'll see that the best reliever comes out ahead. Usually even the best two or three relievers have better ERAs than the ace--this holds true for great teams and miserable ones. In 2007, the Detroit Tigers had three relievers log more than 40 innings with a lower ERA than their ace, Justin Verlander, and two more with a lower ERA than their second-best pitcher to log at least 15 starts. The Padres (the team I know best): even a triple-crown-winning Cy Young pitcher (Peavy) had an ERA bested by a setup man, Heath Bell (with 93 2/3 IP, not a small sample), and four more relief pitchers (Hoffman, Brocail, K. Cameron, and Justin Hampson) bested Chris Young, their second-best pitcher. We see the same pattern even for lousy teams: four Royals relievers outpitched Gil Meche's 3.67 ERA.

Compare the ERAs of bullpens as of mid-2007 to this analysis of starters at the end of the season. The average bullpen is about as good as the average no. 2 starter! In other words, it does not really matter if by the sixth inning your starter is still "feeling good." Unless he's your ace, or throwing a shutout or no-hitter, or your bullpen is absolutely drained from a recent 18-inning game, it's time to call in some new arms. Do so and your expected chance of winning goes up.

All else being equal, every inning that you have someone on the mound with a higher ERA than someone else you could put out there is an inning of poor managing. All else being equal, bullpens should be used more until their ERAs rise to meet those of the starters.

Now here's the argument I'm ready to hear: all else is not equal. Not every inning is as important as every other. That's definitely true! The best relievers pitch in the most important situations: with the game close, and a win on the line if only the pitcher can keep the other side from scoring. So it makes sense that you want some of your lowest-ERA men pitching then. But when do starters begin pitching? They begin with the game tied, where any run allowed or prevented makes a huge difference in the probability of winning or losing the game -- nearly as important a situation as the ones setup men and closers pitch in. And there are several members of the bullpen who tend to pitch in less important innings; that is, blowouts in either direction. So if anything, the ERAs of bullpens should be substantially higher than those of starters. That they are not shows that bullpens are being over-rested and underused.

Trivia answer: Jim Devlin of the 1877 Louisville Grays pitched every inning of a 61-game season, compiling 559 innings pitched and allowing a total of 4 HRs. His ERA was a very good 2.25 (146 ERA+), but it also says something about how official scoring has changed over the years: he allowed only 140 earned runs out of 288 total, which means 148 unearned runs, or 8 more unearned runs than earned.
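For anyone who wants to check the arithmetic, here is a small sketch using the standard ERA formula (nine times earned runs, divided by innings pitched). The function and variable names are just for illustration; the numbers are the ones quoted above.

```python
def era(earned_runs, innings_pitched):
    # Standard definition: earned runs allowed per nine innings
    return 9 * earned_runs / innings_pitched

# Devlin's 1877 figures as quoted above
earned, total_runs, innings = 140, 288, 559
unearned = total_runs - earned

print(round(era(earned, innings), 2))  # 2.25
print(unearned, unearned - earned)     # 148 8
```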

(Before I'm accused of copying this trivia information from Wikipedia, be sure to check who added it there in the first place).

Context and baseball statistics

From this week's MLB Power Rankings on ESPN.com:
In May, Hideki Matsui has more multi-hit games (nine) than he has games in which he hasn't gotten a hit (five)

We're supposed to take this to mean that Hideki Matsui is doing great this month. But what does it actually mean? How many hitters would we expect to have more multi-hit games than no-hit games? 1%? 10%? Stats like this out of context drive me crazy. It's pretty easy to figure out, at least for simple cases.

The probability of not getting a hit in any at bat is (1 - Batting Average), so the probability of going 4 at bats without a hit is (1-BA) to the 4th power. Here's a little program (in Python) for figuring this out:

def noHit(BA):
    # chance of four straight hitless at bats
    return (1-BA)**4

>>> noHit(.250): 0.32
>>> noHit(.300): 0.24
>>> noHit(.400): 0.13

So we can see that even for an average player, a no-hit game happens in only about 1 in 3 games, and for a good batter, or a good batter on a real roll, it happens rather seldom. What about multi-hit games? If you have four at bats per game, there are 11 different ways of getting two or more hits (1 way of getting four hits, 4 of getting three, and 6 of getting two). The chart below shows one example sequence for each hit total, where x = hit and o = no hit:

xxxx = four hits

oxxx = three hits

xxoo = two hits

xooo = one hit

oooo = no hits
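As a quick check on those counts, the sketch below lists all 16 hit/no-hit sequences for four at bats and tallies them by hit total. The function names are made up for illustration, and it assumes (as the rest of this post does) that every at bat is an independent trial with the same chance of a hit.

```python
from itertools import product

def outcome_counts(at_bats=4):
    # counts[k] = number of x/o sequences with exactly k hits
    counts = [0] * (at_bats + 1)
    for seq in product("xo", repeat=at_bats):
        counts[seq.count("x")] += 1
    return counts

def multi_hit_by_enumeration(ba, at_bats=4):
    # Add up the probability of every sequence with two or more hits
    total = 0.0
    for seq in product("xo", repeat=at_bats):
        hits = seq.count("x")
        if hits >= 2:
            total += ba**hits * (1 - ba)**(at_bats - hits)
    return total

print(outcome_counts())                           # [1, 4, 6, 4, 1]
print(sum(outcome_counts()[2:]))                  # 11
print(round(multi_hit_by_enumeration(0.300), 2))  # 0.35
```

Brute force like this gets slow for long sequences, but for four at bats it is an easy way to confirm that nothing was miscounted.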

(For those interested, the number of ways of getting 4, 3, 2, 1, 0 hits, that is, 1, 4, 6, 4, 1, is the fourth row of Pascal's triangle, counting the top row as row zero). So one way of calculating the probability of a multi-hit game is to find the probability of a single-hit game (4*BA*(1-BA)^3), add to it the probability of a no-hit game, and subtract the sum from 1:

def multiHit(BA):
    # 1 minus the chance of zero hits or exactly one hit in 4 at bats
    return 1-(noHit(BA) + 4*(BA*(1-BA)**3))

>>> multiHit(.250): 0.26
>>> multiHit(.300): 0.35
>>> multiHit(.400): 0.52

So as long as you're getting 4 ABs per game, you don't really need to be a great hitter to expect to get more multi-hit games than no-hit games. In fact, a BA of just .267 will do it for you. Returning to the original post, we see that Matsui is doing better than 1:1 in May, getting a 9:5 Multi:No ratio. How good do you have to be to get that? A .320 BA will suffice. Matsui's BA has actually been slightly better than that in May, .337, but he's also been getting just under 4 ABs a game (3.83).

4 ABs a game (or even 3.8) is really only manageable if you're not having many plate appearances that don't count as at bats -- in other words, if you're not walking much. In April, Matsui was averaging only 3.1 ABs per game. This makes it much harder to have so many multi-hit games: if you have 3 ABs per game, you need to be batting about .348 to get more multi-hit games than no-hit games, and a whopping .411 to match Matsui's ratio.
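The break-even averages quoted for 4 and 3 at bats can be reproduced with a simple bisection search. This is only a sketch under the same assumptions as before (a whole number of at bats per game, each one an independent trial); the function names and the search bounds are my own choices for illustration.

```python
def no_hit(ba, at_bats=4):
    # Chance of going hitless in every at bat
    return (1 - ba) ** at_bats

def multi_hit(ba, at_bats=4):
    # 1 minus the chance of zero hits or exactly one hit
    one_hit = at_bats * ba * (1 - ba) ** (at_bats - 1)
    return 1 - no_hit(ba, at_bats) - one_hit

def ba_for_ratio(target, at_bats=4, tol=1e-6):
    # Lowest BA at which multi-hit : no-hit reaches the target ratio.
    # Bisection works because the ratio only rises as BA rises.
    lo, hi = 0.0, 0.999
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if multi_hit(mid, at_bats) / no_hit(mid, at_bats) < target:
            lo = mid
        else:
            hi = mid
    return hi

for at_bats in (4, 3):
    even = ba_for_ratio(1.0, at_bats)       # multi-hit games = no-hit games
    matsui = ba_for_ratio(9 / 5, at_bats)   # Matsui's 9:5 ratio in May
    print(at_bats, round(even, 3), round(matsui, 3))
```

A fractional figure like 3.83 ABs per game does not fit this model directly; handling it would mean mixing 3-AB and 4-AB games, which is left out here.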

Looking closely at the numbers, there's much less to cheer about for fans of Godzilla: the rise in multi-hit games came almost entirely from a drop in walks (from 12 to 7), resulting in an OBP 40 points lower in May than in April. He did not make up the difference in SLG either, dropping 50 points there. So upon closer inspection, the numbers tell an entirely different story: in everything but BA, Matsui had a May that was worse than his April and below his career levels.