06 August 2015

NL vs. AL leadoff hitters


Often while marveling at Rickey Henderson’s amazing stats, I wondered how much greater a leadoff hitter he would have been if he had spent his whole career in the National League. He had 11,180 plate appearances in the AL but only 2,166 in the NL. In both leagues, the leadoff hitter leads off the first inning, but is not guaranteed to bat leadoff in any following inning. However, I figured that in the National League, where he bats right after the pitcher, the person batting first in the order would get to lead off an inning substantially more often. The pitcher almost always makes an out, so I figured it’d be pretty common for him to make the third out (and because of situations where the eighth batter is walked to get to the pitcher, probably more common than one in three). The ninth batter in the AL isn’t that strong either, but he is a lot stronger than almost any NL pitcher.

I’ve been working off and on over the past two years (more off before getting tenure, more on after getting tenure) on an extremely flexible Python toolkit for examining baseball games, and it finally reached the state of development where I could test my hypothesis. I’m not ready to release the toolkit yet (it needs to be polished enough that I’m proud of it), but here’s the code I used:

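  # games is a module from the (not yet released) toolkit
  gc = games.GameCollection()
  gc.yearStart = 2000
  gc.yearEnd = 2014
  gc.usesDH = True  # DH games; presumably False for the no-DH run
  allGames = gc.parse()

  totalPAs = 0        # plate appearances by the #1 hitter
  totalLeadOffs = 0   # those that were the first PA of a half inning
  for g in allGames:
      for halfInning in g.halfInnings:
          for p in halfInning.plateAppearances:
              if p.battingOrder == 1:
                  totalPAs += 1
                  if p.plateAppearanceInInning == 1:
                      totalLeadOffs += 1

  # count of #1-hitter PAs, count that led off, and the percentage
  print(totalPAs, totalLeadOffs, totalLeadOffs*100/totalPAs)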


It gets a collection of games where the DH is either used or not used (depending on usesDH), looks at each game, then at each half inning, then at each plate appearance. If the batter is #1 in the order, it counts the plate appearance and checks whether it was the first of the inning; at the end it prints the percentage of all batter #1 plate appearances that were leadoffs. The results were surprising to me.
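Since the toolkit isn’t released, here is a minimal sketch for trying the same counting logic on made-up data. The three classes are hypothetical stand-ins, not the toolkit’s real objects; only the attribute names (halfInnings, plateAppearances, battingOrder, plateAppearanceInInning) come from the code above, and the function name leadoff_share and the toy game are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical stand-ins for the toolkit's objects. Only the attribute
# names are taken from the post; everything else is an assumption.
@dataclass
class PlateAppearance:
    battingOrder: int             # spot in the lineup, 1 through 9
    plateAppearanceInInning: int  # 1 means first batter of the half inning


@dataclass
class HalfInning:
    plateAppearances: List[PlateAppearance] = field(default_factory=list)


@dataclass
class Game:
    halfInnings: List[HalfInning] = field(default_factory=list)


def leadoff_share(games):
    """Return (PAs by the #1 hitter, how many led off an inning, percent)."""
    total_pas = 0
    total_leadoffs = 0
    for g in games:
        for half_inning in g.halfInnings:
            for p in half_inning.plateAppearances:
                if p.battingOrder == 1:
                    total_pas += 1
                    if p.plateAppearanceInInning == 1:
                        total_leadoffs += 1
    # Guard against an empty collection instead of dividing by zero.
    pct = total_leadoffs * 100 / total_pas if total_pas else 0.0
    return total_pas, total_leadoffs, pct


# Toy data: the #1 hitter leads off one half inning, then comes up
# fourth in a later one.
toy_game = Game(halfInnings=[
    HalfInning([PlateAppearance(1, 1), PlateAppearance(2, 2),
                PlateAppearance(3, 3)]),
    HalfInning([PlateAppearance(7, 1), PlateAppearance(8, 2),
                PlateAppearance(9, 3), PlateAppearance(1, 4)]),
])

print(leadoff_share([toy_game]))  # (2, 1, 50.0)
```

Wrapping the loop in a function makes it easy to run once per league and compare the two percentages side by side.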

          PAs      Leadoff  %
No DH     183,033  75,364   41.175
With DH   163,178  63,451   38.885

The average difference in the percentage of leadoff plate appearances between the two leagues (accounting for interleague games) is only about 2.5 percentage points. This works out to a difference of about 15 PAs a year for Rickey in his prime. So one hypothesis down, but many more to be investigated soon.
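For anyone who wants to check the arithmetic, here is a rough sketch. The two percentages are copied from the table; the 650 plate appearances per season is my own round-number assumption for a full-time leadoff hitter, not a figure from the data, and extra_leadoffs is just an illustrative helper name.

```python
# Leadoff percentages copied from the table above.
no_dh_pct = 41.175
with_dh_pct = 38.885

# Raw gap between the two rows, in percentage points.
raw_gap = no_dh_pct - with_dh_pct


def extra_leadoffs(season_pas, gap_points):
    """Extra leadoff PAs per season implied by a gap in percentage points."""
    return season_pas * gap_points / 100


SEASON_PAS = 650  # assumed season total, not taken from the data

print(round(raw_gap, 2))                        # 2.29
print(round(extra_leadoffs(SEASON_PAS, raw_gap)))  # 15
print(round(extra_leadoffs(SEASON_PAS, 2.5)))      # 16
```

Either way, the raw gap from the table or the adjusted figure of about 2.5 points lands in the neighborhood of 15 plate appearances a season.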
