Pythagorean Expectation and the Myth of Luck, or How I Learned to Stop Worrying and Accept Northwestern

As mentioned in yesterday's INP, our blog buddies at The Only Colors went to great lengths to prove that 2010 Michigan State was in no way like 2009 Iowa, in an attempt to argue that 2011 Michigan State isn't doomed to recreate the 2010 Iowa campaign.  TOC provided plenty of evidence to prove this point in the limited context of these two teams, and we have no real arguments with their analysis.  However, given that this variance -- "luck" -- is a favorite topic of ours, we decided to dig further.  We took nine years' worth of standings and points for/against in both conference and overall play -- the entirety of Big Ten data available from -- and applied the old Bill James pythagorean expectations formula adjusted for football (explained here)1.  And we learned that Sparty 2010 wasn't lucky like 2009 Iowa.

No, they were luckier.

The 2010 Michigan State Spartans were the most fortunate team in the last nine years of Big Ten football, with an expected win total of 8.53 against an actual win total of 11, for a pythagorean margin of +2.47.  Sparty was nearly a half-win luckier than the second-luckiest team, the 2003 Ohio State Buckeyes (11 wins against 8.99 expected victories for a +2.01), and almost a full win luckier than 2009 Iowa.  The top five in overall play (including non-conference), with a couple of familiar faces:

Rank  Team                  Py +/-
1     2010 Michigan State   +2.47
2     2003 Ohio State       +2.01
3     2002 Ohio State       +1.80
4     2004 Iowa             +1.80
5     2009 Iowa             +1.58
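
For readers who want to check the arithmetic, the margins in the table above can be sketched in a few lines of Python. This assumes the standard form of the pythagorean formula, expected win percentage = PF^x / (PF^x + PA^x), with the 2.37 exponent described in the first footnote. The function names and the 300/200 scoring line are invented for illustration and are not any team's real totals.

```python
def pythagorean_expected_wins(points_for, points_against, games, exponent=2.37):
    """Expected wins from season scoring totals (assumed standard form)."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return games * pf / (pf + pa)


def pythagorean_margin(actual_wins, expected_wins):
    """Positive means a team won more games than its scoring suggests."""
    return actual_wins - expected_wins


# Hypothetical team: 300 points scored, 200 allowed, 12 games, 10 wins.
expected = pythagorean_expected_wins(300, 200, 12)
print(f"expected wins: {expected:.2f}")
print(f"margin: {pythagorean_margin(10, expected):+.2f}")

# Figures quoted in the post: 11 actual wins against 8.53 expected.
print(f"2010 Michigan State: {pythagorean_margin(11, 8.53):+.2f}")
```

The "Py +/-" column is simply actual wins minus the expected total, so a positive number means a team won more than its scoring says it should have.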


After '10 MSU and '03 OSU comes the 2002 Ohio State national championship squad, already widely considered one of the decade's luckiest teams based on anecdotal evidence but harmed greatly by being a 14-0 squad under the microscope of pythagorean expectation.  2002 OSU falls victim to one of the flaws of pythagorean analysis of football: Nobody's perfect.  James' formula was made for baseball, and while the best baseball team in the majors would be somewhat fortunate to win three of four games from the worst, football works with single games in an overall smaller data set.  Because OSU didn't shut out every opponent, it's expected to lose some games over time, hence the "luck" of its undefeated season.  The top five is rounded out with the completely absurd 2004 Iowa 10-win team, a squad with no running backs by mid-October that somehow rang up double-digit wins with a total margin of victory of +81, and the aforementioned 2009 Hawkeyes, who won 11 but with a larger total margin.
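
The "nobody's perfect" problem falls out of the formula itself. Here is a rough sketch, again assuming the standard pythagorean form with a 2.37 exponent and using invented point totals. The expected win total only reaches the number of games played when a team allows zero points all season.

```python
def expected_wins(points_for, points_against, games, exponent=2.37):
    # Assumed standard pythagorean form; not necessarily the exact method used here.
    pf = points_for ** exponent
    pa = points_against ** exponent
    return games * (pf / (pf + pa))


GAMES = 14  # a 14-0 season, as with 2002 Ohio State

# Invented scoring lines for illustration only: (points for, points against).
for points_for, points_against in [(400, 180), (400, 90), (400, 0)]:
    wins = expected_wins(points_for, points_against, GAMES)
    print(f"{points_for}-{points_against}: expected {wins:.2f}, "
          f"'luck' for a 14-0 team {GAMES - wins:+.2f}")
```

Any undefeated team that gives up a single point will therefore grade out as "lucky" to some degree. By the post's own numbers, 2002 Ohio State's +1.80 at 14-0 implies roughly 12.2 expected wins.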

Of course, using the full schedule allows for statistical variance based on strength of non-conference scheduling.  If we look solely at Big Ten play, as close to a level playing field as we can get, Sparty still wins.  It's just not 2010 Sparty:

Rank  Team                  Py +/-
1     2008 Michigan State   +2.16
2     2004 Northwestern     +1.77
3     2010 Michigan State   +1.69
4     2004 Michigan         +1.63
5     2009 Northwestern     +1.53


That 2008 Spartan squad went 9-4 (6-2) despite a total margin of victory of +28 and an in-conference margin of -7.  In fact, 2008 Michigan State was one of just five teams since 2002 to post a winning record in the Big Ten despite being outscored in conference play.  And, in the least surprising finding of the study, three of those five teams were from Northwestern.  The first of those, 2004 jNWU, lost at TCU.  They lost to Arizona State.  They lost by 26 at Minnesota.  They lost at Wisconsin and Michigan.  They even lost the season-ending contest at Hawaii.  But 2004 jNWU also won five conference games by less than a touchdown, so despite being outscored by a whopping 30 points in conference play (translating to a pythagorean expected record of 5-7 (3-5)), the LOLcats managed to finish 6-6 (5-3).  The 2009 Northwestern squad is a mirror image of the prior team: Outscored by 21 due to three double-digit losses and five single-digit wins.  2004 Michigan didn't grossly outperform expectations overall, with only a +1.04 pythagorean margin in all games, but managed to go 7-1 in the Big Ten despite a 16-point loss to Ohio State that brought its aggregate margin of victory below 100.

After the jump, we'll look at who has been most fortunate in the aggregate, the unluckiest teams of the last nine years, and why it might not be a question of luck after all.

As for the less fortunate, the list features some of the most disappointing seasons in memory, including the Dark Ages at Penn State and the mid-00's lull at Iowa.  In fact, the Hawkeyes have not only two of the luckiest seasons but also two of the six unluckiest (and four of the least fortunate sixteen):

Rank  Team                  Py +/-
99    2007 Minnesota        -2.74
98    2004 Purdue           -2.73
97    2004 Penn State       -2.46
96    2008 Iowa             -2.46
95    2003 Penn State       -2.36
94    2010 Iowa             -2.12
86    2005 Iowa             -1.68
84    2006 Iowa             -1.58


The conference-only numbers improve the 2005 campaign, but also place three Iowa teams in the bottom 10:

Rank  Team                  Py +/-
99    2003 Penn State       -2.02
98    2007 Minnesota        -1.85
97    2005 Michigan State   -1.80
96    2009 Indiana          -1.79
95    2002 Wisconsin        -1.64
93    2006 Iowa             -1.54
91    2008 Iowa             -1.46
90    2010 Iowa             -1.44


The 2007 Minnesota and 2003 Penn State teams so prominently displayed on both lists were both epically terrible teams, with a combined four wins and one conference win between them.  In many ways, they suffer from the same problem as 2002 Ohio State: Because they weren't completely inept and occasionally scored some points, their pythagorean expected win totals are going to register as something.  But because there is a limited sample, they didn't get enough games to register more than a few wins, and so we get them skewing low.  The more interesting entry into the bottom five is 2004 Purdue, a 7-5 (4-4) squad that climbed to as high as #5 in the polls before suffering through a mid-season skid in which they lost four key conference games by a total of 10 points.  That was the best 7-5 team I've ever seen, including 2002 Purdue, which followed the same template.  2002 Wisconsin is another interesting case study: Despite going 6-0 in non-conference (they started playing AUGUST 23 to fit them all in), the Badgers went a meager 2-6 in the conference, with four single-digit losses.  As for Iowa's three or four less-than-stellar seasons, I suppose they can be chalked up to Ferentz's ever-more-staid determination to take the air out of the football and grind the game into a coin flip; the hallmark of teams appearing at the extremes of this list is a pile of close wins and losses.

The most interesting finding of the study: Typically, when a team has "good luck" or "bad luck," it's relatively consistent.  Take Illinois, for instance.  When I asked readers on Twitter who they thought the luckiest Big Ten team of the last nine years would be, the overwhelming response was 2007 Illinois, Ron Zook's Rose Bowl-losing band of ruffians.  However, that team stumbled into the Rose Bowl by virtue of Ohio State's backwards fall into the BCS Championship Game; they weren't a conference champion, and in fact finished just 9-3 (6-2) and were only +0.46 in conference play, good for 26th overall but hardly the stuff of legend.  Where readers might have gone wrong is in assuming that Illinois' usual results are reflective of baseline variance.  In actuality, the Illini have been the Big Ten's most "unlucky" team in aggregate since 2002:

Rank  Team             Overall +/-   Conf +/-
1     Northwestern     9.74          8.28
2     Wisconsin        3.18          0.34
3     Indiana          2.00          -2.16
4     Ohio State       1.90          1.21
5     Michigan         -0.55         2.15
6     Michigan State   -3.38         -0.45
7     Iowa             -4.37         -1.21
8     Minnesota        -4.65         -2.46
9     Purdue           -7.14         -3.30
10    Penn State       -8.70         -4.91
11    Illinois         -9.31         -5.75


Over the past nine years, Illinois is dropping more than a game per season in variance.  They have had only two seasons with a positive overall variance: the aforementioned 2007 Rose Bowl run and, hilariously, their 2-10 2005 season, where they were expected to win just 1.31 games.  They haven't had one horrible season to skew the total, either, just a string of consistently lousy results; while the Illini don't have any of the seven worst overall seasons for bad luck, they have four of the next five, all with values of -1.83 or less.  Penn State's total, also near a game per season, isn't much better, though their totals are skewed somewhat by the "Dark Ages" seasons in 2003 and 2004; with those years removed, PSU enters Iowa territory.
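
The per-season figures quoted here come from dividing the aggregate margins by the nine seasons in the study. A quick sketch using the overall totals from the table above (the variable names are just for illustration):

```python
SEASONS = 9  # 2002 through 2010

# Aggregate overall pythagorean margins, copied from the table above.
overall_margin = {
    "Northwestern": 9.74,
    "Wisconsin": 3.18,
    "Indiana": 2.00,
    "Ohio State": 1.90,
    "Michigan": -0.55,
    "Michigan State": -3.38,
    "Iowa": -4.37,
    "Minnesota": -4.65,
    "Purdue": -7.14,
    "Penn State": -8.70,
    "Illinois": -9.31,
}

# Average wins above or below expectation per season.
per_season = {team: total / SEASONS for team, total in overall_margin.items()}

for team, avg in sorted(per_season.items(), key=lambda item: item[1], reverse=True):
    print(f"{team:<15}{avg:+.2f} wins per season")
```

Illinois comes out a shade worse than one win per season below expectation, with Penn State just short of that mark.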

The fact that most teams have such consistent "luck," when coupled with the fact that close wins and losses appear to be the strongest factor in where a team appears on the list, means this list may not be a measure of "luck," per se, but rather the simple ability to win close games.  Since such ability is presumably based in large part on things like on-field experience, efficient playcalling, and clock management, the list could be considered a measure of a coach's in-game ability.  Is it any wonder that the conference's biggest late-game buffoon and a geriatric who doesn't even wear a headset sit at the bottom of the list?  Purdue's low number, due in such large part to flameouts during the Joe Tiller era (and, to a lesser extent, the futility of post-Tiller 2008), is as much an indictment of the retired polo shirt-wearer as that run of 2004 losses.  Minnesota's late-game futility has been chronicled here, repeatedly, and didn't get any better under Brewster (though Brew usually had Minny three or four scores behind his opponent before staging a comeback to cover the spread).  Michigan State's successes under Mark "The Warden" Dantonio have nearly nullified the pythagorean hole his predecessor, the legendary John L. Smith, had dug; MSU was -5.90/-4.68 from 2002-2007, a Zook-like performance of -8.86/-7.02 when extrapolated out over nine years. 

It's also a credit to Pat Fitzgerald and the late Randy Walker at Northwestern.  Even in its worst years, jNWU has outperformed its pythagorean expectations.  In every year included in this study, Northwestern had a positive overall pythagorean margin, and in all but one the LOLcats had a positive margin in conference play.  Their gigantic lead in the chart above isn't due to one season; it's consistently winning a game to a game and a half more than expected.  Northwestern's nine seasons are all within the top 30 overall (eight are within the top 20), and all but 2002's 1-7 Big Ten mark are within the top 22 in conference play.  Once may be a random occurrence, and twice a coincidence, but if three is a trend, nine is gospel.  Northwestern wins close games.  Ditto Bret Bielema and Barry Alvarez at Wisconsin, who don't have an overall season above 1.0 but have only one year below -0.2 (Bielema has registered a positive rating in every season).  And, while Jim Tressel didn't play too many single-possession games at Ohio State, he had a fantastic record in the ones he did.2

This is an Iowa blog, and so we ask: What does this mean for Kirk Ferentz?  Iowa's 4.4 games below expectations overall, and 1.2 games under expectation in the conference.  This is despite the fact that Iowa is 7-5 in non-conference games decided by one touchdown or less since 2002.  On its face, that nearly half-game a season missed looks suspiciously like a coaching problem.  However, the wide spread between the overall deficiency and the intra-conference gap points to a smaller, and more fundamentally Ferentz, explanation: Iowa's ability to blow out a MACrifice or I-AA cupcake raises pythagorean expectations, expectations which are not met when Iowa drops a game to Iowa State or Arizona or Pitt.  When it comes to conference play, the blowouts are so rare and so related to the continued employment of Tim Brewster that expectations remain in line with performance.  This is not to exonerate Ferentz as a coach; he's lost far, far too many non-conference games, he's one of the least effective clock managers in big-time college football, and his ability to stay close to good teams has the flipside of allowing inferior opposition to keep games tight against Iowa.  That philosophy is why Iowa fans can brag about not losing a game by double-digits in more than three seasons, but it's also why the Hawkeyes have such wide swings in pythagorean performance.
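
One way to see that spread is to subtract each team's conference margin from its overall margin, using the aggregate table from earlier in the post. Treat this as a rough proxy for non-conference performance only. The overall figure is presumably computed from pooled scoring rather than by adding conference and non-conference pieces together, so the difference is not a true non-conference margin. The names below are illustrative.

```python
# (overall margin, conference margin) from the aggregate table in this post.
margins = {
    "Northwestern": (9.74, 8.28),
    "Wisconsin": (3.18, 0.34),
    "Indiana": (2.00, -2.16),
    "Ohio State": (1.90, 1.21),
    "Michigan": (-0.55, 2.15),
    "Michigan State": (-3.38, -0.45),
    "Iowa": (-4.37, -1.21),
    "Minnesota": (-4.65, -2.46),
    "Purdue": (-7.14, -3.30),
    "Penn State": (-8.70, -4.91),
    "Illinois": (-9.31, -5.75),
}

# Overall minus conference: how much of the margin sits outside Big Ten play.
gap = {team: round(overall - conf, 2) for team, (overall, conf) in margins.items()}

for team, value in sorted(gap.items(), key=lambda item: item[1]):
    print(f"{team:<15}{value:+.2f}")
```

Iowa's gap works out to about three games over nine years. Indiana's is the largest in the positive direction, which lines up with the second footnote.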


1 -- There is a wide variety of research on the proper exponent value for college football, mostly ranging between 2 and 2.5.  We used the 2.37 exponent that Pro Football Reference suggested. Because of small sample sizes, football is inherently more difficult to pin down than baseball (which uses the pure pythagorean squared model) and basketball (which uses exponents in the teens).

2 -- Indiana's totals are interesting, but a closer look shows they simply play more close games against out-of-conference opposition than the rest of the conference; close wins over Western Kentucky count just as much as those against Michigan to a blind formula.