Iowa football is on hiatus for a few weeks before finally heading into hibernation for the spring and summer. And in the meantime, the Hawks are playing some roundball. And because I’m too cheap to buy data from other people and too stubborn to use their metrics, I’m going to, once again, try my hand at seeing what a sentient machine can tell us about sports.
This time around, things are going to be a little bit different. Basketball has a LOT of data to pore over and everybody seems to have their own favorite metric du jour. Like with the football model, I wanted to keep this one simple enough to explain to my dad, who knows nothing about statistics but likes to sound like he knows about basketball. At the same time, I wanted it to be comprehensive enough that I wasn’t missing anything glaring and obvious. So unlike the football version of H.A.W.K.E.Y.E.S., there are a few more pieces of data to input and I’ve made some serious upgrades under the hood to make it more realistic. But I’m nowhere near perfect, and this is going to be an iterative process as I try to “teach” the analytical model as the season goes on. So please, feel free to send me a message or comment below on any suggestions or concerns you have for me going forward.
The Basic Premise
Basketball has a lot more possessions per game than football. So instead of looking just at the score, I wanted to get at the pace of play, so that for each team Iowa plays, the computer would have some idea of how many possessions to expect. Without getting too technical, I used a method known as ordinary least squares (OLS) regression to predict the number of possessions in any given game. The big advantage of this is that while things like steals, blocks, rebounds, and turnovers might not directly impact the score, they do directly impact the number of possessions.
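For anyone who wants to see the idea in code, here is a minimal sketch of an OLS fit in plain Python. It is not the actual H.A.W.K.E.Y.E.S. code: the function names (`fit_ols`, `predict_possessions`), the choice of inputs, and every number in the sample data are made up purely for illustration.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("stats are collinear; drop one or add more games")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_ols(rows, targets):
    """Return OLS coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + [float(v) for v in r] for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_possessions(coef, stats):
    """Apply fitted coefficients to one game's worth of stats."""
    return coef[0] + sum(c * s for c, s in zip(coef[1:], stats))


# Invented per-game numbers (NOT real Iowa data):
# columns are steals, blocks, rebounds, turnovers.
games = [
    [6, 3, 34, 12],
    [9, 5, 38, 15],
    [4, 2, 31, 10],
    [8, 4, 40, 14],
    [7, 6, 36, 11],
    [5, 3, 33, 13],
]
possessions = [68, 75, 64, 74, 71, 67]  # also invented

coef = fit_ols(games, possessions)
print(predict_possessions(coef, [7, 4, 35, 12]))
```

With only six games and four inputs the fit is nearly memorizing the data, so treat this as a picture of the mechanics rather than a usable forecast. A real fit would want many more games than inputs.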
With that prediction in hand, I fed the predicted number of possessions into a second model that looks at offensive and defensive efficiency to predict each team’s score, while allowing each team’s defense to affect the other team’s offensive efficiency. For example, if Iowa forces a lot of steals, then the computer will see that and ding the opponent’s offensive efficiency to compensate.
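To make the second step concrete, here is a rough sketch of one common way to turn possessions and efficiency into a score. This is an assumption on my part for illustration, not the exact formula inside the model: it scales a team’s points per possession by how the opposing defense compares to a league-average defense. The names `expected_points` and `predict_score`, the `"off"`/`"def"` keys, and the numbers in the usage lines are all hypothetical.

```python
def expected_points(possessions, offense_ppp, opp_defense_ppp, league_ppp=1.0):
    """Points = possessions * offensive efficiency, adjusted for the opposing defense.

    offense_ppp:     points the offense scores per possession
    opp_defense_ppp: points the opposing defense allows per possession
    league_ppp:      league-average points per possession (assumed baseline)
    """
    # A defense that allows less than the league average drags the offense down.
    adjusted_ppp = offense_ppp * (opp_defense_ppp / league_ppp)
    return possessions * adjusted_ppp


def predict_score(possessions, team_a, team_b, league_ppp=1.0):
    """Return rounded predicted scores (team_a, team_b) for one game."""
    a = expected_points(possessions, team_a["off"], team_b["def"], league_ppp)
    b = expected_points(possessions, team_b["off"], team_a["def"], league_ppp)
    return round(a), round(b)


# Hypothetical efficiency numbers, purely for illustration.
good_offense = {"off": 1.1, "def": 1.0}
good_defense = {"off": 1.0, "def": 0.9}
print(predict_score(70, good_offense, good_defense))
```

The multiplicative adjustment is just one design choice. An additive adjustment, or a regression with defensive stats as extra inputs, would express the same "defense dings offense" idea.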
So... Does it Work?
After Iowa had a few games under their belt, I was able to test this out to see how well it predicted scores. I tested it against the Notre Dame game, which Notre Dame won 92-78.
The computer predicted a score of 87-72 in favor of Notre Dame. So, this checks off two boxes outright and gets partial credit on a third. The first is the eye test: the computer didn’t spit out a score line like 357-1995, or something else ridiculously implausible. The second is that it correctly predicted the winner. The third, accuracy, is a bit more subjective, as it predicted both scores a bit conservatively. I happen to think that a 5-6 point miss on each score this early in the season, with only a handful of games played, is more than acceptable. However, I’d like to get it a bit closer. So let’s call it 2.5 out of 3 criteria fulfilled.
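The arithmetic behind that grade is simple enough to script. Here is a small sketch (the helper name `prediction_report` is mine, made up for this example) that compares the actual 92-78 final with the predicted 87-72.

```python
def sign(margin):
    """Return 1 for a win, -1 for a loss, 0 for a tie."""
    return (margin > 0) - (margin < 0)


def prediction_report(actual, predicted):
    """Compare (winner_score, loser_score) style tuples, listed in the same team order."""
    errors = tuple(a - p for a, p in zip(actual, predicted))
    actual_margin = actual[0] - actual[1]
    predicted_margin = predicted[0] - predicted[1]
    return {
        "errors": errors,  # actual minus predicted, per team
        "margin_error": predicted_margin - actual_margin,
        # A predicted tie never counts as picking the winner.
        "winner_correct": sign(predicted_margin) != 0
        and sign(predicted_margin) == sign(actual_margin),
    }


# Notre Dame listed first, Iowa second.
report = prediction_report(actual=(92, 78), predicted=(87, 72))
print(report)
# {'errors': (5, 6), 'margin_error': 1, 'winner_correct': True}
```

One thing this makes obvious: both scores came in 5 to 6 points low, but the predicted margin (15) was only a point off the actual margin (14).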
What about Nebraska-Omaha?
Iowa’s next game is Saturday against the University of Nebraska-Omaha. The Mavericks are ranked 182nd in RPI, which sounds like easy pickings for Iowa. Unfortunately, Iowa is currently only 180th in the RPI rankings, so according to one of the more respected basketball ranking indices, this game is actually pretty much a toss-up. Knowing that, I am much less surprised at H.A.W.K.E.Y.E.S.’ outlook on the game. The model predicted a straight-up tie, 82-82. If you really dig into the numbers, it looks like the computer gives the slight edge to the Hawkeyes (by an estimated three quarters of a point). In other words, expect a really close game that will ultimately be decided in the last minute or two, possibly in overtime. I hesitate to say that we’re predicting an Iowa win here, but we are slightly favored in this model.
Five Factors to Pay Attention To
The bright side is that Iowa looks to be much more dominant offensively than the Mavericks, besting them in assists, offensive boards, and three-point production per game, which are big keys for Iowa wins. Nebraska-Omaha is a bit stingier on defense than Iowa, though not by much. Considering how close the game is likely to be, I’d say these factors push the advantage to Iowa.