Dave's Ratings and Rankings

First off, I need to acknowledge Dr. Peter Wolfe, who publishes all the schedules, scores, and team affiliations that I use.

The WinLos Algorithm:

Ratings are based on the win-loss record and the average rating of opponents played.
Ratings are not based on conference affiliation, quality wins, point spread, location of games played, performance in past years, or human bias.
Ratings reflect the overall performance of a team over the season, not necessarily the potential of a team to beat another team.
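
The exact WinLos formula is not spelled out here, so the following is only a sketch of one way "record plus average rating of opponents" could be computed. The function name winlos_ratings, the record term (wins minus losses, divided by games played), and the damped repeated-update loop are all assumptions made for illustration, not the method actually used for these ratings.

```python
from collections import defaultdict

def winlos_ratings(games, iterations=200, damping=0.5):
    """Hypothetical sketch: rating = record term + average opponent rating.

    games is a list of (winner, loser) tuples. The formula and the
    damped iteration are assumptions, not the published algorithm.
    """
    record = defaultdict(lambda: [0, 0])   # team -> [wins, losses]
    opponents = defaultdict(list)          # team -> opponents faced
    for winner, loser in games:
        record[winner][0] += 1
        record[loser][1] += 1
        opponents[winner].append(loser)
        opponents[loser].append(winner)

    ratings = {team: 0.0 for team in record}
    for _ in range(iterations):
        updated = {}
        for team, (wins, losses) in record.items():
            played = wins + losses
            record_term = (wins - losses) / played
            schedule_term = sum(ratings[o] for o in opponents[team]) / played
            target = record_term + schedule_term
            # Damping keeps the update from oscillating on small schedules.
            updated[team] = (1 - damping) * ratings[team] + damping * target
        ratings = updated
    return ratings

games = [("A", "B"), ("B", "C"), ("A", "C")]
ratings = winlos_ratings(games)
for team in sorted(ratings, key=ratings.get, reverse=True):
    print(team, round(ratings[team], 3))
```

Because each team's rating depends on its opponents' ratings, the values have to be recomputed over and over until they settle, which is one plausible reason such a simple rule still takes a lot of computation.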

The Spread Algorithm:

Ratings are based on the sum of the point spread in games played and the average rating of opponents played.
Wins and losses are not factored in. (Five one-point losses balance one five-point win.)
Spread ratings are usually a better predictor of future results than WinLos ratings: a team rated 5 points higher than its opponent is predicted to win by 5 points.
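
As with WinLos, the precise Spread formula is not given here. The sketch below assumes the spread term is a team's summed point margin divided by its games played, so that rating differences come out in points. The names spread_ratings and predicted_margin, the game tuple layout, and the damped iteration are hypothetical choices for illustration only.

```python
from collections import defaultdict

def spread_ratings(games, iterations=200, damping=0.5):
    """Hypothetical sketch: rating = average margin + average opponent rating.

    games is a list of (team_a, score_a, team_b, score_b) tuples.
    Wins and losses are never looked at, only point margins.
    """
    margin_total = defaultdict(float)      # team -> summed point margin
    opponents = defaultdict(list)          # team -> opponents faced
    for team_a, score_a, team_b, score_b in games:
        margin_total[team_a] += score_a - score_b
        margin_total[team_b] += score_b - score_a
        opponents[team_a].append(team_b)
        opponents[team_b].append(team_a)

    ratings = {team: 0.0 for team in opponents}
    for _ in range(iterations):
        updated = {}
        for team, opps in opponents.items():
            played = len(opps)
            target = (margin_total[team] / played
                      + sum(ratings[o] for o in opps) / played)
            updated[team] = (1 - damping) * ratings[team] + damping * target
        ratings = updated
    return ratings

def predicted_margin(ratings, team, opponent):
    """Predicted margin is simply the rating difference, in points."""
    return ratings[team] - ratings[opponent]

games = [("A", 21, "B", 14), ("B", 20, "C", 10), ("A", 30, "C", 13)]
ratings = spread_ratings(games)
print(round(predicted_margin(ratings, "A", "B"), 2))
```

In this three-game example the margins happen to be perfectly consistent (7, 10 and 17 points), so the sketch settles on ratings of 8, 1 and -9 and reproduces each margin exactly. Real schedules are not that tidy.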

Back in 2005, I read about the controversy regarding the mysterious 'computers' being used in the BCS rating system. Many did not like the black boxes spewing out rankings that might differ from the human polls.

Part of me agreed: how can a computer figure out the best teams in the country? But another part of me, the part that got me a minor in computer science, asked, 'How can a computer figure out the best teams in the country?' Or more precisely, 'If I were to design an algorithm to rate college football teams, how would I go about it?'

I couldn't just go by record. Strength of schedule must count for something, or the undefeated Division III team (or high school, or pee-wee team for that matter) would be considered better than an 'FBS' team whose only loss came in a high-profile bowl at the end of the year. But I for sure wouldn't be adding bonus points for 'quality wins' or 'strength of conference', since playing in a tough conference or playing against a quality team would already be factored into the strength of schedule.

Should away games be weighted more than home games? What about point spread: should beating a team by a lot count more than a squeaker? I had to define exactly what I wanted to measure. Am I trying to identify the 'Best' team in the country? How is 'Best' defined? Am I trying to predict future outcomes, or rate past performance? In the former case I might dock the rating of a team that lost its star QB to injury, but not in the latter. Eventually I decided to measure overall performance, as measured by the objective of football: win the game. Not run up the score, not win at home, just win. So I decided to base my ratings on two things: record and the average rating of the opponents. Simple as that. Not tradition, not Vegas, not conference, not spread.

The resulting algorithm was ridiculously simple. Computationally intensive, but that is insignificant with modern computing. It was so simple I was (am?) sure that I was not the first to think of it. But so far I haven't found another computer rating which gives exactly the same results.

Later I was curious whether cumulative point spread could be substituted for win-loss record. So I tried it and it simply worked, and the Spread Algorithm was born. I have found that Spread is quicker to learn, is less volatile, and often is a better predictor of the winner of future games. But the Spread ratings show more upsets in past games; in other words, teams with lower ratings than their opponents win more games under Spread ratings than under WinLos ratings. As of this writing (at the end of the 2011 regular season), almost 17% of games were upsets by Spread standards, versus less than 14% for WinLos.
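
The upset percentage described above can be sketched as a simple count: the share of already-played games in which the winner holds the lower rating. The function name upset_rate and the sample ratings are hypothetical, and treating equal ratings as a non-upset is an assumption.

```python
def upset_rate(games, ratings):
    """Fraction of games won by the lower-rated team.

    games is a list of (winner, loser) tuples and ratings maps each
    team to a number. Equal ratings are not counted as upsets here.
    """
    if not games:
        return 0.0
    upsets = sum(1 for winner, loser in games
                 if ratings[winner] < ratings[loser])
    return upsets / len(games)

# Made-up sample data for illustration only.
sample_ratings = {"A": 1.0, "B": 0.0, "C": -1.0}
sample_games = [("A", "B"), ("C", "A"), ("B", "C"), ("A", "C")]
print(upset_rate(sample_games, sample_ratings))
```

Running the same games through two different sets of ratings gives a side-by-side comparison like the Spread and WinLos percentages quoted above.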

Thank you and have a nice day.
