Tuesday, November 25, 2008

Why Ranking Violations are a Flawed Metric

I hate to criticize anything without being prepared to offer an alternative. So I'll let you know up front: I will offer an alternative (at the end of this post).

But first, for those who might have no idea what "ranking violations" are, here is a very brief tutorial...

Let's say John Doe has made his own football rankings. Is there an easy way to see if they make sense? A popular approach is to calculate the frequency of "ranking violations." A ranking violation occurs when the loser of a played game is ranked higher than the winner. Now why, you might ask, would any rankings ever do that? The answer is that once you're about halfway into the season, there's no way around it. There is simply no way to rank the teams such that winners are always ranked above losers. Eventually some 2-5 team beats a 5-2 team, the results turn circular (A beats B, B beats C, C beats A), and making the ranking violations go away becomes impossible. If you would like to see some ranking violation stats in action, check out Ken Massey's College Football Ranking Comparison page (scroll to the bottom of http://www.mratings.com/cf/compare.htm).
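For the programmers in the audience, here's a quick sketch of how you might count them. This is just an illustration with made-up team names, not the actual Atomic Football code:

    def ranking_violations(ranks, games):
        # ranks: team -> rank (1 = best); games: list of (winner, loser) pairs.
        # A violation is any game whose loser is ranked above (better than) its winner.
        return sum(1 for winner, loser in games if ranks[loser] < ranks[winner])

    # One upset is enough: the lower-ranked team beats the higher-ranked team.
    ranks = {"State": 1, "Tech": 2}
    games = [("Tech", "State")]
    print(ranking_violations(ranks, games))   # prints 1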

OK, enough on what ranking violations are. If you're still unclear, google it. Next...

Now, if we can't make ranking violations go away, then it would seem to make sense to rank the teams so as to keep them to a minimum, right? That way, we don't have to listen to folks invoke the "head-to-head" argument. I think I preached on that in another post, so I won't go down that road here. The short answer to "should we minimize ranking violations?" is... "No."

So, I've made the beginnings of an argument in support of minimizing ranking violations and now I'm suggesting it's a bad idea. Why? The reason is that it's almost, but not quite, the best metric. The problem is a little complicated, so bear with me.

Let's take a sample problem. It's not terribly realistic, but it's been designed to make a point. We have three teams in a conference -- A, B, and C. Teams A and B play each other ten times during the regular season, and A wins every time. I know this wouldn't happen in the real world; I'm only making the point that A is clearly better than B. If you have a problem with this, then the alternative is that A and B play a common set of opponents, team A wins all of their games, and B loses all of theirs. Better? OK, now introduce team C. C plays two games, beating team A and losing to team B.

Now it's time to rank the teams. Obviously, we rank A ahead of B. But what about C? We can minimize the ranking violations by ranking C above A (first in the conference) or below B (last in the conference). Strange: our minimum-ranking-violations approach has clearly shown us that team C is probably either the best or the worst team in the conference, but probably not in the middle. If this makes sense to you, then quit now -- there's no hope. Otherwise, read on...
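If you'd rather let the computer do the arguing, here's a little sketch that checks every possible ordering of A, B, and C. Again, this is only an illustration, not anything from the Atomic Football code:

    from itertools import permutations

    # (winner, loser) pairs: A beats B ten times, C beats A, B beats C.
    games = [("A", "B")] * 10 + [("C", "A"), ("B", "C")]

    def violations(order, games):
        rank = {team: i for i, team in enumerate(order)}   # position in the ordering
        return sum(1 for w, l in games if rank[l] < rank[w])

    for order in permutations("ABC"):
        print("".join(order), violations(order, games))

    # The minimum (1 violation) occurs only for ABC and CAB -- that is, only when
    # C is ranked last or first.  Putting C in the middle (ACB) costs 2 violations.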

OK, it would seem reasonable (both subjectively and from a "maximum likelihood" viewpoint -- we won't dive into the math on that here) that team C probably belongs between A and B. But how can we express that mathematically? The solution I propose is an alternative to ranking violations that I've dubbed "record violations" (I have also referred to it as "schedule violations"). It goes like this...

Team C's record is 1-1. If we rank C between A and B, one of their opponents is ranked higher (what I'll call the "higher") and one is ranked lower (the "lower"). Thus, their lower/higher is 1-1. Because their W/L (win-loss) matches their L/H (lower/higher), they have zero record violations.* You can check out our L/H numbers on our Atomic Football ranking page.
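For the more code-minded, here's a rough sketch of the record-violation count, using my reading of the definitions above: a win over a higher-ranked opponent and a loss to a lower-ranked opponent cancel one another, one-for-one.

    def record_violations(team_rank, results):
        # results: list of (opponent_rank, won) pairs for one team's games.
        win_viol  = sum(1 for opp, won in results if won and opp < team_rank)       # beat a higher-ranked team
        loss_viol = sum(1 for opp, won in results if not won and opp > team_rank)   # lost to a lower-ranked team
        return abs(win_viol - loss_viol)   # let one kind of violation cancel the other

    # Team C ranked #2, between A (#1) and B (#3): beat A, lost to B.
    print(record_violations(2, [(1, True), (3, False)]))   # prints 0

Note that this same placement of C costs two ranking violations; under record violations they simply cancel.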

I first proposed this metric to Ken Massey in late 2006, and I'm hopeful he will find the time to add it to his comparison page. Here is the text from my original message:

-------------

Ken,

I wanted to suggest a variant on the ranking violation metric.

Consider a team that has beaten #13, #15, #17, and #19 and lost to #1, #3, #5, and #6**. In addition, the team has beaten #9 and lost to #11. Being 5-5 against teams of average rank #10, 1-4 against teams ranked #1-#9, and 4-1 against teams ranked #11-#20, it would seem reasonable to rank this team #10.

However, doing so yields two ranking violations. One of the violations could be alleviated by moving the team up to #8 or down to #12. This is obviously a counterintuitive situation (and one I discussed in my recent paper). Now consider an alternative metric.

If we retain the #10 ranking, then this hypothetical team is 5-5 against 5 teams that are ranked higher and 5 teams that are ranked lower.

Thus, if Wins-Losses is the same as Lower-Higher (lower being the number of teams*** ranked lower and higher being the number ranked higher), then we would say that we have zero "Record Violations" (if you have an alternative name, please let me know). In other words, with this metric we will allow a ranking violation corresponding to a win against a higher-ranked team to cancel a ranking violation corresponding to a loss against a lower-ranked team. Thus, for this team we find:

Rank   Ranking Violations   Record Violations
 #2             4                    4
 #4             3                    3
 #6             2                    2
 #8             1                    1
#10             2                    0
#12             1                    1
#14             2                    2
#16             3                    3
#18             4                    4

As you can see, ranking violations have two local optima, whereas record violations do not.

To put things on the same percentage scale as our traditional ranking violation [sic], we will continue to normalize by the number of games since the maximum number of record violations for a given team is equal to the number of games played by that team.

Obviously, record violations will always number [sic] equal to or less than ranking violations since we begin with the rankings [sic] violations but allow some to cancel out others. The purpose of this metric is to prevent the obviously nonsensical situation mentioned above in my opening example. For this reason, I think it is a slightly superior metric. I would certainly love to see the results of it on your comparison page by year's end. If you do choose to employ this metric, I would also appreciate a reference. Lastly, I did not get a reply from you on my previous message. I know this is a busy time for you, so I understand...

Thanks for all your hard work in this most important field of endeavor (I say this tongue in cheek, of course).

Jim

----------

*For those who might run with the math, yes, if you consider the record violations for all three teams, you get a minimum of two violations for any of these orderings -- ABC, ACB, or CAB. The point is, record violations, unlike ranking violations, don't force you to one of the extremes.
**This was supposed to say #7.
***Opponents.
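For anyone who wants to check the table in that message, here's a short sketch that recomputes both metrics for the hypothetical team (wins over #9, #13, #15, #17, and #19; losses to #1, #3, #5, #7, and #11). As before, this uses my reading of the definitions rather than any official code:

    # (opponent_rank, won) pairs for the hypothetical team in the message above.
    results = [(9, True), (13, True), (15, True), (17, True), (19, True),
               (1, False), (3, False), (5, False), (7, False), (11, False)]

    print("Rank  RankingViolations  RecordViolations")
    for team_rank in range(2, 19, 2):
        win_viol  = sum(1 for opp, won in results if won and opp < team_rank)
        loss_viol = sum(1 for opp, won in results if not won and opp > team_rank)
        ranking_viol = win_viol + loss_viol        # traditional count
        record_viol  = abs(win_viol - loss_viol)   # wins vs. higher cancel losses vs. lower
        print(f"#{team_rank:<4} {ranking_viol:^18} {record_viol:^16}")

    # Ranking violations have local minima of 1 at #8 and #12 but rise to 2 at #10;
    # record violations fall smoothly to 0 at #10.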

Saturday, November 15, 2008

The "Best" Team

How often do we hear fans complaining because the "best" team(s) didn't get to play in the conference championship, or the "best" team wasn't ranked number one, or the "best" team didn't make a BCS bowl? Hmmm. What does it mean to be the "best"? The problem is, if you don't think about it too much, it seems pretty easy. It's obvious. The "best" is the "best," right? How hard can it be?

If you're content to have "best" be simple and obvious, skip the rest of this. Otherwise, read on...

Is the "best" the team that on average did better than any other over the entire season? Is an opening loss as bad as losing the last game of the season? What if your team has Heisman contenders at QB and RB and they both get injured in the waning seconds as your team wins the final game of the regular season? Better yet, what if they went undefeated against the toughest schedule in the country? Are they still the "best" team -- right now, that is? Have they earned the right to play for the national championship anyway, even if their star players will be watching from the sidelines? And what about consistency? Team A plays a very tough schedule and beats every opponent by less than a touchdown. Team B plays the same schedule, whips every opponent by four touchdowns except one who beats them by a field goal in overtime. Which one is best, A or B? If scoring matters, then can you make up for a loss one week by running up the score next week? If it doesn't matter, then why do we invoke it so often when trying to prove our case about who is better? Why do we appeal to it as a "tiebreaker" when W/L and SoS aren't enough? Lots of questions. How about some answers...

The bottom line, in my opinion, is that there should be a standard. Otherwise, you have something like this...

You're taking a class at school. Your teacher informs you that in the upcoming test, problem #1 will count 90% of your grade. On test day, you skip #1 and work all the other problems. When the graded test comes back, you have a 10% grade -- you aced all of the problems you worked. Now you complain -- "but I got ALL BUT ONE of the problems right." "Doesn't matter," the teacher says, "the standard is what it is." So would it be better to have no standard? You have no idea what the teacher wants. He might only give credit for spelling your name right, or maybe you'll get points for turning in a blank test so that he can reuse it next year. You really have no idea what it is you're supposed to do.

When there's a standard, things are at least fair, and no one really has a right to complain. To strive to achieve in areas the standard does not emphasize is simply to fail.

Let's look at it another way... In each football game, we have a standard -- the team with the most points wins. There are no points for yards, takeaways, completed passes, fewest penalties, etc. The standard is clear -- most points wins. To do anything else and then complain about it is ridiculous.

Before I take the next step, let me state this clearly -- the BCS has been a huge step in the right direction. "Yeah, but wouldn't a playoff be better?" you ask. Well, the BCS IS A PLAYOFF. Think about it. Playoffs are when you select (by whatever standard) some number of the "best" teams and let them "play-off" until only one remains. Before the BCS, that number was ZERO. With the BCS, it is now TWO. That's a step in the right direction, right? Would four be better? I think so. Eight? Maybe. The top ten where six get a bye? I'd consider it.

Now back to the standard. We're talking about COLLEGE football, right? Colleges. Places where there are supposed to be a lot of smart people, right? Couldn't all of those smart people figure out some absolute standard they could all agree to? One that's full and open. Granted, it wouldn't be quite as simple as what we have in an individual game (most points wins), but if we had a "formula," if you will, that everyone agreed to, then there would be no questions about what needed to be done. The college computer science departments could run what-if scenarios and know ahead of time who needed to beat whom to achieve a certain rank, or make the playoffs, or make the championship game. I could go on, but I'll resist the temptation and stop here.

Tuesday, November 11, 2008

Overtime Alternative

Before I start... a warning.  Your first reaction to this suggestion will probably not be a good one.  Let me suggest you chew on it a little before rejecting it outright.  Here goes...

If the clock runs out in regulation with the score tied, turn the clock off, continue playing (i.e., no coin toss and kickoff), and play to sudden death (first to score wins).  There, I told you that you wouldn't like it.  Now let's kick it around a bit.

Situation A:  One minute left.  Team A has the ball, is trailing by 3 points, and chooses to play for a tie.  If they tie the game (in regulation), their opponent will NOT have to work against the clock -- they can take as long as they like to try to make the go-ahead score.  Once the clock runs out, first to score wins.

Situation B:  One minute left.  Team A has the ball, is trailing by 3 points, and chooses to play for the win.  If they take the lead (in regulation), their opponent must tie or go ahead before the clock runs out.

What are the advantages?  First of all, by removing the coin toss to start overtime, you remove a random and arguably unfair element of the game.  In regulation, each team got to receive once -- fair enough.  If winning the overtime coin toss gives such an advantage, why not just use the coin toss alone to decide the winner?  Second, you have new elements of strategy to consider (see above).  Third, in a tie, the team with the ball at the end of regulation gets to keep it -- why not reward them for having possession at that point?

Feedback is welcome, but be sure to mull it over a bit first.