## Tuesday, April 7, 2009

### NCAA Championship: A Study in Statistics and Multi-Dimensional Parameter Space

As most of you have probably heard, UNC won the NCAA Basketball championship yesterday. To some people this came as no surprise, since they were favored to win, and considering how they played in the first two rounds it is even less of a surprise. So now here comes the critical question: did they win because they were good, or because they were statistically preselected by the NCAA championship committee to win?

Previously I have written on conspiracy theories ranging from a US invasion of Canada to the sudden drop in oil prices last year. So now I present my latest conspiracy theory: UNC was selected to win the NCAA championship at the beginning of March, if not sooner.

You are probably aware that computer scientists all over the US spend an inordinate amount of time and effort writing programs that analyze each team, pick the winner of each game, and thus predict who will ultimately win the championship. The reasons for this are simple. On the one hand, it presents an interesting statistical problem in multi-dimensional parameter space that can actually be solved. On the other hand, it is a good way to make money. If you can write a good program that predicts who will win, then you can bet on that team and make a lot of money. It's the same reason people study the statistics of gambling (in one sense this is studying the statistics of gambling). The people who bet on their predictions obviously don't want to lose money, so they make sure that their program works as well as it possibly can.

Now consider that it isn't just college professors betting on online brackets. Suppose instead that you are a corporation betting on advertising revenue (ESPN, CBS, NCAA, etc.). You are no longer just trying to predict the outcome of a random event. You are trying to maximize your returns, and you have some influence over these seemingly "random events". So let's say you get your hands on one of these predictive programs that seems to work fairly well, and the bracket for the tournament has not been decided yet. You have several teams to choose from to put into 16 slots, and you also get to decide which team goes into which slot.

So you input the data and you see that team A is good at offense, OK at defense and good at free throws. Team B is also good at offense, OK at defense and good at free throws. On those numbers the two teams would be evenly matched, but team A tends to foul more than team B, so if they played, team B would be more likely to win. So you look at this multi-dimensional parameter space, and while there are a number of variables, they can all be reduced to one variable: the score. This makes it immensely simple. Now you look at team C and see that they are good at offense, OK at defense and only OK at free throws. If team A plays team C, the fact that team A tends to foul no longer matters, because team C can't make free throws. Thus if you seed team A against team B, team B is all but guaranteed to win, but if you seed team A against team C it could go either way and come down to a buzzer shot.
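To make the "reduce everything to the score" idea concrete, here is a minimal sketch of what such a toy model might look like. Everything in it is my own assumption for illustration: the `Team` fields, the scoring formula and all of the numbers are made up, and none of it comes from a real predictive program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Team:
    """Hypothetical team description; every field is a made-up toy number."""
    name: str
    offense: float     # points a team would score from the field
    defense: float     # points a team takes away from its opponent
    free_throw: float  # fraction of free throws made, between 0 and 1
    fouls: float       # free-throw attempts a team hands to its opponent


def expected_score(team: Team, opponent: Team) -> float:
    """Collapse all of the parameters into one number: the expected score."""
    field_points = team.offense - opponent.defense
    # A team only profits from the opponent's fouls if it makes its free throws.
    free_points = opponent.fouls * team.free_throw
    return field_points + free_points


def margin(team: Team, opponent: Team) -> float:
    """Positive means `team` is favored, negative means `opponent` is."""
    return expected_score(team, opponent) - expected_score(opponent, team)


# Assumed values: A and B differ only in how much they foul,
# and C shoots free throws badly.
team_a = Team("A", offense=80, defense=10, free_throw=0.8, fouls=20)
team_b = Team("B", offense=80, defense=10, free_throw=0.8, fouls=10)
team_c = Team("C", offense=80, defense=10, free_throw=0.4, fouls=10)

print("A vs B margin:", margin(team_a, team_b))  # negative: B is favored
print("A vs C margin:", margin(team_a, team_c))  # about zero: a toss-up
```

With these invented numbers the extra fouls cost team A the game against team B, but against team C the same fouls are worth much less, so the margin shrinks to roughly nothing. A real program would use far more variables, but the last step is the same: compare two scores.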

As long as the people organizing the bracket have no personal interest in the matter (i.e. they don't or can't bet, and they don't hold personal feelings towards team B), they just might do it randomly. But now let us introduce another parameter. The organizers know that if team A and team B play, team B will win. They know that if team C and team D play, team D will win. That means team B will play team D in the next round. If you switch the original bracket, you may instead have team C playing either team D or team A in the final. Now here's the catch: if B and D play in the final, you can get more sponsors and you get more money. So if all you care about is revenue from sponsorship, it would be easy to "fix" the bracket in order to maximize your revenue.
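The bracket-fixing argument above can also be sketched in a few lines. This is a toy under stated assumptions: the win probabilities in `WIN_PROB` and the revenue figures are numbers I invented so that B usually beats A, D usually beats C, and a B versus D final is worth the most. With four teams there are only three ways to pair up the semifinals, so you can simply try them all.

```python
from itertools import product

# Hypothetical chance that the first team beats the second (invented numbers).
WIN_PROB = {
    ("A", "B"): 0.1, ("A", "C"): 0.5, ("A", "D"): 0.5,
    ("B", "C"): 0.1, ("B", "D"): 0.5, ("C", "D"): 0.1,
}

# Hypothetical sponsorship revenue per final, in arbitrary units.
FINAL_REVENUE = {frozenset(("B", "D")): 10.0}
DEFAULT_REVENUE = 4.0  # assumed value of any other final


def p_win(x, y):
    """Probability that x beats y, looked up in either order."""
    if (x, y) in WIN_PROB:
        return WIN_PROB[(x, y)]
    return 1.0 - WIN_PROB[(y, x)]


def other(pair, team):
    """Return the member of a two-team pair that is not `team`."""
    return pair[0] if team == pair[1] else pair[1]


def expected_revenue(semi1, semi2):
    """Average the final's revenue over every possible pair of semifinal winners."""
    total = 0.0
    for w1, w2 in product(semi1, semi2):
        prob = p_win(w1, other(semi1, w1)) * p_win(w2, other(semi2, w2))
        revenue = FINAL_REVENUE.get(frozenset((w1, w2)), DEFAULT_REVENUE)
        total += prob * revenue
    return total


# The three possible ways to split four teams into two semifinals.
PAIRINGS = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]

best = max(PAIRINGS, key=lambda pairing: expected_revenue(*pairing))

for pairing in PAIRINGS:
    print(pairing, round(expected_revenue(*pairing), 2))
print("most profitable bracket:", best)
```

With these made-up numbers, the bracket that puts A against B and C against D comes out well ahead, because it is the only arrangement where the lucrative B versus D final is likely. Notice that nobody has to touch a single game to get that result. The organizer only has to pick the pairing.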

As a real example of maximizing revenue, say you have a football (soccer) championship. Would you want the championship game to be played between Newell's Old Boys and Chaco Forever, or between Real Madrid and Manchester United? (Note: Chaco Forever doesn't even have a Wikipedia page... oh wait! I found it, it was spelled wrong.) So how much money could be made either way? If you could affect the outcome and make a little money, would you do it? Especially if it appears for all intents and purposes to be legitimate and random (but really isn't)?

So here is my conspiracy theory: whenever it became obvious that they could organize the bracket so that UNC would win, and that they would maximize their advertising revenue that way, they set the bracket and UNC was guaranteed a win. Looking at the point spread for the entire tournament, it has the appearance of being rigged. Just a thought.

Now for your viewing pleasure, here is the celebration that took place on Franklin Street after the game. This is the corner of Columbia and Franklin Streets, which is about a block from the Physics Department. To give you some reference, we also have to drive down Columbia Street every time we go to church.

Timelapse: Franklin Street after the victory from The Daily Tar Heel on Vimeo.

And another video:

Franklin Street: The Celebration from The Daily Tar Heel on Vimeo.