Can a computer predict the winner of the NCAA basketball tournament, called March Madness by fans and the NCAA corporate juggernaut alike?
If so, Science 2.0 will save you some time: third seed Florida will be your national champion.
The prediction is from Georgia Tech's Logistic Regression/Markov Chain (LRMC) college basketball ranking system, a computerized model that has correctly chosen the men's basketball national champ in three of the last five years.
The LRMC predicts that Florida, Louisville, Indiana and Gonzaga are most likely to advance to the Final Four in Atlanta, with Florida and Gonzaga playing for the title on Monday, April 8. It's the first time in the LRMC's 10-year history that a team other than a number one seed has been picked to win the title.
Joel Sokol, a professor in Georgia Tech's School of Industrial and Systems Engineering (ISyE) whose research specialties include sports analytics and applied operations research, oversees the annual project. During the season, the LRMC uses basic scoreboard data to create a weekly ranking of all 347 Division I NCAA teams. The mathematical formula looks at every game and factors in the margin of victory and where each game is played. When the field of 68 was announced last Sunday, Sokol's team released its bracket.
Last year, the team presented a paper showing that the LRMC has been the most accurate predictive ranking system over the last 10 years. The model outperformed more than 80 others, including the NCAA's Rating Percentage Index (RPI), the system most experts use to justify who should and shouldn't get into the tournament.
"Our system combines the aspects of performance and strength of schedule by rewarding game performance differently according to the quality of each opponent," said Sokol. "Compared to something like RPI, LRMC is able to predict which team is better by taking the margins of victories and losses into account."
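To make the idea concrete, here is a minimal sketch in the spirit of what Sokol describes, not the LRMC itself. It assumes a logistic curve that turns a home team's margin into the chance that the home team is the better side, then lets an imaginary voter wander between teams according to those chances; teams the voter visits most often rank highest. The coefficients `SLOPE` and `HOME_EDGE`, the function names and the sample games are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical coefficients for illustration only; not the LRMC's fitted values.
SLOPE = 0.1       # how quickly a bigger margin raises confidence
HOME_EDGE = 3.0   # points assumed to be worth playing at home


def p_home_better(home_margin):
    """Estimated chance the home team is the better team, given its margin."""
    return 1.0 / (1.0 + math.exp(-SLOPE * (home_margin - HOME_EDGE)))


def rank_teams(games, iterations=1000):
    """games: list of (home, away, home_margin). Returns {team: rating}."""
    teams = sorted({t for g in games for t in g[:2]})
    n_games = {t: 0 for t in teams}
    for home, away, _ in games:
        n_games[home] += 1
        n_games[away] += 1

    # transition[a][b]: probability that a voter at team a moves to team b.
    transition = {a: {b: 0.0 for b in teams} for a in teams}
    for home, away, margin in games:
        p = p_home_better(margin)
        transition[home][away] += (1 - p) / n_games[home]
        transition[home][home] += p / n_games[home]
        transition[away][home] += p / n_games[away]
        transition[away][away] += (1 - p) / n_games[away]

    # Power iteration toward the chain's stationary distribution.
    rating = {t: 1.0 / len(teams) for t in teams}
    for _ in range(iterations):
        rating = {
            b: sum(rating[a] * transition[a][b] for a in teams) for b in teams
        }
    return rating


# Made-up results: (home team, away team, home team's margin).
sample_games = [("A", "B", 12), ("B", "C", 5), ("C", "A", -8), ("B", "A", -3)]
ratings = rank_teams(sample_games)
for team in sorted(ratings, key=ratings.get, reverse=True):
    print(team, round(ratings[team], 3))
```

Because a win over a highly rated opponent sends the voter toward a team that is itself visited often, strength of schedule is built in rather than added as a separate term.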
The LRMC identifies which team is most likely to win each game. However, upsets sometimes get in the way – in fact, about 25 percent of all NCAA tournament games are upsets. If you're trying to find this year's Cinderella, Sokol says Bucknell, Davidson, Belmont and St. Mary's are the most likely "small schools" to make the Sweet Sixteen. Memphis, UCLA and Butler are the teams most in danger of being eliminated early (each is seeded sixth).
Aside from picking tournament winners, the LRMC has also been used through the years to dispel a few myths. For example, in the long run, no team has an unusually large home court advantage. Almost all home courts are worth about the same.
"The reason that you hear people say things like 'Duke is one of the toughest home courts - it's so hard to win there' isn't because of the court or the fans," said Sokol. "It's that Duke is usually such a good team. When you give them even a three- or four-point home court advantage on top of the skill advantage they usually have, it's hard to overcome."
Also debunked is the popular belief that "good teams know how to win close games." Sokol's team looked at home-and-home conference results through the years.
"If the cliché was true, teams that won close games at home would have a significantly higher winning percentage in the road rematch than teams that lost close games at home," he said. But close home winners won about 35 percent of their road rematches. Close home losers won about 33 percent.
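Whether a gap like 35 percent versus 33 percent counts as "significantly higher" depends on how many rematches were examined, which the article does not report. The sketch below shows one standard way to check, a two-proportion z-test. The function name is hypothetical and the sample size of 1,000 rematches per group is a placeholder assumption, not a figure from Sokol's study.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# ASSUMPTION: 1,000 rematches per group is a made-up sample size.
z, p_value = two_proportion_z(0.35, 1000, 0.33, 1000)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

Under that assumed sample size, the z-statistic falls below the usual 1.96 threshold, so a two-point gap would not be distinguishable from chance; a much larger sample could change that conclusion.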
When the NCAA was considering expanding the tournament to 96 teams, Sokol also used LRMC simulations to point out that the dramatic upsets fans love to see would decrease by a factor of five, potentially leading to a sharp decrease in fan interest.
Sokol is joined on the LRMC team by fellow ISyE Professors Paul Kvam and George Nemhauser, as well as Professor Mark Brown of City College, City University of New York.