The world is rife with rankings and orderings. They show up in tennis, where tournaments like the French Open end with a final ranking of players. They show up in pandemics, when public health officials record new infections and use contact tracing to sketch networks of COVID-19 spread. Systems of competition, conflict, and contagion can all give rise to hierarchies.
However, these hierarchies are observed after the fact. That makes it difficult to know the true rankings of the system: Who was actually the best player? Who infected whom? “You can’t go back in time and learn exactly how this thing happened,” says SFI Postdoctoral Fellow George Cantwell. One could build a model of the network and compare all possible outcomes, but such a brute-force approach quickly becomes intractable. For a group of just 60 participants, the number of possible orderings, 60 factorial, already exceeds the estimated number of particles in the known universe.
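A back-of-the-envelope check makes that comparison concrete. The sketch below is a minimal illustration, using the common rough estimate of 10^80 particles in the observable universe:

```python
# Why brute force fails: the number of ways to order n items grows as n! (factorial).
import math

orderings = math.factorial(60)   # possible rankings of 60 participants
particles = 10**80               # rough estimate of particles in the observable universe

print(f"60! = {orderings:.3e}")  # ~8.3e+81
print(orderings > particles)     # True: more rankings than particles
```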
For a recent paper published in Physical Review E, Cantwell collaborated with SFI Professor Cris Moore, a computer scientist and mathematician, to describe a new way to evaluate rankings. Their goal wasn’t to find one true hierarchy, but to calculate the spread of all possible hierarchies, with each one weighted by its probability.
“We were willing to not be exactly right, but we wanted to get good answers with some sense about how good they are,” Cantwell says. The new algorithm is inspired by physics: Ranks are modeled as interacting entities that can move up or down. Seen through that lens, the ranking problem behaves like a physical system, which can then be analyzed with methods from spin glass theory.
Soon after the start of the COVID-19 pandemic, Cantwell and Moore began thinking about models of how disease spreads through a network. They quickly recognized the situation as an ordering problem that emerges over time, not unlike the spread of a meme on social media or the emergence of championship rankings in professional sports. “How do you order things when you have incomplete information?” asks Cantwell.
They started by imagining a function that could score a ranking on accuracy. For example: A good ranking would be one that agrees with the outcomes of matchups 98% of the time. A ranking that agrees with outcomes only 10% of the time would be lousy, worse than a coin flip made with no prior knowledge at all.
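As a rough illustration of such a scoring function, one could count the fraction of observed matchups that a candidate ranking gets right. This is a minimal sketch with hypothetical names, not the paper's actual probabilistic objective:

```python
def agreement(ranks, matches):
    """Fraction of matches won by the higher-ranked (lower-numbered) player.

    ranks: dict mapping player -> rank (1 is best).
    matches: list of (winner, loser) pairs.
    """
    correct = sum(1 for winner, loser in matches if ranks[winner] < ranks[loser])
    return correct / len(matches)

ranks = {"A": 1, "B": 2, "C": 3}
matches = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "B")]
print(agreement(ranks, matches))  # 0.75: this ranking explains 3 of the 4 results
```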
One problem with rankings is that they’re typically discrete, meaning they run through the whole numbers: 1, 2, 3, and so on. That spacing suggests the “distance” between the first- and second-ranked members is the same as that between the second and third. But that’s not the case, says Cantwell. The top players in a game, worldwide, tend to be close together in skill, so the gap between top-ranked players may be far smaller than their ranks suggest.
“You quite often see that lower-ranked players can beat higher-ranked players, and the only way the model can make sense and fit the data is by squishing all the ranks together,” says Cantwell.
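To see that squishing concretely, consider a logistic (Bradley–Terry-style) model of upsets, one standard choice rather than necessarily the exact form used in the paper. The chance that the weaker player wins depends on the gap in underlying skill, so frequent upsets can only be explained by small gaps:

```python
import math

def upset_prob(skill_gap):
    """Probability the weaker player wins, in a logistic (Bradley-Terry-style)
    model where outcomes depend only on the gap in skill."""
    return 1 / (1 + math.exp(skill_gap))

print(upset_prob(3.0))   # ~0.05: a wide gap makes upsets rare
print(upset_prob(0.5))   # ~0.38: common upsets force the fitted gaps,
                         # and hence the ranks, to squish together
```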
Cantwell and Moore described a system that evaluates rankings based on a continuous numbering system. A ranking could assign any real number — whole number, fraction, infinitely repeating decimal — to a player in the network. “Continuous numbers are easier to work with,” Cantwell says, and those continuous numbers can still be translated back to discrete rankings.
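Translating back is just a sort. A minimal sketch, with hypothetical continuous scores:

```python
# Continuous scores convert to a discrete ranking by sorting, best first.
scores = {"Ana": 1.92, "Bea": 1.87, "Cy": -0.40}   # hypothetical continuous ranks
ordering = sorted(scores, key=scores.get, reverse=True)
discrete = {player: place + 1 for place, player in enumerate(ordering)}
print(discrete)   # {'Ana': 1, 'Bea': 2, 'Cy': 3}
```

Note that the continuous scores retain information the discrete ranks discard: Ana and Bea are nearly tied, while Cy trails far behind.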
In addition, the new approach can be used both to predict the future, like the outcome of a tennis tournament, and to infer the past, such as how a disease spread. “These rankings could tell us the order of sports teams from best to worst. But they could also tell us the order in which people in a community became infected with a disease,” says Moore. “Even before his postdoc, George was working on this problem as a way to improve contact tracing in an epidemic. Just as we can predict which team will win a game, we can infer which of two people infected the other when they came in contact with each other.”
In future work, the researchers say they plan to investigate some of the deeper questions that have emerged. More than one ranking might agree with the data, for example, while disagreeing radically with the others. Or a ranking that appears wrong may simply carry high uncertainty rather than being inaccurate. Cantwell says he also wants to compare the model’s predictions to outcomes from real-world competitions. Ultimately, he says, the model might be used to improve predictions in a wide range of systems that lead to rankings, from infectious disease models to sports betting.
Cantwell says he’ll hold on to his money — for now. “I’m not quite ready to start betting on it,” he says.
Read the paper, "Belief propagation for permutations, rankings, and partial orders," in Physical Review E, doi:10.1103/PhysRevE.105.L052303