If you were to wander the halls of a courthouse during a murder trial, could you predict the verdict from the conversations you would overhear? And what would be the smallest amount of information you would require to make that prediction?
Discovering patterns in information is more than a game of courtroom prescience; it is a serious matter with applications in warfare, stock markets, human health, and other complex systems.
Finding a reliable technique for detecting such patterns, however, is difficult. SFI Research Fellow Simon DeDeo, SFI Graduate Fellow Sara Klingenstein, and undergraduate researcher Robert Hawkins are drawing on information theory and a couple of remarkable data sets for help.
In one example, the researchers analyzed some 250 years of transcripts from the Old Bailey criminal court in England to look for patterns in trials that led to guilty verdicts.
Image caption: Year-to-year trial patterns from the Old Bailey criminal court in London, England, with trials clustered by semantic structure. Each of the four rows represents a cluster, ordered by dominance, with the most common form of trial at the top. From year to year, the trial clusters are semantically related to each other. The system's structure shifts slowly, on decade-long timescales. When two clusters are semantically related in neighboring years, a heavy black line connects the two. (Credit: Simon DeDeo, Sara Klingenstein, and Tim Hitchcock)
“Courthouses are fundamentally an information processing system trying to come to a verdict,” says DeDeo.
In their analyses, which used techniques from information theory to place strict bounds on the predictive ability of their outcomes, the researchers found evidence for several distinct trial patterns: in other words, more than one pathway to a guilty verdict.
“The system is not just processing information, but doing so in a structured fashion,” he says, “with separate and non-overlapping pathways through the decision process.”
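To make the idea of "how much information predicts a verdict" concrete, here is a minimal sketch of mutual information, a standard information-theoretic quantity. It is an illustration only, not the researchers' actual method: the function name `mutual_information` and the toy trial records are assumptions invented for this example.

```python
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Mutual information I(X;Y) in bits, estimated from (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (x, y) combination
    px = Counter(x for x, _ in pairs)      # marginal counts of x
    py = Counter(y for _, y in pairs)      # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi

# Hypothetical toy records: (signal heard in the hallway, verdict).
# In the first set the signal fully determines the verdict.
informative = [("mentions_weapon", "guilty"), ("no_mention", "not guilty")] * 50
# In the second set the signal and the verdict are independent.
uninformative = [
    ("mentions_weapon", "guilty"), ("mentions_weapon", "not guilty"),
    ("no_mention", "guilty"), ("no_mention", "not guilty"),
] * 25

bits_informative = mutual_information(informative)
bits_uninformative = mutual_information(uninformative)
print(bits_informative, bits_uninformative)
```

In this toy setting, a signal that fully determines a balanced two-way verdict carries one bit about it, while an independent signal carries none. Real transcripts would need far more careful estimation than these raw frequency counts.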
In a second case, they studied five years of WikiLeaks-published military reports about insurgent attacks in Afghanistan: data about locations, durations, combatants, and more.
“Insurgency is not just about violence,” says DeDeo, “but also about signaling and coordination with rivals.” An insurgent group's attack is a message to rival insurgent groups, to NATO forces, or to the civilian population.
“In the South, some provinces were predictive of what would happen later in others, but not vice versa,” DeDeo explains. This asymmetry suggests an underlying structure to a highly dispersed conflict.
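One simple way to picture this kind of one-way predictability is to compare time-lagged mutual information in both directions between two event series. The sketch below is an assumption-laden illustration, not the analysis in the paper: `lagged_mi`, the series names, and the synthetic data (where series B simply echoes series A one step later) are all hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Mutual information I(X;Y) in bits, estimated from (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def lagged_mi(source, target, lag=1):
    """Information that source at time t carries about target at time t + lag."""
    return mutual_information(list(zip(source[:-lag], target[lag:])))

# Synthetic data (hypothetical): 1 = attack reported that week, 0 = none.
# Series A repeats a short balanced pattern; series B copies A one week later.
province_a = [0, 0, 0, 1, 0, 1, 1, 1] * 25
province_b = [0] + province_a[:-1]

a_predicts_b = lagged_mi(province_a, province_b)  # A this week -> B next week
b_predicts_a = lagged_mi(province_b, province_a)  # B this week -> A next week
print(a_predicts_b, b_predicts_a)
```

Here the first number comes out close to one bit and the second close to zero, mirroring the "predictive, but not vice versa" pattern in a deliberately artificial setting. Lagged mutual information is used here only as a stand-in; the source does not say which measure the researchers applied.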
“In contrast to the ad hoc and fragile nature of many other methods of analysis, information theory provides an explicit and robust framework of assumptions that help you do the science,” DeDeo says.
Their paper is to appear in a special issue of the journal Entropy.
Read their paper on arXiv (February 5, 2013)