We use physics-inspired methods to find structure within large data sets.
We are in an age of information, with nearly every scientific field awash in new data. Thus, making sense of large sets of real-world data stands as a preeminent challenge for modern science. Massive data sets, whether they record food web relationships, online friendships, or distributions of utilities like electricity, are often described by mathematical network models that give structure to the data – and help us better understand the relationships hidden within it.
Our project aims to use physics-inspired methods to find structure within large data sets and to determine when these structures are statistically significant. We are developing elegant, flexible, and computationally efficient algorithms for investigating the underlying structures, dynamics, and attributes of real-world networks. These physics-based algorithms can point to hidden connections between spatially distant nodes of a network. They can help us understand why a natural disaster in a given area might cause an electrical blackout hundreds of miles away. They can reveal similar relationships in different data sets, such as the keystone species in a modern food web and those in a food web from the Cambrian period. They can fill in missing data with intelligent guesses, predict missing links, and tell us the probability that a given node belongs to a given community. Moreover, these algorithms are scalable, allowing us to solve massive problems, once the domain of supercomputers, on an ordinary laptop.
As we seek out the structures, patterns, and attributes of large data sets, we also pursue the broader question of how a network’s structure gives rise to its dynamics. In doing so, we hope to understand the similarities and differences between social networks, economies, power grids, and food webs.
- James S. McDonnell Foundation
- John Templeton Foundation
- National Science Foundation