Cristopher Moore, Pan Zhang
Paper #: 14-04-009
Modularity is a popular measure of community structure. However, maximizing the modularity can lead to many competing partitions with almost the same modularity that are poorly correlated to each other; it can also overfit, producing illusory "communities" in random graphs where none exist. We address this problem by using the modularity as a Hamiltonian, and computing the marginals of the resulting Gibbs distribution. If we assign each node to its most-likely community under these marginals, we claim that, unlike the ground state, the resulting partition is a good measure of statistically-significant community structure. We propose an efficient Belief Propagation (BP) algorithm to compute these marginals. In ran- dom networks with no true communities, the system has two phases as we vary the temperature: a paramagnetic phase where all marginals are equal, and a spin glass phase where BP fails to converge. In networks with real community structure, there is an additional retrieval phase where BP converges, and where the marginals are strongly correlated with the underlying communities. We show analytically and numerically that the proposed algorithm works all the way down to the detectability transition in networks generated by the stochastic block model. We also show that our algorithm performs well on real-world networks, revealing large communities in some networks where previous work has claimed no communities exist. Finally we show that by applying our algorithm recursively, subdividing communities until no statistically-significant subcommunities can be found, we can detect hierarchical structure in real-world networks more efficiently than previous methods. Our algorithm is highly scalable, working in time nearly linear in the number of edges: for networks with 105 nodes and 106 edges, for instance, it takes 14 seconds to find community structure.