Leto Peel (University of Colorado, Boulder)
Abstract. We consider a network in which we know how the nodes are connected, but we do not know the class labels of the nodes. We wish to identify the best subset of nodes to label, in order to most accurately predict the class labels of the rest of the nodes. In contrast to previous work, we do not assume that nodes with the same class label all connect to the rest of the network in the same way. Instead, we model network roles separately from node classes, allowing nodes with the same class label to connect to the rest of the network in different ways. We present a model in which we identify network roles using a generative blockmodel, and predict labels by learning the relationship between network roles and class labels using a maximum margin classifier. We choose a subset of nodes to label according to an iterative margin-based active learning strategy. By integrating the discovery of network roles with the classifier optimisation, the active learning process can adapt the network roles to better represent the network for node classification. We demonstrate the model and the active learning algorithm by exploring a selection of real-world networks, including a marine food web and a network of English words. We show that, in contrast to previous work, this model performs well even when nodes with the same class label do not connect to other nodes in the same way. In addition, this approach results in improved computational efficiency.
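As a rough illustration of the margin-based query step mentioned above, the following minimal sketch picks the unlabelled node whose two highest class scores are closest together. It is not the model described in this paper: the blockmodel and the maximum margin classifier are omitted, and the function name `smallest_margin_query`, the `scores` table, and the example values are all hypothetical, chosen only to show the selection rule.

```python
def smallest_margin_query(scores, labelled):
    """Return the index of the unlabelled node with the smallest margin.

    scores:   one row per node, one classifier score per class
              (hypothetical input; at least two classes per row).
    labelled: indices of nodes whose labels are already known.
    """
    labelled = set(labelled)
    best_node, best_margin = None, float("inf")
    for node, row in enumerate(scores):
        if node in labelled:
            continue  # only unlabelled nodes are candidates for a query
        # Margin = gap between the two highest class scores for this node.
        top, runner_up = sorted(row, reverse=True)[:2]
        margin = top - runner_up
        if margin < best_margin:
            best_node, best_margin = node, margin
    return best_node


# Illustrative scores for four nodes and two classes (made-up values).
scores = [[2.0, 0.1], [0.6, 0.5], [0.2, 1.5], [0.9, 0.7]]
query = smallest_margin_query(scores, labelled=[1])
```

In an iterative setting, the queried node would be labelled, the classifier refitted, and the scores recomputed before the next query is chosen.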