Bayesian estimation of the latent dimension and communities in stochastic blockmodels
Date:
Spectral embedding of adjacency or Laplacian matrices of undirected graphs is a common technique for representing networks in a lower-dimensional latent space, with optimal theoretical guarantees. The embedding can be used to estimate the community structure of the network, using a random dot product graph interpretation of the stochastic blockmodel, with strong consistency results. One of the main limitations of standard algorithms for community detection from spectral embeddings is that both the number of communities and the latent dimension of the embedding must be specified in advance. In this talk, a Bayesian model is proposed that simultaneously selects the dimension of the latent space and the number of blocks. Extensions to directed and bipartite graphs are discussed. The model is tested on simulated and real-world datasets, with particular focus on cyber-security applications, and shows promising performance in recovering the known latent community structure of those networks.
Joint work with Professor Nick Heard (Imperial College London).
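
For readers unfamiliar with the standard pipeline that the abstract refers to, the following is a minimal sketch of adjacency spectral embedding followed by clustering on a simulated two-block stochastic blockmodel. It is not the Bayesian model presented in the talk: the latent dimension d and the number of communities K are fixed by hand, which is exactly the limitation the talk addresses. The function names (simulate_sbm, adjacency_spectral_embedding, kmeans), the block probabilities, the graph size and the use of a plain k-means step are all illustrative assumptions, not taken from the talk.

```python
import numpy as np


def simulate_sbm(block_sizes, B, rng):
    """Draw an undirected, loop-free graph from a stochastic blockmodel."""
    z = np.repeat(np.arange(len(block_sizes)), block_sizes)  # true block labels
    P = B[z][:, z]                                           # edge probabilities
    draws = (rng.random(P.shape) < P).astype(int)
    upper = np.triu(draws, k=1)                              # keep i < j only
    A = (upper + upper.T).astype(float)                      # symmetrise
    return A, z


def adjacency_spectral_embedding(A, d):
    """Embed nodes using the d eigenpairs of A with largest absolute eigenvalue."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd iterations; a stand-in for any clustering step."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


rng = np.random.default_rng(0)
B = np.array([[0.5, 0.1],
              [0.1, 0.5]])          # assumed within/between block probabilities
A, z = simulate_sbm([100, 100], B, rng)

d, K = 2, 2                         # both chosen in advance (the limitation)
X = adjacency_spectral_embedding(A, d)
labels = kmeans(X, K, rng)

# Agreement with the true blocks, up to swapping the two labels.
agreement = max(np.mean(labels == z), np.mean(labels != z))
print(f"agreement with true blocks: {agreement:.2f}")
```

In this sketch a wrong choice of d or K would go undetected, since nothing in the pipeline assesses either value; a model that places a prior over both quantities, as proposed in the talk, is one way to avoid fixing them by hand.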