Model selection in spectral graph clustering under the stochastic blockmodel
Date:
This talk discussed some of the work in my PhD thesis, which was awarded the 2022 Distinguished Dissertation Award (DDA) by the Classification Society.
This talk is divided in two parts: first, an overview is given about the PhD thesis ‘’Latent factor representations of dynamic networks with applications in cyber-security’’, discussing issues around modelling of individual events, graph clustering, and link prediction. Challenges related to clustering and classification within the context of computer networks are described. Second, two novel approaches for spectral graph clustering are discussed. Spectral clustering is a popular method for community detection in networks. The nodes are clustered on a low dimensional representation of the graph, called spectral embedding, resulting from a truncated spectral decomposition of the adjacency matrix or one of its regularised versions. One of the main limitations of standard algorithms for community detection from spectral embeddings is that the number of communities and the latent dimension of the embedding must be specified in advance. Estimating the number of communities and the dimensionality of the reduced latent space is crucial for a good performance of the spectral clustering algorithm. Two model-based methods for simultaneous and automated selection of the number of communities and latent dimensionality in spectral clustering under the stochastic blockmodel (SBM) and its degree-corrected version (DCSBM) are proposed. Results show improved performance over competing methods on simulated and real-world computer network data.