Graph Sampling Through Graph Decomposition and Reconstruction Based on Kronecker Graphs
Shen Lu, Les Piegl, Richard S. Segall
Authors Information |
Citation |
Full Text |
Shen Lu
Department of Computer Science and Engineering, University of South Florida, Tampa, Florida, United States
Les Piegl
Department of Computer Science and Engineering, University of South Florida, Tampa, Florida, United States
Richard S. Segall
Department of Information Systems and Business Analytics, Arkansas State University, State University, Arkansas, United States
Cite this paper as:Lu, S., Piegl, L., Segall, R. S. (2022). Graph Sampling Through Graph Decomposition and Reconstruction Based on Kronecker Graphs.
Journal of Systemics, Cybernetics and Informatics, 20(2), 23-32. https://doi.org/10.54808/JSCI.20.02.23
Online ISSN (Journal): 1690-4524
Abstract
The connectedness of the social network gives rise to a new challenge of how to efficiently sample the network and keep the graph properties and topology properties as well. Inspired by R-MAT and Kronecker graph generators, based on the observation of different graph topology types, we proposed to use Kronecker graph as the prime graph to conduct Kronecker graph double cover through periphery subgraphs. First of all, the connectedness of the graph remains during graph merging. Secondly, only redundant vertices and edges are merged so that the characteristics of the graphs are kept. Also, graph merging only works on periphery subgraphs from low degrees to higher up so those topology properties are kept. Finally, although some edges are merged, since the similarity groups generated based on Kronecker graph similarity is independent of the degree distribution, Kronecker double cover operation does not affect the graph degree centrality measure. We theoretically prove the feasibility of the Kronecker double-cover operation and also compare the quality of the sample set with Snowball sampling and Es-i sampling sets. Experimental results show us that, when the separation of the core and periphery subgraphs is between mean (the average of the degrees) and mean+std (standard deviation of the degrees), the topology types and graph properties can be preserved. This conclusion confirms the existence of the topology types, and also proves the topology types of the real-world graphs are not random.