Tiago de Paula Peixoto wrote:
On 01.09.21 at 19:24, Davide Cittaro wrote:
If we assume that the big graph is sampled from an SBM, then the sub-sampled graph would also be sampled from an SBM, but not from the same one, if we are dealing with sparse networks. The sub-sampled SBM would be sparser (smaller average degree), and would have a deformed degree distribution in the case of the DC-SBM.
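To make the sparsification concrete, here is a minimal sketch, assuming the sub-sample is taken by keeping each node independently with a fixed probability and retaining the induced edges. Under that assumption the edge probabilities between groups are unchanged while the number of potential neighbours shrinks, so the average degree should drop by roughly the retained fraction. The group sizes, the probabilities `p_in` and `p_out`, the fraction, and the helper names are illustrative choices, not values from this thread, and the sketch uses plain NumPy rather than graph-tool.

```python
import numpy as np


def sample_sbm(sizes, p_in, p_out, rng):
    """Sample a symmetric 0/1 adjacency matrix from a planted-partition SBM.

    Hypothetical helper: p_in is the edge probability within a group,
    p_out the edge probability between groups.
    """
    labels = np.repeat(np.arange(len(sizes)), sizes)
    same_group = labels[:, None] == labels[None, :]
    probs = np.where(same_group, p_in, p_out)
    # Draw each unordered pair once (strict upper triangle), then symmetrise.
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    adj = (upper | upper.T).astype(int)
    return adj, labels


def induced_subgraph(adj, keep):
    """Adjacency matrix of the subgraph induced by the node indices in keep."""
    return adj[np.ix_(keep, keep)]


rng = np.random.default_rng(0)
adj, labels = sample_sbm([500, 500, 500], p_in=0.02, p_out=0.002, rng=rng)

# Assumed sampling scheme: keep each node independently with this probability.
fraction = 0.3
keep = np.flatnonzero(rng.random(adj.shape[0]) < fraction)
sub = induced_subgraph(adj, keep)

full_deg = adj.sum(axis=1).mean()
sub_deg = sub.sum(axis=1).mean()
print(f"average degree, full graph:        {full_deg:.2f}")
print(f"average degree, sub-sampled graph: {sub_deg:.2f}")
print(f"ratio: {sub_deg / full_deg:.2f} (retained fraction: {fraction})")
```

The group labels of the retained nodes are simply `labels[keep]`, so the planted structure is the same; only the density changes. Other sampling schemes (for example, sampling edges or recomputing a kNN graph on the retained points) would behave differently.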
Since my graphs are kNN graphs, would you suggest recomputing them on the subsampled data? To be fair, I’ve tried, and the results do not change dramatically.
The intuition here is that the evidence for the underlying structure will become weaker after sub-sampling, in proportion to how much sparser the network becomes. With the MDL/Bayesian approach in graph-tool, you should see fewer groups in the sub-sampled network, but they should otherwise be similar to those in the full network.
This is indeed what I observe. I’m thinking of using this possibility to “sniff” the data; if needed, one can (and should) then use the full network. Also, I’m aware that subsampling will generally wipe out small communities, which then won’t be identified.
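As a rough illustration of why small communities tend to disappear, here is a back-of-the-envelope sketch. It assumes each node is kept independently with probability `fraction`, so the number of surviving members of a community of size `n` is binomially distributed. The community sizes, the fraction, and the "at most 2 survivors" threshold are arbitrary illustrative choices, and `prob_at_most` is a hypothetical helper.

```python
from math import comb


def prob_at_most(n, f, m):
    """P[Binomial(n, f) <= m]: chance that at most m of n members survive."""
    return sum(comb(n, k) * f**k * (1 - f) ** (n - k) for k in range(m + 1))


# Assumed sampling scheme: keep each node independently with this probability.
fraction = 0.1

for size in (10, 50, 500):
    p = prob_at_most(size, fraction, 2)
    print(f"community of {size:3d} nodes: P(at most 2 survive) = {p:.3g}")
```

A community left with only a couple of nodes carries almost no statistical evidence, so it would most likely be merged into a larger group rather than detected on its own.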