On 06.09.18 at 12:37, JM wrote:
I have a directed graph of about half a million nodes and approximately a million edges, following scale-free behaviour with a power-law degree distribution. To test some of my hypotheses, I would like to generate smaller random graphs (about 50 to 200 nodes) that are representative of the big one.
This is an open problem in network science, with no satisfactory solution. Nobody knows how to subsample sparse graphs in a meaningful way. Naive approaches like randomly subsampling nodes and/or edges lead to trivial and highly biased samples.
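To make the bias of naive node subsampling concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the graph is a uniformly random sparse directed graph, not your scale-free one, it is ten times smaller than yours but has the same edge-to-node ratio, and the helper names `random_sparse_digraph` and `induced_edges` are made up for this example. The expected number of surviving edges depends only on the node, edge, and sample counts, so the same arithmetic applies to any graph with those sizes.

```python
import random


def random_sparse_digraph(n_nodes, n_edges, rng):
    """Return a set of distinct directed edges (u, v) with u != v,
    drawn uniformly at random. Hypothetical stand-in for a real graph."""
    edges = set()
    while len(edges) < n_edges:
        u = rng.randrange(n_nodes)
        v = rng.randrange(n_nodes)
        if u != v:
            edges.add((u, v))
    return edges


def induced_edges(edges, nodes):
    """Keep only the edges whose two endpoints are both in `nodes`."""
    keep = set(nodes)
    return [(u, v) for (u, v) in edges if u in keep and v in keep]


rng = random.Random(42)  # fixed seed so the run is reproducible

# Assumed sizes: same edge/node ratio as the graph in the question.
n_nodes, n_edges, sample_size = 50_000, 100_000, 200

edges = random_sparse_digraph(n_nodes, n_edges, rng)
sample = rng.sample(range(n_nodes), sample_size)
sub = induced_edges(edges, sample)

# Each edge survives only if both endpoints are sampled, so the expected
# number of induced edges is E * k * (k - 1) / (N * (N - 1)).
expected = n_edges * sample_size * (sample_size - 1) / (n_nodes * (n_nodes - 1))

print("edges per node in the full graph:", n_edges / n_nodes)
print("expected induced edges in the sample:", round(expected, 2))
print("observed induced edges in the sample:", len(sub))
```

With these sizes the expected number of induced edges is below two, so the 200-node sample consists almost entirely of isolated nodes. For the sizes in your message (500,000 nodes, 1,000,000 edges, 200 sampled nodes) the same formula gives roughly 0.16 expected edges.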
The code you sent, however, seems to subsample only the degree distribution, not the graph. That is something else entirely, and the result will not be representative of the larger graph in any other way. I will not comment on it further since, as I always say, I need a minimal and *self-contained* program that demonstrates whatever problem you are encountering. Analyzing code fragments like this, decoupled from their context in the larger program, is not a good use of our time.
Best, Tiago