On 25.04.2017 05:36, Snehal Shekatkar wrote:
Dear all,
Just as a test, I was trying to fit a degree-corrected SBM to the famous "karate-club" network. Almost everybody has "shown" that the network contains two communities. However, when I tried using in-built graph-tool algorithms, I can see only one:
I don't think most of them have shown such a thing. Many methods _find_ two communities, and many _assume_ that makes sense, but that is not _showing_ anything. Note that most methods would happily partition a fully random network.
I tried this repeatedly, but I always get only one group. Then I tried manually specifying 2 groups:
In [11]: state = gt.minimize_blockmodel_dl(G, deg_corr=True, B_min=2)
However, in this case, I get a strange partition as shown in the following figure:
This partition is not strange, it is a division that separates the administrator and instructor from the rest. Is one local maximum of the SBM posterior. However, if you keep running the algorithm many times, you eventually see the partition that you were expecting. (Remember that the algorithm is stochastic). If you look at the description length corresponding to each partition, you get: B = 1, S = 222.6858 B = 2, S = 226.9818 (hubs vs. rest) B = 2, S = 228.2842 (two communities, "conventional") The posterior odds ratio of the B=1 vs the B=2 partitions is 0.014 and 0.004, so the B = 2 partitions cannot be fully rejected. This means that there _is_ evidence for _any of the two_ partitions. However, it seems that the most likely model is indeed a configuration model, B=1. And the only way one could _show_ otherwise is to come up with a B>1 model that reduces the description length below 222.6858 nats. Note that the over-reliance on the karate club to judge the quality of algorithms is highly questionable, and often the subject of jokes. See: http://www.sunclipse.org/wp-content/downloads/2010/09/karate-club-shirt.jpg and the venerable Karate Club Club: http://networkkarate.tumblr.com/ Best, Tiago -- Tiago de Paula Peixoto <tiago@skewed.de>