Inference for the "karate network"
Dear all, Just as a test, I was trying to fit a degree-corrected SBM to the famous "karate-club" network. Almost everybody has "shown" that the network contains two communities. However, when I tried using in-built graph-tool algorithms, I can see only one: In [2]: G = gt.collection.data["karate"] In [3]: state = gt.minimize_blockmodel_dl(G, deg_corr=True) In [4]: state Out[4]: <BlockState object with 1 blocks (1 nonempty), degree-corrected, for graph <Graph object, undirected, with 34 vertices and 78 edges at 0xa288858c>, at 0xa288b9cc> I tried this repeatedly, but I always get only one group. Then I tried manually specifying 2 groups: In [11]: state = gt.minimize_blockmodel_dl(G, deg_corr=True, B_min=2) However, in this case, I get a strange partition as shown in the following figure: This is not even close to what many papers show. I appreciate that many algorithms consider noise as the real structure. However, even the famous paper by Karrer and Newman (Karrer, Brian, and Mark EJ Newman. "Stochastic blockmodels and community structure in networks." *Physical Review E* 83.1 (2011): 016107.) describes a very different division of the network. Am I missing something very fundamental here? Thank you Snehal Shekatkar -- Snehal M. Shekatkar Pune India
Dear Snehal, I am not sure what others would say, but, in your case, I am pretty sure that you have used the overlapping version of SBM. You will not observe the core-periphery structure fitted by the vanilla SBM, nor the assortative structure via the degree-corrected SBM. Try: state = gt.minimize_blockmodel_dl(G, deg_corr=True, overlap=False, B_min=2) And visualize it via: state.draw(pos=G.vp["pos"], vertex_shape=state.get_blocks()) You may get what you expected. Also, note that if you tried (without the minimum block number constraint), state = gt.minimize_blockmodel_dl(G, deg_corr=True, overlap=False) You will still obtain a ground state with only one non-empty group. This is valid since it is a state with the minimum description length (MDL) of both the model and the data. See this paper for details about MDL: http://dx.doi.org/10.1103/PhysRevLett.110.148701 Note that people have argued whether it is necessary to introduce the additional Order(N) parameters in the SBM to fit such a small karate club data, which introduces a lot of model complexity. Indeed, if one used a full Bayesian treatment and study the relation of the model likelihood to the different numbers of blocks used. For the karate network, one may hesitate which number of partitions to use. See the Fig.2(a) of this paper: https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.117.078301 Finally, the karate network is not generated by any model of the SBM-family. So it really takes interpretations between the model and the data! Best regards, Tzu-Chi Yen -- View this message in context: http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/... Sent from the Main discussion list for the graph-tool project mailing list archive at Nabble.com.
Thanks for the reply. However, unfortunately this doesn't help to me. I have seen those papers and I think the algorithm in the first paper is in-built in graph-tool. Also, overlap = False is the default argument for the minimize_blockmodel and hence what I was doing is no different than what you said. Snehal On Tue, Apr 25, 2017 at 11:10 AM, Tzu-Chi Yen <junipertcy@gmail.com> wrote:
Dear Snehal,
I am not sure what others would say, but, in your case, I am pretty sure that you have used the overlapping version of SBM. You will not observe the core-periphery structure fitted by the vanilla SBM, nor the assortative structure via the degree-corrected SBM.
Try: state = gt.minimize_blockmodel_dl(G, deg_corr=True, overlap=False, B_min=2) And visualize it via: state.draw(pos=G.vp["pos"], vertex_shape=state.get_blocks())
You may get what you expected.
Also, note that if you tried (without the minimum block number constraint), state = gt.minimize_blockmodel_dl(G, deg_corr=True, overlap=False)
You will still obtain a ground state with only one non-empty group. This is valid since it is a state with the minimum description length (MDL) of both the model and the data. See this paper for details about MDL: http://dx.doi.org/10.1103/PhysRevLett.110.148701
Note that people have argued whether it is necessary to introduce the additional Order(N) parameters in the SBM to fit such a small karate club data, which introduces a lot of model complexity. Indeed, if one used a full Bayesian treatment and study the relation of the model likelihood to the different numbers of blocks used. For the karate network, one may hesitate which number of partitions to use. See the Fig.2(a) of this paper: https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.117.078301
Finally, the karate network is not generated by any model of the SBM-family. So it really takes interpretations between the model and the data!
Best regards, Tzu-Chi Yen
-- View this message in context: http://main-discussion-list- for-the-graph-tool-project.982480.n3.nabble.com/Inference-for-the-karate- network-tp4027197p4027198.html Sent from the Main discussion list for the graph-tool project mailing list archive at Nabble.com. _______________________________________________ graph-tool mailing list graph-tool@skewed.de https://lists.skewed.de/mailman/listinfo/graph-tool
-- Snehal M. Shekatkar Pune India
Hmm. What about if the documentation indeed says overlap=True as the default? See this link: https://graph-tool.skewed.de/static/doc/inference.html Have you tried and observed a difference if you explicitly set "overlap" to "False"? Does this help? If not, could you rephrase your question? Bests, Tzu-Chi Yen -- View this message in context: http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/... Sent from the Main discussion list for the graph-tool project mailing list archive at Nabble.com.
On 25.04.2017 10:42, Tzu-Chi Yen wrote:
Hmm. What about if the documentation indeed says overlap=True as the default? See this link: https://graph-tool.skewed.de/static/doc/inference.html
Oops, this is a bug in the documentation. The default is indeed overlap=False, as the function signature says. I'll fix it soon. Best, Tiago -- Tiago de Paula Peixoto <tiago@skewed.de>
On 25.04.2017 05:36, Snehal Shekatkar wrote:
Dear all,
Just as a test, I was trying to fit a degree-corrected SBM to the famous "karate-club" network. Almost everybody has "shown" that the network contains two communities. However, when I tried using in-built graph-tool algorithms, I can see only one:
I don't think most of them have shown such a thing. Many methods _find_ two communities, and many _assume_ that makes sense, but that is not _showing_ anything. Note that most methods would happily partition a fully random network.
I tried this repeatedly, but I always get only one group. Then I tried manually specifying 2 groups:
In [11]: state = gt.minimize_blockmodel_dl(G, deg_corr=True, B_min=2)
However, in this case, I get a strange partition as shown in the following figure:
This partition is not strange, it is a division that separates the administrator and instructor from the rest. Is one local maximum of the SBM posterior. However, if you keep running the algorithm many times, you eventually see the partition that you were expecting. (Remember that the algorithm is stochastic). If you look at the description length corresponding to each partition, you get: B = 1, S = 222.6858 B = 2, S = 226.9818 (hubs vs. rest) B = 2, S = 228.2842 (two communities, "conventional") The posterior odds ratio of the B=1 vs the B=2 partitions is 0.014 and 0.004, so the B = 2 partitions cannot be fully rejected. This means that there _is_ evidence for _any of the two_ partitions. However, it seems that the most likely model is indeed a configuration model, B=1. And the only way one could _show_ otherwise is to come up with a B>1 model that reduces the description length below 222.6858 nats. Note that the over-reliance on the karate club to judge the quality of algorithms is highly questionable, and often the subject of jokes. See: http://www.sunclipse.org/wp-content/downloads/2010/09/karate-club-shirt.jpg and the venerable Karate Club Club: http://networkkarate.tumblr.com/ Best, Tiago -- Tiago de Paula Peixoto <tiago@skewed.de>
participants (3)
-
Snehal Shekatkar -
Tiago de Paula Peixoto -
Tzu-Chi Yen