Modelling with community structure data
Dear Tiago / community,

We are using the stochastic block model in a research project and trying to formulate how we would best utilise the community structure results downstream, and would welcome any suggestions.

We fit a nested stochastic block model with an edge weight, which gives us a hierarchical partition. From this, we want to report not just the underlying community structure, but also some form of corresponding weights of given blocks, how ‘important’ a given block is with respect to the edge weight, say.

Considering an example on the weighted foodweb network:

    import graph_tool.all as gt
    import numpy as np
    import matplotlib

    g = gt.collection.ns["foodweb_baywet"]
    state = gt.minimize_nested_blockmodel_dl(
        g, state_args=dict(recs=[g.ep.weight], rec_types=["real-exponential"]))

This yields a hierarchical community structure, but how would you most suitably determine what communities were ‘most’ or ‘least’ important/influential/correlated with respect to the edge weight? I have considered whether this might be done with centrality metrics on the blocks (or perhaps vcount and ecount data from a condensation graph on the hierarchical blocks), but was keen to see if you had a more innovative idea...

Thank you for your advice!

James

_______________________________________________
graph-tool mailing list -- graph-tool@skewed.de
To unsubscribe send an email to graph-tool-leave@skewed.de
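For readers following the thread: the "vcount and ecount" idea mentioned above can be sketched in plain Python. This is only an illustration of the quantities a condensation graph summarises, not graph-tool's own implementation (graph-tool offers a `condensation_graph` function for this purpose). The function name `condense`, the toy edge list and the block labels are all hypothetical; in practice the labels would come from one level of the fitted hierarchy.

```python
from collections import Counter, defaultdict


def condense(edges, blocks):
    """Summarise a partition: vertices per block, edges and total
    weight per ordered (source block, target block) pair.

    edges  -- iterable of (source, target, weight) triples
    blocks -- sequence mapping vertex index to block label
    """
    vcount = Counter(blocks)      # number of vertices in each block
    ecount = Counter()            # number of edges per block pair
    wsum = defaultdict(float)     # summed edge weight per block pair
    for u, v, w in edges:
        # Keep the pair ordered so edge direction is preserved.
        pair = (blocks[u], blocks[v])
        ecount[pair] += 1
        wsum[pair] += w
    return vcount, ecount, dict(wsum)


# Hypothetical toy data: five vertices in two blocks.
toy_edges = [(0, 1, 2.0), (1, 2, 4.0), (2, 3, -1.0), (3, 4, -3.0), (0, 4, 0.5)]
toy_blocks = [0, 0, 1, 1, 1]
vcount, ecount, wsum = condense(toy_edges, toy_blocks)
```

These counts describe the size and connectivity of each block, but they do not by themselves say which block is "important"; that still depends on the question being asked.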
On 23.11.21 at 15:46, James Ruffle wrote:
This yields a hierarchical community structure, but how would you most suitably determine what communities were ‘most’ or ‘least’ important/influential/correlated with respect to the edge weight? I have considered whether this might be done with centrality metrics on the blocks (or perhaps vcount and ecount data from a condensation graph on the hierarchical blocks), but was keen to see if you had a more innovative idea...
It is impossible to answer this kind of question without a very specific context and objective in mind. One of the biggest sins in network science is the proliferation of centrality metrics that attempt to define which node is "best" or "most important", as if there were a general answer to this question.

So, I can't tell you which community is most "important"; you have to tell me what you mean by this.

Best,
Tiago

--
Tiago de Paula Peixoto <tiago@skewed.de>
Hi Tiago,

Apologies for not being clearer. Let me try and make my example more specific:

I have a network defined from brain imaging and am passing an edge weight of a given clinical variable. In this example the nodes are voxels of brain tissue, an edge indicates that two voxels are structurally connected in imaging space, and the edge weight captures the relationship of that connection to a clinical variable, here one that incorporates ageing.

I first want to derive the community structure, passing the edge weight, which ultimately gives me clusters of voxels. In addition, I want to derive some formulation of a weight for the community blocks that reflects their relation to the passed edge weight. For instance, in this example I would want a block containing voxels within the hippocampus to be negatively associated with an age weight, given the atrophy that comes with age, but a block containing voxels of the ventricular system to be positively associated, as the ventricles enlarge with age.

How would you go about doing this?

BW
James
Hi James,

I’m wondering why you use the stochastic block model if you want to identify community structure. The SBM groups nodes with similar positions but, roughly speaking, does not maximize the number of edges that fall within communities.

Best,
Haiko
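To make this distinction concrete, here is a minimal plain-Python sketch (not graph-tool code) that measures the fraction of edges falling inside groups for a given partition. The function name `internal_edge_fraction` and the toy bipartite graph are illustrative assumptions. Grouping nodes by equivalent connection pattern, the kind of grouping a block model can describe, leaves no edge inside any group in this example.

```python
def internal_edge_fraction(edges, blocks):
    """Fraction of edges whose two endpoints share a block label.

    edges  -- list of (source, target) pairs
    blocks -- sequence mapping vertex index to block label
    """
    if not edges:
        return 0.0
    internal = sum(1 for u, v in edges if blocks[u] == blocks[v])
    return internal / len(edges)


# Hypothetical toy graph: complete bipartite, {0, 1} versus {2, 3}.
bipartite_edges = [(0, 2), (0, 3), (1, 2), (1, 3)]

# Nodes 0 and 1 connect to the same neighbours, as do 2 and 3.
by_position = [0, 0, 1, 1]
# An arbitrary alternative partition, for comparison only.
alternative = [0, 1, 0, 1]

frac_position = internal_edge_fraction(bipartite_edges, by_position)
frac_alternative = internal_edge_fraction(bipartite_edges, alternative)
```

Here `frac_position` is 0.0 while `frac_alternative` is 0.5, so a partition that groups nodes by position need not be one with many within-group edges.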
Dear James,

On 23.11.21 at 16:07, James Ruffle wrote:
How would you go about doing this?
The seemingly obvious answer is to look at the distribution of edge covariates on edges incident on the groups. But it is still not very clear exactly what you want to find.

In any case, this is a question about a particular research problem, so I don't believe it is appropriate for this list, which is about using graph-tool.

Best,
Tiago

--
Tiago de Paula Peixoto <tiago@skewed.de>
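One possible reading of the suggestion above, sketched in plain Python under stated assumptions: for each group, collect the covariates of all edges with at least one endpoint in that group, then summarise the resulting distribution. The function names, the toy edges and the block labels are hypothetical. With graph-tool, the labels might be taken from one level of the nested state (for example via `state.get_levels()[0].get_blocks()`), which is an assumption about the workflow rather than something stated in the thread.

```python
from collections import defaultdict
from statistics import mean, median


def weights_by_block(edges, blocks):
    """Collect edge covariates on the edges incident on each block.

    edges  -- iterable of (source, target, weight) triples
    blocks -- sequence mapping vertex index to block label
    """
    incident = defaultdict(list)
    for u, v, w in edges:
        # The set makes an edge inside a single block count once;
        # an edge between two blocks is incident on both.
        for r in {blocks[u], blocks[v]}:
            incident[r].append(w)
    return dict(incident)


def summarise(incident):
    """Basic descriptive statistics of the covariates per block."""
    return {
        r: {"n_edges": len(ws), "mean": mean(ws), "median": median(ws)}
        for r, ws in incident.items()
    }


# Hypothetical toy data: five vertices in two blocks, signed weights.
toy_edges = [(0, 1, 2.0), (1, 2, 4.0), (2, 3, -1.0), (3, 4, -3.0), (0, 4, 0.5)]
toy_blocks = [0, 0, 1, 1, 1]
summary = summarise(weights_by_block(toy_edges, toy_blocks))
```

Whether the mean, the median, or the full distribution is the right summary depends on the research question, which is the point made in the reply above.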
participants (3)
- James Ruffle
- Lietz, Haiko
- Tiago de Paula Peixoto