On 01.04.20 at 02:40, Deklan Webster wrote:
I was under the impression that passing a list corresponds to getting the probability that *all* the edges are missing. Indeed, when I try it out I get back a scalar not a np array. I want to collect the probability that each individual edge is missing.
Yes, this is true.
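To collect one value per edge instead of a single joint value, a minimal sketch is to call get_edges_prob() once per candidate edge, each time with a one-element list. This is an illustration, not something taken from the thread: it assumes `state` exposes get_edges_prob(missing, spurious=[], entropy_args={}) returning a log-probability, as graph-tool's BlockState does. The helper name per_edge_log_probs and the stub class are hypothetical; the stub exists only so the snippet runs without graph-tool installed.

```python
import numpy as np


def per_edge_log_probs(state, candidate_edges, entropy_args=None):
    """Return one log-probability per candidate edge.

    Hypothetical helper: calls state.get_edges_prob() with a single-edge
    list each time, so every edge is scored on its own rather than jointly.
    """
    entropy_args = {} if entropy_args is None else entropy_args
    return np.array([
        state.get_edges_prob([edge], entropy_args=entropy_args)
        for edge in candidate_edges
    ])


class _StubState:
    """Stand-in for a fitted state (assumption, not part of graph-tool)."""

    def get_edges_prob(self, missing, spurious=(), entropy_args=None):
        # Pretend every missing edge contributes a fixed log-probability,
        # so a list of edges yields one joint scalar.
        return -1.5 * len(missing)


stub = _StubState()
edges = [(0, 1), (2, 3)]

joint = stub.get_edges_prob(edges)             # one scalar for all edges
individual = per_edge_log_probs(stub, edges)   # one entry per edge
```

Note that the loop itself runs in Python, so this costs one call per edge.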
Also, with respect to the heuristics I mention, I just saw this paper "Evaluating Overfit and Underfit in Models of Network Community Structure" use "s_ij = θ_i θ_j l_{g_i,g_j}"
This is not substantially faster than what is actually computed in graph-tool; it is just less accurate.
If sampling is not computationally feasible, this is what I had in mind.
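For reference, the quoted heuristic s_ij = θ_i θ_j l_{g_i,g_j} can be evaluated for many node pairs at once with NumPy indexing. This is only a sketch of the formula as written above, not graph-tool's computation: the function name heuristic_scores, the array names theta, b and l, and the toy values are all assumptions made for illustration (theta holds per-node propensities, b the group label of each node, l a block-by-block matrix).

```python
import numpy as np


def heuristic_scores(theta, b, l, pairs):
    """Evaluate s_ij = theta_i * theta_j * l[b_i, b_j] for each pair (i, j).

    Hypothetical helper: theta and b are 1-D arrays indexed by node,
    l is a square array indexed by group label.
    """
    pairs = np.asarray(pairs)
    i, j = pairs[:, 0], pairs[:, 1]
    # Fancy indexing looks up the block entry for each pair in one step.
    return theta[i] * theta[j] * l[b[i], b[j]]


# Toy values, chosen only to make the arithmetic easy to follow.
theta = np.array([0.5, 0.25, 1.0])
b = np.array([0, 0, 1])
l = np.array([[2.0, 1.0],
              [1.0, 4.0]])

scores = heuristic_scores(theta, b, l, [(0, 1), (0, 2), (1, 2)])
```

Here pair (0, 1) lies within group 0 and the other two pairs span groups 0 and 1, so they pick up l[0, 0] and l[0, 1] respectively.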
1) Is there a way built into graph-tool to compute this similarity function efficiently? (i.e., without Python slowing me down)
You should switch to using MeasuredBlockState.get_edge_prob(), which is implemented entirely in C++. The whole approach described in https://graph-tool.skewed.de/static/doc/demos/inference/inference.html#measu... should be preferred over the older binary classification scheme, as it completely subsumes it.
2) Is there a hierarchical analog, like just summing this similarity at each level?
Yes, any of the above approaches work in the exact same way with the hierarchical model.

Best,
Tiago

--
Tiago de Paula Peixoto <tiago@skewed.de>