When you sample from the posterior and take the vertex marginals, is it
proper to say that we can interpret the marginals for a given vertex as
being the degree of membership in the communities (fuzzy community
membership)?
If so, how does this differ from the overlapping blockstate? I saw in the
mailing list that overlapping is only supported at the base level.
But, even if it were supported at every level, what does this achieve that
the fuzzy model averaging doesn't? Could you do model averaging with the
overlapping state too? E.g., in sample 1 vertex A is in communities c1, c2.
In sample 2 vertex A is in communities c1, c4. Etc. Would this be in some
way a more accurate measure of multiple community membership than the fuzzy
marginals?
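To make the comparison concrete, this is roughly what I mean by the fuzzy
averaging; just a sketch, assuming a non-nested BlockState called state that
has already been fitted to a graph g, and using the vertex-marginal
collection pattern from the documentation:

import graph_tool.all as gt
import numpy as np

pv = None
def collect_marginals(s):
    global pv
    pv = s.collect_vertex_marginals(pv)   # accumulate per-vertex group counts

gt.mcmc_equilibrate(state, force_niter=10000, mcmc_args=dict(niter=10),
                    callback=collect_marginals)

counts = np.array(pv[g.vertex(0)])   # group counts for one vertex
print(counts / counts.sum())         # normalized "membership weights"

I would then read the normalized counts for a vertex as its degree of
membership in each group.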
I've been wrestling with long compile times in graph-tool while doing some
development work, and I noticed that template instantiations seem to be an
important factor. Even functions with modest individual compile times can
become very slow to build when they are multiplied by the product of several
type alternatives, each of which must be instantiated. For example, a single
function exposed to Python may dispatch to N graph types x M degree map
types x P weight map types, which can mean hundreds of instantiations.
At the same time I noticed a pattern in the code where a set of types
(represented by a Boost.MPL typelist) accompanies a boost::any argument. At
dispatch time the "any" object is interrogated to find out which of the
types it stores, and the corresponding instantiation is called. Dispatching
this way requires a linear search through the typelist for each call.
It seems to me that replacing (MPL typelist + boost::any) with std::variant
would improve both compile time and dispatch overhead:
1. Constant-time dispatch to the appropriate instantiation via std::visit.
2. The ability to reduce compile time by shrinking the N x M x ... type
product, depending on how well the argument use can be refactored.
I think the approach described in
be applied here.
Would there be any interest in exploring this kind of refactoring? I think
there could be substantial benefits in compile time, as well as some
runtime improvement (depending on how often the Python/C++ boundary is
crossed).
Hi, I'm running the nested version of the SBM (nSBM), and I'm collecting the
group marginals using the code from the gt documentation, basically counting
the number of non-empty blocks at each hierarchy level for each iteration:

import graph_tool.all as gt
import numpy as np

group_marginals = [np.zeros(g.num_vertices() + 1) for s in state.get_levels()]

def collect_num_groups(s):
    levels = s.get_levels()
    for l, sl in enumerate(levels):
        group_marginals[l][sl.get_nonempty_B()] += 1

gt.mcmc_equilibrate(state, force_niter=10000, callback=collect_num_groups)
At the end of the equilibration I look at the distributions and, in general,
the most probable number of blocks at each level is not the one that is
stored in the final state, although the final number of blocks is typically
the second most probable. I may be naive, but I expected the two to be the
same.
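Concretely, the comparison I have in mind is something like this (a sketch,
reusing group_marginals and state from the snippet above):

for l, sl in enumerate(state.get_levels()):
    posterior_mode = int(np.argmax(group_marginals[l]))  # most probable B at level l
    print(l, posterior_mode, sl.get_nonempty_B())         # vs. B in the final state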