On 15.04.20 at 05:40, Deklan Webster wrote:
Most typically, edge prediction is formulated as a binary classification task [7], in which each missing (or spurious) edge is attributed a “score” (which may or may not be a probability), so that those that reach a prespecified discrimination threshold are classified as true edges (or true nonedges)
I think this is misleading. This is not typical. As far as I can tell, very few people use unsupervised binary classification for link prediction. Most typically, edge prediction is formulated as a *supervised* binary classification task. From that setup, you can calibrate the probabilities based on predictive performance.
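To make concrete what I mean, here is a rough sketch of the supervised setup (the features and labels below are random placeholders, and sklearn's CalibratedClassifierCV is just one way to get calibrated probabilities):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.model_selection import train_test_split

    # placeholder data: one row of features per candidate node pair,
    # label 1 if the pair is a true (held-out) edge, 0 if a non-edge
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    y = rng.integers(0, 2, size=1000)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              random_state=0)

    # fit a classifier and calibrate its scores into probabilities
    clf = CalibratedClassifierCV(RandomForestClassifier(n_estimators=100),
                                 method="isotonic", cv=5)
    clf.fit(X_tr, y_tr)
    p_edge = clf.predict_proba(X_te)[:, 1]  # calibrated edge probabilities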
It's not misleading at all; this is exactly how a binary classifier works, supervised or otherwise. How you find the threshold is beside the point.
Indeed, selecting the optimal threshold can be done by cross-validation in a *supervised* setting, but even then this will depend in general on the fraction of removed edges, the size of the test set, etc.
I agree; this is a valid concern. If this is your objection to the *supervised* binary classification formulation, then I think that should be the statement in the paper.
This is just another difference. And I wasn't "objecting", just explaining what you had misunderstood.
Well, I have run it multiple times with different numbers. To be sure, I just now ran it with the epsilon removed, 2000 wait, multiflip on, and then 100k(!) sampling iterations. Results were pretty much the same.
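Roughly, the calls were of this form (a sketch only: the bundled football graph and the constant measurement counts stand in for my actual data, and I'm assuming the MeasuredBlockState variant here):

    import graph_tool.all as gt

    g = gt.collection.data["football"]   # stand-in for my graph
    n = g.new_ep("int")                  # number of measurements per edge
    x = g.new_ep("int")                  # number of positive measurements
    n.a = 2
    x.a = 2
    state = gt.MeasuredBlockState(g, n=n, x=x, n_default=1, x_default=0)

    # equilibration: wait=2000, multiflip on, the epsilon argument dropped
    gt.mcmc_equilibrate(state, wait=2000, multiflip=True,
                        mcmc_args=dict(niter=10))

    # sampling: 100k iterations, collecting the marginal posterior edges
    u = None
    def collect_marginals(s):
        global u
        u = s.collect_marginal(u)

    gt.mcmc_equilibrate(state, force_niter=100000, multiflip=True,
                        mcmc_args=dict(niter=10), callback=collect_marginals)
    eprob = u.ep.eprob                   # marginal posterior edge probabilities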
It's a shame.
I noticed that in the docs you now recommend setting beta=np.inf when using multiflip. What exactly is this doing? (I plan to read that other paper you mention soon, which probably includes this...)
You didn't even have to read any paper; just the docstring would have been enough. Beta is the inverse temperature parameter, and setting it to infinity means turning the MCMC into a greedy optimization heuristic. And I don't "recommend" it. It is not applicable to your context.

To be honest, I think the pattern of saying "I plan to read your documentation/paper at some point, but could you please just explain this to me before I do so" is a bit disrespectful. Why is my time worth less than yours?
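That said, for illustration, in a plain partition inference setting the greedy usage amounts to something like this (refining a fit on one of the bundled graphs):

    import numpy as np
    import graph_tool.all as gt

    g = gt.collection.data["football"]
    state = gt.minimize_blockmodel_dl(g)

    # beta is the inverse temperature; with beta=np.inf moves that increase the
    # description length are rejected, so the MCMC becomes a greedy heuristic
    for i in range(1000):
        state.multiflip_mcmc_sweep(beta=np.inf, niter=10)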
I noticed that in your paper you didn't compare your reconstruction method's performance against any baseline. How do you know how well it's performing if you don't have a baseline? I'm currently pessimistic, given its performance on the graph I'm testing.

Some of those graphs you were testing on in the paper are loaded into graph-tool, right? It would be fairly easy to train a RandomForest (no fancy boosted trees necessary) with the stacked similarity measures from graph-tool (and maybe a few other simple features I have in mind...) and test the performance against your reconstruction approach (at least for just link prediction). Interested in this? Conjectures? I would be willing to do it for some of the moderately sized graphs.

This kind of comparison has already been done in https://arxiv.org/abs/1802.10582 and https://arxiv.org/abs/1909.07578. The SBM approach is the single best classifier among the hundred-plus they consider, and is only marginally beaten by a stacking of around 40 other predictors.
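Concretely, the kind of thing I have in mind is along these lines (a sketch only: I'm stacking a few of graph-tool's vertex_similarity measures as features, and the hard-coded pairs and labels stand in for a proper split of held-out edges and sampled non-edges):

    import numpy as np
    import graph_tool.all as gt
    from sklearn.ensemble import RandomForestClassifier

    g = gt.collection.data["football"]   # stand-in graph

    # candidate node pairs and labels: placeholders; a real evaluation would use
    # removed (held-out) edges plus an equal number of sampled non-edges
    pairs = np.array([[0, 1], [0, 2], [3, 4], [5, 6]])
    y = np.array([1, 0, 1, 0])

    # stack several similarity measures as features, one column per measure
    sims = ["jaccard", "dice", "inv-log-weight"]
    X = np.column_stack([gt.vertex_similarity(g, sim_type=s, vertex_pairs=pairs)
                         for s in sims])

    clf = RandomForestClassifier(n_estimators=500)
    clf.fit(X, y)
    scores = clf.predict_proba(X)[:, 1]  # edge "scores" for ranking / AUC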
In any case, that was not the point of the PRX paper, which was to develop an actual Bayesian reconstruction algorithm, not a binary classifier. AFAIK there is no other algorithm that does this, so there was nothing to compare to.

If you're worried about comparing with binary classifiers, you can just convert this approach into one by using the marginal posterior probabilities as "scores" and go from there, as the papers above do. Then you are comparing apples to apples.

If you have further questions about how to use the library, please go ahead and ask. But if you want to discuss how to compare supervised vs. unsupervised edge prediction, etc., please take this off this list, since it's off-topic.

Best,
Tiago

--
Tiago de Paula Peixoto <tiago@skewed.de>