On 11.04.20 at 07:24, Deklan Webster wrote:
It's a sum in log-space, so it's incremental.
Can you show me in code how to use logsumexp in such a way that I don't need to store 6,000,000 * k floats in memory, where k is the number of sampling iterations?
It is so utterly simple, it is *exactly* like doing sums. Here is the log sum of the values from 1 to 1000, without keeping anything in memory:

    import numpy
    from scipy.special import logsumexp

    res = -numpy.inf                      # "zero", i.e. log(0)
    for x in range(1, 1001):              # the values 1 to 1000
        res = logsumexp([res, numpy.log(x)])
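Applied to the question above, the same trick works per edge: keep one running log-sum per candidate edge and fold each sampling iteration into it, so memory stays at 6,000,000 floats instead of 6,000,000 * k. A minimal sketch (sample_log_probs() is a hypothetical stand-in for whatever produces the per-edge log-probabilities of the current sample):

    import numpy as np

    n_candidates = 6_000_000
    running = np.full(n_candidates, -np.inf)   # log(0) for every candidate edge

    k = 100                                    # number of sampling iterations
    for _ in range(k):
        lp = sample_log_probs()                # hypothetical: shape (n_candidates,), log-space
        running = np.logaddexp(running, lp)    # elementwise running log-sum

    log_mean = running - np.log(k)             # log of the average over the k samples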
Most other link prediction methods only provide a ranking or scores for the edges. In order to actually predict edges, all these methods require you to determine how many edges should actually be placed.
In your setup you still end up with a probability in the end. The probability cutoff above which you consider something a "real missing edge" is still arbitrary, correct?
Not quite, at least not up to the choice of a loss function. For binary classification with the 0-1 loss (there are not many alternatives), the optimal threshold is 1/2. (I discuss this in the PRX paper, which I get the impression you did not yet bother to read.)
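As a quick sanity check of that claim (not graph-specific, just numpy): simulate labels drawn from calibrated probabilities and scan thresholds; the empirical 0-1 loss bottoms out at 1/2:

    import numpy as np

    rng = np.random.default_rng(0)
    p = rng.uniform(size=1_000_000)            # calibrated probabilities
    y = rng.uniform(size=p.size) < p           # labels drawn from those probabilities

    thresholds = np.linspace(0.05, 0.95, 19)
    loss = [np.mean((p > t) != y) for t in thresholds]
    print(thresholds[np.argmin(loss)])         # ~0.5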
Using a binary classifier with whatever link prediction features, you can also get probabilities in the end.
Not without making an _arbitrary_ choice of normalization, which results in an _arbitrary_ number of edges being predicted.
For instance, and relating to your point about choosing that f ratio, I counted the edges above several probability cutoffs to get an estimate of the number of "actually missing" edges. I've confirmed that these probabilities are fairly well calibrated by removing edges and predicting them, etc.
39 with probability above 0.80
143 with probability above 0.70
318 with probability above 0.60
Without saying how you normalized the scores to begin with, these values are meaningless.
You should also consider using a different framework, where you leave the missing edges "unobserved" instead of transforming them into non-edges. The distinction is important, and changes the reconstruction problem. You can achieve this with MeasuredBlockState by setting the corresponding n value to zero.
Do you mean setting n_default=0? Just tried that, and it appears that almost all of my existing edges are missing from the marginal graph. Not sure what's going on there. I tried setting mu and nu to extreme values to control this(?), but it doesn't appear to help much. With alpha=1, beta=1, mu=1, nu=9999: 20733 out of 20911 of my original edges are missing. What's happening here?
That is not what I meant. Setting n_default=0 means saying that every non-edge is non-observed. You should set n=0 only for the 'removed' edges, and for a comparable number of actual non-edges.
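For concreteness, a sketch of that setup along the lines of the MeasuredBlockState pattern in the graph-tool docs (here `g` is the observed graph, `removed_pairs` the held-out edges, and `control_pairs` a comparable sample of true non-edges; all three names are assumptions of this example):

    import graph_tool.all as gt

    n = g.new_ep("int", val=1)       # each observed edge was measured once...
    x = g.new_ep("int", val=1)       # ...and seen as present

    # mark the held-out pairs and the control non-edges as *unobserved* (n = 0)
    for u, v in list(removed_pairs) + list(control_pairs):
        e = g.add_edge(u, v)
        n[e] = 0
        x[e] = 0

    state = gt.MeasuredBlockState(g, n=n, x=x, n_default=1, x_default=0)

    # sample from the posterior and collect the marginal inferred network
    u_marg = None
    def collect(s):
        global u_marg
        u_marg = s.collect_marginal(u_marg)

    gt.mcmc_equilibrate(state, force_niter=1000, mcmc_args=dict(niter=10),
                        callback=collect)
    eprob = u_marg.ep.eprob          # posterior probability of each candidate edge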
Okay, so to summarize what's happening here: if alpha and beta are both 1, then p is uniform on [0,1], and it appears the model is favoring very low p for my graph. By setting alpha and beta I can control the distribution of p. By maintaining the ratio alpha/(alpha+beta) while increasing alpha and beta, I can decrease the variance, which forces p to stay higher (at the same mean), since otherwise it drifts toward the lower tail. Is that correct?
Right.
Is there any reason not to set something very aggressive like alpha=500, beta=500? I just tested this as a feature in my binary classifier with 5000 sampling iterations, and it performed much worse than the conditional prob setup (with only 100 sampling iterations). I expected this marginal prob setup to work comparably to the conditional prob setup. Thoughts?
Choosing alpha=beta=infinity means being 100% sure that the value of r (probability of missing edges) is 1/2. Does that value make sense to you? Did you remove exactly 1/2 of the edges? Given the true value of r, the optimal choice should be beta = [(1-r)/r] * alpha, and making alpha -> infinity (in practice, just a large-enough value).

Best,
Tiago

--
Tiago de Paula Peixoto <tiago@skewed.de>
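A small sketch illustrating that closing recipe (r = 0.1 and the alpha values are purely illustrative; scipy is assumed): with beta = [(1-r)/r] * alpha the prior mean stays at r, while increasing alpha shrinks the spread around it:

    from scipy.stats import beta as beta_dist

    r = 0.1                                       # assumed true fraction of missing edges
    for alpha in (1, 10, 100, 1000):
        b = (1 - r) / r * alpha                   # beta = [(1-r)/r] * alpha
        prior = beta_dist(alpha, b)
        print(alpha, prior.mean(), prior.std())   # mean stays at r, std shrinks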