On 08.06.2016 10:14, Andrea Briega wrote:
Thank you very much, your answers have been really helpful. I am now on the last step, model selection, and I would like to be sure that I'm doing it right. I compute the posterior odds ratio between two partitions as e^-(dl1 - dl2), with dl1 and dl2 the higher and lower description lengths, respectively. I have obtained the description lengths using 'state.entropy()' for nested models and 'state.entropy(dl=True)' for non-nested ones.
This is correct. Note that in current versions of graph-tool you can also just call state.entropy() for non-nested models, since dl=True is the default.
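For concreteness, a minimal sketch of such a comparison (assuming `g` is the graph being fitted; the two model choices here are just placeholders, and for large description length differences the ratio under/overflows, so it is better kept in log space):

    import graph_tool.all as gt

    # Sketch: assumes `g` is the graph being analyzed.
    state1 = gt.minimize_nested_blockmodel_dl(g)  # nested model
    state2 = gt.minimize_blockmodel_dl(g)         # non-nested model

    dl1 = state1.entropy()  # description length (dl=True is the default)
    dl2 = state2.entropy()

    # Posterior odds ratio Lambda = exp(-(dl1 - dl2)); for large
    # |dl1 - dl2| exp() under/overflows, so report the log instead.
    log_lambda = -(dl1 - dl2)
    print("log posterior odds:", log_lambda)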
I have doubts about this because small differences in description length already give odds ratios much lower than 0.01, so in most cases the evidence supporting one of the models is decisive. I only get values higher than 0.01 when the difference in description length is below 5 units. With my data (24,000 nodes and 5,000,000 edges) I always obtain decisive support, both when I compare different models and when I compare different runs of the same model. I wonder if this is right.
This is indeed expected if you have lots of data (i.e. large networks). Given sufficient data, the evidence for the better model should always become decisive, as long as the models being compared are in fact distinguishable. 5 million edges is quite a lot, and indeed I would expect the posterior odds to be very small in this situation. You just have to make sure that you have found the best fit (i.e. the smallest description length) for each model you are comparing, by running the algorithm as many times as you can, along the lines of the sketch below.
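A minimal sketch of this repeated-run selection (assuming `g` is the graph; the number of runs is arbitrary and only for illustration):

    import graph_tool.all as gt

    # Sketch: assumes `g` is the graph; `n_runs` is an arbitrary choice.
    n_runs = 10
    best_state, best_dl = None, float("inf")
    for _ in range(n_runs):
        state = gt.minimize_blockmodel_dl(g)  # independent minimization run
        dl = state.entropy()
        if dl < best_dl:                      # keep the smallest DL found
            best_dl, best_state = dl, state
    print("best (smallest) description length:", best_dl)

Best,
Tiago

--
Tiago de Paula Peixoto <tiago@skewed.de>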