Which state should I copy when doing merge-split on model with weight transformation

21 Mar 2021

      Hi all,

In the following section:
https://graph-tool.skewed.de/static/doc/demos/inference/inference.html#edge-...
<https://graph-tool.skewed.de/static/doc/demos/inference/inference.html#edge-weights-and-covariates>  

Tiago shows how to infer the best model of `foodweb_baywet` between the
`real-exponential` model and the `log-normal` model, each being improved by
the merge-split algorithm. My question has to do with which *state* one
should copy when applying the merge-split algorithm. In the `log-normal`
example model, we have:

```python
y = g.ep.weight.copy()
y.a = log(y.a)

state_ln = gt.minimize_nested_blockmodel_dl(g, state_args=dict(recs=[y],

rec_types=["real-normal"]))

state_ln = state.copy(bs=state_ln.get_bs() + [np.zeros(1)] * 4,
sampling=True)

for i in range(100):
    ret = state_ln.multiflip_mcmc_sweep(niter=10, beta=np.inf)

-state_ln.entropy() # ~7231
``` 
But if I copy the state_ln object instead:

```python
state_ln = gt.minimize_nested_blockmodel_dl(g, state_args=dict(recs=[y],

rec_types=["real-normal"]))

state_ln = *state_ln*.copy(bs=state_ln.get_bs() + [np.zeros(1)] * 4,
sampling=True)

for i in range(100):
    ret = state_ln.multiflip_mcmc_sweep(niter=10, beta=np.inf)

-state_ln.entropy() # ~4690
``` 
There is a big difference between the description length of the two models.
My understanding is that the *state* in the first example comes from the
previous `real-exponential` model, which means we copy its state then pass
the hierarchy levels of the state_ln model.  Is it supposed to be so?
Shouldn't we always copy the state of the models we ran in the first place
to run merge-split algorithm?

Jonathan

--
Sent from: https://nabble.skewed.de/

jstonge

Tiago de Paula Peixoto

jstonge

tags

participants (2)