Understanding the default parameters used by minimize_blockmodel_dl()
Hi Tiago, If I understood minimize_blockmodel_dl() correctly, the success of its inference comes from the following KEYs: KEY-1. Agglomerative membership initialization KEY-2. Smarter MCMC proposal moves + successful MCMC equilibration/optimization KEY-3. Correct expression of the SBM entropy For (1), the tunable parameters are: a) shrink_args[“niter”]: attempted block merges for each block node b) mcmc_multilevel_args[“r”]: the greediness of the agglomeration. For (2), the tunable parameters are: c) mcmc_args[“c”]: controls how the proposed move respects the current block configuration d) mcmc_args[“d”]: the probability to move to a new and vacant group e) mcmc_args[“beta”]: the inverse temperature f) mcmc_args[“niter”]: the number of sweeps to perform. During each sweep, a move attempt is made for each node g) mcmc_equilibrate_args[“wait”]: the number of iterations to wait for a record-breaking event h) mcmc_equilibrate_args[“nbreaks”]: the number of iteration intervals (of size `wait`) without record-breaking events necessary to stop the algorithm. i) anneal_args[“niter”]: the number of steps (in logspace) from the starting temperature to the final one. And for (3), the tunable parameters are hidden in the mcmc_args.entropy_args, whose default settings are clear to me. They implement the flat prior as detailed in Phys. Rev. E 95, 012317. I have the following 4 questions. FIRST - If I just run minimize_blockmodel_dl(g), where g is the graph, what are the default numbers used by (a)-(i)? I found that many parameters do not match the default numbers as stated in the documentation. For example, b) is set to 1.3, where the documentation says 2. And g) is set to 1, where the documentation says 1000. With `verbose` turned on, reverse engineering helps. But I would like to hear your response. SECOND - To prevent bad merges in (KEY-1.), we also allow individual node moves between each sweep. This means (KEY-1) + (KEY-2) is applied alternatively. Between each merge (from B’ to B), what is the annealing scheme used? Is it abrupt cooling all the way through? Or does it let the MCMC equilibrates first, followed by the abrupt cooling? THIRD - Followed from the SECOND, It seems that the latter implementation is best but computationally expensive. However, for the last stage of the merge algorithm (when we reached the desired B), the latter [equilibration + cooling] algorithm must be applied. How can we customize the arguments in minimize_blockmodel_dl() such that we only do abrupt merges between stages except for the last? FOURTH - Bisection is an important ingredient to the efficiency of successful optimization. What is the default formula to determine the desired block number B to be merged from B0 = N (number of nodes)? In other words, how do we determine the first rightmost state to bracket the search? It seems that either N^0.5 or E^0.5 (the resolution limit via the flat prior for edge count) does not match what I observed. I hope my questions were not annoying. Thank you again for designing the many detailed auxiliary functions. They helped me a lot to catch bugs by these “reverse engineerings”. You are so sweet. Sincerely, Tzu-Chi -- Sent from: http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/
participants (1)
-
Tzu-Chi Yen