On 27.02.20 at 10:19, Felix Victor Münch wrote:
For the first version I needed 400-500 GB of RAM, so that wouldn't be overkill. The threads, however, do seem pretty much useless.
500 GB for a network with 3M edges seems absurd. You should be able to do it with an order of magnitude less memory, at least.
However, as the latter version seems so much more memory-efficient, I am wondering why it's not implemented as the default, in the form of a function call with the same prominence in the documentation. I would love to hear Tiago or anyone else chime in on this. What's the advantage of the boiler-plate minimize_nested_blockmodel? Is there any, apart from fewer lines of code?
The algorithms are not the same, and incur different trade-offs. The one in minimize_nested_blockmodel_dl() attempts to bracket the model complexity with a one-dimensional bisection search, and requires more memory because it needs to keep several copies of the global state during the search. On the other hand, multiflip_mcmc_sweep() implements merge-split moves and keeps only a single state around, hence the improved memory usage.

Although it depends on the network, minimize_nested_blockmodel_dl() tends to find better estimates of the ground state (i.e. solutions with the smallest description length) in a shorter time, at the cost of higher memory usage.

The next version of the library will include an improved version of multiflip_mcmc_sweep() that I am preparing, and this alternative will gain prominence in the documentation.

Best,
Tiago

-- 
Tiago de Paula Peixoto <tiago@skewed.de>
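[Editor's note: for readers comparing the two workflows discussed above, here is a minimal sketch of both calls. It assumes a recent graph-tool release in which NestedBlockState can be constructed directly from a graph; the example dataset, the greedy beta=np.inf setting, and the number of sweeps are illustrative choices, not recommendations from the thread.]

import numpy as np
import graph_tool.all as gt

# Small example graph from the built-in collection; replace with your own.
g = gt.collection.data["football"]

# One-liner: bisection search over the model complexity. Keeps several
# copies of the global state during the search, hence higher memory usage.
state_a = gt.minimize_nested_blockmodel_dl(g)
print("bisection description length:", state_a.entropy())

# Memory-leaner alternative: a single nested state refined with
# merge-split (multiflip) sweeps. beta=np.inf makes the moves greedy;
# the sweep count below is arbitrary and should be tuned per network.
state_b = gt.NestedBlockState(g)
for _ in range(100):
    state_b.multiflip_mcmc_sweep(beta=np.inf, niter=10)
print("multiflip description length:", state_b.entropy())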