On 27.02.20 at 10:19, Felix Victor Münch wrote:
For the first version I needed 400-500 GB of RAM, so that wouldn't be overkill. The threads, however, do seem pretty much useless.
500 GB for a network with 3M edges seems absurd. You should be able to do it with an order of magnitude less memory, at least.
However, as the latter version seems so much more memory-efficient, I am wondering why it's not implemented as the default, in the form of a function call with the same prominence in the documentation. I would love to hear Tiago or anyone else chime in on this. What's the advantage of the boiler-plate minimize_nested_blockmodel? Is there any, apart from fewer lines of code?
The algorithms are not the same, and incur different trade-offs. The one in minimize_nested_blockmodel_dl() attempts to bracket the model complexity with a one-dimensional bisection search, and requires more memory because it needs to keep several copies of the global state during the search. On the other hand, multiflip_mcmc_sweep() implements merge-split moves and keeps only a single state around, hence the improved memory usage.

Although it depends on the network, minimize_nested_blockmodel_dl() tends to find better estimates of the ground state (i.e. solutions with the smallest description length) in a shorter time, at the cost of higher memory usage.

The next version of the library will include an improved version of multiflip_mcmc_sweep() that I am preparing, and this alternative will gain prominence in the documentation.

Best,
Tiago

-- 
Tiago de Paula Peixoto <tiago@skewed.de>
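[Editor's note: for readers comparing the two workflows discussed above, here is a minimal sketch of both calls. It assumes a recent graph-tool release in which NestedBlockState can be constructed directly from a graph; the example dataset, the greedy beta=np.inf setting, and the number of sweeps are illustrative choices, not recommendations from the thread.]

import numpy as np
import graph_tool.all as gt

# Small example graph from the built-in collection; replace with your own.
g = gt.collection.data["football"]

# One-liner: bisection search over the model complexity. Keeps several
# copies of the global state during the search, hence higher memory usage.
state_a = gt.minimize_nested_blockmodel_dl(g)
print("bisection description length:", state_a.entropy())

# Memory-leaner alternative: a single nested state refined with
# merge-split (multiflip) sweeps. beta=np.inf makes the moves greedy;
# the sweep count below is arbitrary and should be tuned per network.
state_b = gt.NestedBlockState(g)
for _ in range(100):
    state_b.multiflip_mcmc_sweep(beta=np.inf, niter=10)
print("multiflip description length:", state_b.entropy())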