G'day!

I'm struggling with a network of millions of nodes and billions of edges (the follow network of Australian Twitter users), trying to apply gt.inference.minimize_nested_block_model_dl(). To reduce the size I'm analysing k-cores, increasing k step-wise to test the limits. With 19k nodes and 12m edges I'm already exceeding the 64GB of RAM my machine currently has, so given that the algorithm is O(n ln^2(n)) I don't expect the whole network to fit, but I'd like to push the limit at least as far as possible. The only parameter I've set so far is `mcmc_equilibrate_args={'epsilon': 1e-3}`.

I have four questions:

1. I've tried mcmc_args={'sequential': False, 'parallel': True}, but this didn't seem to have much impact on speed for my smaller k-cores. Could it have one on bigger ones?

2. What does the warning regarding parallel sampling here (https://graph-tool.skewed.de/static/doc/inference.html#graph_tool.inference....) mean in the worst case, and when is it likely that the warning actually applies?

3. I'm considering either funding a machine with more (expensive) RAM or just swapping memory onto freely available disk space. What impact on speed would you expect? Watching the algorithm's behaviour so far, I don't see it being I/O-bound, but rather limited by available memory and by the sequential parts of the algorithm that run on a single core only. Is swapping a reasonable option, or should I just rent a bigger machine straight away?

4. Is there any rule of thumb for epsilon? And what is its impact if it's too large?

Thanks for any help (maybe also hints on how to reduce the network size while minimising the impact on the results of the nested SBM, with something more sophisticated than k-cores).

Cheers,
Felix

--
Sent from: http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/
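
P.S. For concreteness, by "k-core" reduction I mean the standard iterative peeling of nodes of degree < k. A minimal sketch in plain Python of what I'm doing, with made-up function and variable names (in practice I believe graph-tool's own kcore_decomposition computes the same thing far more efficiently):

```python
from collections import defaultdict

def k_core(edges, k):
    """Return the node set of the k-core of an undirected graph,
    obtained by repeatedly peeling nodes of degree < k."""
    # Build an adjacency map from the edge list.
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Seed the queue with every node already below the degree threshold.
    queue = [n for n in adj if len(adj[n]) < k]
    while queue:
        n = queue.pop()
        # pop() with a default handles nodes queued more than once.
        for m in adj.pop(n, ()):
            adj[m].discard(n)
            if len(adj[m]) < k:
                queue.append(m)
    return set(adj)

# Example: a triangle (1,2,3) with a pendant node 4.
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(k_core(edges, 2))  # the pendant node is peeled away: {1, 2, 3}
```

This is what I then feed (as a filtered graph) into minimize_nested_block_model_dl(), increasing k until memory runs out.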