Weird error when interfaced with joblib
Hi all, I use graph-tool a lot, and I usually run several random initializations and keep, in the end, the solution with the lowest entropy. Since each initialization is a separate and independent process, I was thinking of using joblib to parallelize them. However, I noticed something weird. Here we go:
import numpy as np
from joblib import delayed, Parallel
import graph_tool.all as gt
# choose a graph
g = gt.collection.data['football']
# set some variables, such as the number of inits
n_init = 3
fast_tol = 1e-3
beta = 1000
n_sweep = 10
# define a function for sweep, this is essentially what is found in official docs
def fast_min(state, beta, n_sweep, fast_tol):
    dS = 1
    while np.abs(dS) > fast_tol:
        dS, _, _ = state.multiflip_mcmc_sweep(beta=beta, niter=n_sweep)
    return state
# test with standard python list comprehension, this works
pstates = [gt.PPBlockState(g) for x in range(n_init)]
pstates = [fast_min(state, beta, n_sweep, fast_tol) for state in pstates]
selected = pstates[np.argmin([x.entropy() for x in pstates])]
print(gt.modularity(g, selected.get_blocks()))

0.5986403881107808
# test with 'threading' backend in joblib
pstates = [gt.PPBlockState(g) for x in range(n_init)]
pstates = Parallel(n_jobs=3, prefer='threads')(delayed(fast_min)(state, beta, n_sweep, fast_tol) for state in pstates)
selected = pstates[np.argmin([x.entropy() for x in pstates])]
print(gt.modularity(g, selected.get_blocks()))

0.5926606505592532
# test with default backend in joblib
pstates = [gt.PPBlockState(g) for x in range(n_init)]
pstates = Parallel(n_jobs=3)(delayed(fast_min)(state, beta, n_sweep, fast_tol) for state in pstates)
selected = pstates[np.argmin([x.entropy() for x in pstates])]
print(gt.modularity(g, selected.get_blocks()))
ValueError                                Traceback (most recent call last)
<ipython-input-8-733a38606e1a> in <module>
      3
      4 selected = pstates[np.argmin([x.entropy() for x in pstates])]
----> 5 print(gt.modularity(g, selected.get_blocks()))

~/anaconda3/envs/experimental/lib/python3.8/site-packages/graph_tool/inference/modularity.py in modularity(g, b, gamma, weight)
     84     Q = libinference.modularity(g._Graph__graph, gamma,
     85                                 _prop("e", g, weight),
---> 86                                 _prop("v", g, b))
     87     return Q
     88

~/anaconda3/envs/experimental/lib/python3.8/site-packages/graph_tool/__init__.py in _prop(t, g, prop)
    177             raise ValueError("Received orphaned property map")
    178         if g.base is not u.base:
--> 179             raise ValueError("Received property map for graph %s (base: %s), expected: %s (base: %s)" %
    180                              (str(g), str(g.base), str(u), str(u.base)))
    181     return pmap._get_any()

ValueError: Received property map for graph <Graph object, undirected, with 115 vertices and 613 edges, 4 internal vertex properties, 2 internal graph properties, at 0x7fd7032982e0> (base: <Graph object, undirected, with 115 vertices and 613 edges, 4 internal vertex properties, 2 internal graph properties, at 0x7fd7032982e0>), expected: <GraphView object, undirected, with 115 vertices and 613 edges, 4 internal vertex properties, 2 internal graph properties, at 0x7fd707c650d0> (base: <Graph object, undirected, with 115 vertices and 613 edges, 4 internal vertex properties, 2 internal graph properties, at 0x7fd7067e3700>)

I cannot figure out why I get this error, as the g.base and selected.g.state are the same. I understand this is not directly a graph-tool issue, rather a joblib one, still I'd like to understand why I'm getting this error.

Thanks
d
On 01.04.21 at 14:31, Davide Cittaro wrote:
> # test with default backend in joblib
> pstates = [gt.PPBlockState(g) for x in range(n_init)]
> pstates = Parallel(n_jobs=3)(delayed(fast_min)(state, beta, n_sweep, fast_tol) for state in pstates)
> selected = pstates[np.argmin([x.entropy() for x in pstates])]
> print(gt.modularity(g, selected.get_blocks()))
My guess here is that joblib is using pickle in the background, which ended up copying the whole states and their graphs, making the property maps incompatible. Try changing the last line to:

print(gt.modularity(selected.g, selected.get_blocks()))

--
Tiago de Paula Peixoto <tiago@skewed.de>
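The explanation above can be illustrated without graph-tool or joblib. The following is only a minimal sketch using plain dictionaries as hypothetical stand-ins for a graph and a property map; `check_owner` is an invented helper that imitates the identity test visible in the traceback (`g.base is not u.base`), not graph-tool's actual code. It shows that a pickle round trip preserves the values but produces a new graph object, so an identity check against the original graph fails.

```python
import pickle

# Hypothetical stand-ins: a "graph" and a "property map" that remembers its graph.
g = {"n_vertices": 4}
blocks = {"graph": g, "values": [0, 0, 1, 1]}


def check_owner(graph, pmap):
    """Imitate an identity test like `g.base is not u.base` (illustrative only)."""
    if pmap["graph"] is not graph:
        raise ValueError("Received property map for a different graph object")
    return pmap["values"]


# A pickle round trip, similar to what happens when a process-based
# worker sends its result back to the parent process.
blocks_copy = pickle.loads(pickle.dumps(blocks))

same_values = blocks_copy["values"] == blocks["values"]  # contents survive
same_owner = blocks_copy["graph"] is g                   # identity does not

try:
    check_owner(g, blocks_copy)
    failed = False
except ValueError:
    failed = True

# Using the graph that travelled with the copy passes the check,
# which mirrors passing selected.g instead of g.
recovered = check_owner(blocks_copy["graph"], blocks_copy)
```

This would also be consistent with the thread-based run succeeding, since threads share the parent's objects and nothing needs to be serialized.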
Hi Tiago,
> My guess here is that joblib is using pickle in the background, which ended up copying the whole states and their graphs, making the property maps incompatible. Try changing the last line to:
Reading the joblib documentation, I understand that it uses cloudpickle in place of pickle by default (https://joblib.readthedocs.io/en/latest/auto_examples/serialization_and_wrap...).
> print(gt.modularity(selected.g, selected.get_blocks()))
That works, thanks. As a matter of fact, this works as well:

pmode = gt.PartitionModeState([x.get_blocks().a for x in pstates], converge=True)

I guess because every entry in the pstates list owns its own graph.

Thanks!
d
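A possible complementary reading of the last observation: the `.a` attribute yields plain label arrays, which carry no reference to any graph, so no ownership check comes into play. The sketch below illustrates that idea under stated assumptions: it uses a hand-written `modularity` function and a toy edge list (both hypothetical, not graph-tool's implementation) to show that plain labels survive a pickle round trip unchanged and can be evaluated against the original graph.

```python
import pickle

# Toy graph (hypothetical data): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]


def modularity(edges, labels):
    """Newman modularity of an undirected, unweighted graph for given labels."""
    m = len(edges)
    internal = {}  # number of edges with both endpoints in the same block
    degree = {}    # summed vertex degree per block
    for u, v in edges:
        degree[labels[u]] = degree.get(labels[u], 0) + 1
        degree[labels[v]] = degree.get(labels[v], 0) + 1
        if labels[u] == labels[v]:
            internal[labels[u]] = internal.get(labels[u], 0) + 1
    return sum(internal.get(r, 0) / m - (degree[r] / (2 * m)) ** 2 for r in degree)


labels = [0, 0, 0, 1, 1, 1]

# Plain labels are copied by value; nothing ties them to a graph object.
labels_copy = pickle.loads(pickle.dumps(labels))

q = modularity(edges, labels_copy)  # 5/14, about 0.357
```

In the same spirit, a worker could return only the block labels rather than the whole state, at the cost of losing the state object itself; whether that trade-off is acceptable depends on what is needed afterwards.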