Parallelization issues on all_shortest_paths
Greetings, Tiago!

I am trying to use the multiprocess module in Python to speed up my simulations. Specifically, I use the following code to calculate k-shortest paths for a set of node pairs in parallel:

    self.valid_paths[v] = pool.map(
        lambda x: gt.all_shortest_paths(self.g, source=v_index, target=x),
        [self.g.vertex_index[c] for c in self.Cloudlet])

where self.Cloudlet is a predefined node list. However, Python reports a pickling error:

    self.valid_paths[v] = pool.map(lambda x: gt.all_shortest_paths(self.g,source = v_index, target = x),[self.g.vertex_index[c] for c in self.Cloudlet])
  File "/home/percy/anaconda2/lib/python2.7/site-packages/multiprocess/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/home/percy/anaconda2/lib/python2.7/site-packages/multiprocess/pool.py", line 567, in get
    raise self._value
RuntimeError: Pickling of "graph_tool.libgraph_tool_core.Vertex" instances is not enabled (http://www.boost.org/libs/python/doc/v2/pickle.html)

Is there anything I am missing?

My OS is 16.04, python 2.7.14, graph-tool is from Ostrokach's Anaconda.

Thanks for your reply.

Best,
Percy

--
Sent from: http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/
On 08.01.2018 16:25, Percy wrote:
> Greetings, Tiago!
> I am trying to use the multiprocess module in Python to speed up my simulations. Specifically, I use the following code to calculate k-shortest paths for a set of node pairs in parallel:
> self.valid_paths[v] = pool.map(lambda x: gt.all_shortest_paths(self.g,source = v_index, target = x),[self.g.vertex_index[c] for c in self.Cloudlet]),
> where self.Cloudlet is a predefined node list.

Please provide a minimal _self-contained_ (i.e. complete) example that shows the problem, not a snippet. Otherwise it is difficult to understand the problem.

> However, Python reports a pickling error:
>     self.valid_paths[v] = pool.map(lambda x: gt.all_shortest_paths(self.g,source = v_index, target = x),[self.g.vertex_index[c] for c in self.Cloudlet])
>   File "/home/percy/anaconda2/lib/python2.7/site-packages/multiprocess/pool.py", line 251, in map
>     return self.map_async(func, iterable, chunksize).get()
>   File "/home/percy/anaconda2/lib/python2.7/site-packages/multiprocess/pool.py", line 567, in get
>     raise self._value
> RuntimeError: Pickling of "graph_tool.libgraph_tool_core.Vertex" instances is not enabled (http://www.boost.org/libs/python/doc/v2/pickle.html)
> Is there anything I am missing?

Vertex objects cannot be pickled. I assume that 'self.Cloudlet' stores a list of Vertex objects. It should be changed to store a list of ints instead.

> My OS is 16.04, python 2.7.14, graph-tool is from Ostrokach's Anaconda.

Please say what version of graph-tool you are using.

--
Tiago de Paula Peixoto <tiago@skewed.de>
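
The advice to store ints rather than Vertex objects can be checked before any pool is involved. The following is a minimal sketch using only the standard-library pickle module. Both `Opaque` and `first_unpicklable` are hypothetical names, not part of graph-tool or multiprocess: `Opaque` is a stand-in that refuses pickling in the way the Vertex in the traceback does. Note that multiprocess serializes with dill, which accepts more types than pickle, so this check is stricter than what the pool itself requires.

```python
import pickle


class Opaque(object):
    """Hypothetical stand-in for an extension object that refuses pickling."""

    def __init__(self, index):
        self.index = index

    def __reduce_ex__(self, protocol):
        # Imitates the RuntimeError raised for Vertex instances in the thread.
        raise RuntimeError("Pickling of Opaque instances is not enabled")

    def __int__(self):
        return self.index


def first_unpicklable(items):
    """Return the position of the first item pickle rejects, or None."""
    for position, item in enumerate(items):
        try:
            pickle.dumps(item)
        except Exception:
            return position
    return None


vertices = [Opaque(i) for i in range(5)]
indices = [int(v) for v in vertices]  # plain ints, safe to send to workers

print(first_unpicklable(vertices))  # prints 0
print(first_unpicklable(indices))   # prints None
```

With graph-tool itself, the corresponding conversion is int(v) or g.vertex_index[v] for each Vertex v.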
Hello, Tiago

Thanks for the quick reply. Here is a small example to reproduce the problem:

    #!/usr/bin/env python2
    # -*- coding: utf-8 -*-
    import multiprocess as mp
    import graph_tool.all as gt
    import numpy as np

    g = gt.price_network(20, m=2, directed=False)
    valid_paths = g.new_vertex_property("object")
    g.vertex_properties["valid_paths"] = valid_paths

    Cloudlet = []
    Gateway = []
    for v in g.vertices():
        if np.random.rand() > 0.5:
            Cloudlet.append(v)
        else:
            Gateway.append(v)

    pool = mp.Pool(processes=4)
    for v in Gateway:
        valid_paths[v] = pool.map(
            lambda x: gt.all_shortest_paths(g=g, source=g.vertex_index[v], target=x),
            [g.vertex_index[c] for c in Cloudlet])

Python now reports:

MaybeEncodingError: Error sending result: '[<graph_tool.libgraph_tool_core.CoroGenerator object at 0x7f344b5f4f80>]'. Reason: 'RuntimeError('Pickling of "graph_tool.libgraph_tool_core.CoroGenerator" instances is not enabled (http://www.boost.org/libs/python/doc/v2/pickle.html)',)'

The version of graph-tool is 2.25.

Best regards,
Boxi
Hello, Tiago!

Please find below a small example that reproduces the problem.

    import multiprocess as mp
    import graph_tool.all as gt
    import numpy as np

    g = gt.price_network(20, m=2, directed=False)
    valid_paths = g.new_vertex_property("object")
    g.vertex_properties["valid_paths"] = valid_paths

    Cloudlet = []
    Gateway = []
    for v in g.vertices():
        if np.random.rand() > 0.5:
            Cloudlet.append(v)
        else:
            Gateway.append(v)

    pool = mp.Pool(processes=4)
    for v in Gateway:
        valid_paths[v] = pool.map(
            lambda x: gt.all_shortest_paths(g=g, source=g.vertex_index[v], target=x),
            [g.vertex_index[c] for c in Cloudlet])

The version of my graph-tool is 2.25. Python now reports:

MaybeEncodingError: Error sending result: '[<graph_tool.libgraph_tool_core.CoroGenerator object at 0x7f344b5f4f80>]'. Reason: 'RuntimeError('Pickling of "graph_tool.libgraph_tool_core.CoroGenerator" instances is not enabled (http://www.boost.org/libs/python/doc/v2/pickle.html)',)'

I know Vertex objects cannot be pickled. However, I think I have converted the Vertex objects into a list of ints before sending them to the map function, i.e., "[g.vertex_index[c] for c in Cloudlet]". In particular, if we print [g.vertex_index[c] for c in Cloudlet], the terminal shows something like [1,2,3,4,5].

Is there anything I misunderstand?
On 09.01.2018 02:28, Percy wrote:
> Hello, Tiago!
> Please find below a small example that reproduces the problem.
>
>     import multiprocess as mp
>     import graph_tool.all as gt
>     import numpy as np
>
>     g = gt.price_network(20, m=2, directed=False)
>     valid_paths = g.new_vertex_property("object")
>     g.vertex_properties["valid_paths"] = valid_paths
>
>     Cloudlet = []
>     Gateway = []
>     for v in g.vertices():
>         if np.random.rand() > 0.5:
>             Cloudlet.append(v)
>         else:
>             Gateway.append(v)
>
>     pool = mp.Pool(processes=4)
>     for v in Gateway:
>         valid_paths[v] = pool.map(
>             lambda x: gt.all_shortest_paths(g=g, source=g.vertex_index[v], target=x),
>             [g.vertex_index[c] for c in Cloudlet])
>
> The version of my graph-tool is 2.25. Python now reports:
> MaybeEncodingError: Error sending result: '[<graph_tool.libgraph_tool_core.CoroGenerator object at 0x7f344b5f4f80>]'. Reason: 'RuntimeError('Pickling of "graph_tool.libgraph_tool_core.CoroGenerator" instances is not enabled (http://www.boost.org/libs/python/doc/v2/pickle.html)',)'
> I know Vertex objects cannot be pickled. However, I think I have converted the Vertex objects into a list of ints before sending them to the map function, i.e., "[g.vertex_index[c] for c in Cloudlet]". In particular, if we print [g.vertex_index[c] for c in Cloudlet], the terminal shows something like [1,2,3,4,5].
> Is there anything I misunderstand?

As the error says, the iterator objects returned by all_shortest_paths() cannot be pickled. The values returned by the function fed to pool.map() must be picklable. Hence you need to convert the iterator to a list or something else before returning.

--
Tiago de Paula Peixoto <tiago@skewed.de>
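
The conversion described in this reply can be sketched without graph-tool. In the example below, `lazy_paths` is a hypothetical stand-in for a function that returns a lazy iterator, playing the role of all_shortest_paths(); `worker` is likewise an illustrative name. The worker materializes the iterator into plain lists before returning, which is the form that can cross a process boundary.

```python
import pickle


def lazy_paths(source, target):
    """Hypothetical stand-in for an API that yields paths lazily."""
    for middle in range(source + 1, target):
        yield (source, middle, target)


def worker(pair):
    """Materialize the iterator so the return value is plain data."""
    source, target = pair
    return [list(path) for path in lazy_paths(source, target)]


# A generator cannot be pickled, so returning one from a worker would fail.
try:
    pickle.dumps(lazy_paths(0, 3))
    generator_ok = True
except TypeError:
    generator_ok = False

# The materialized result survives a pickle round trip unchanged.
result = worker((0, 3))
restored = pickle.loads(pickle.dumps(result))
```

In the setting of this thread, the equivalent step would presumably be something like [list(p) for p in gt.all_shortest_paths(g, source, target)] inside the function passed to pool.map().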
Hello, Tiago

Thanks for the advice. I followed your instructions and converted the iterator object into a Python list, as shown in the global function 'find_multi_path(g, source, target)'. With that change, everything works fine:

    #!/usr/bin/env python2
    # -*- coding: utf-8 -*-
    """
    Created on Tue Jan 9 09:47:40 2018
    """
    import multiprocess as mp
    import graph_tool.all as gt
    import numpy as np

    g = gt.price_network(20, m=2, directed=False)
    valid_paths = g.new_vertex_property("object")
    g.vertex_properties["valid_paths"] = valid_paths

    def find_multi_path(g, source, target):
        res = []
        # distance = gt.shortest_distance(g=graph, source=g.vertex(source_index), target=g.vertex(dest_index), weights=delay)
        # paths = gt.all_paths(self.g, source=source_index, target=dest_index, cutoff=5)  # grows as O(V!)
        paths = gt.all_shortest_paths(g, source=source, target=target)
        for p in paths:
            res.append(p)
        return res

    Cloudlet = []
    Gateway = []
    for v in g.vertices():
        if np.random.rand() > 0.5:
            Cloudlet.append(v)
        else:
            Gateway.append(v)

    pool = mp.Pool(processes=4)
    for v in Gateway:
        valid_paths[v] = pool.map(
            lambda x: find_multi_path(g=g, source=g.vertex_index[v], target=x),
            [g.vertex_index[c] for c in Cloudlet])

*Nevertheless, the pickling problem still occurs if I encapsulate the graph object in a class.* The following is an example.

    #!/usr/bin/env python2
    # -*- coding: utf-8 -*-
    """
    Created on Tue Jan 9 09:47:40 2018
    """
    import multiprocess as mp
    import graph_tool.all as gt
    import numpy as np

    def find_multi_path(g, source, target):
        res = []
        # distance = gt.shortest_distance(g=graph, source=g.vertex(source_index), target=g.vertex(dest_index), weights=delay)
        # paths = gt.all_paths(self.g, source=source_index, target=dest_index, cutoff=5)  # grows as O(V!)
        paths = gt.all_shortest_paths(g, source=source, target=target)
        for p in paths:
            res.append(p)
        return res

    class Network:
        def __init__(self):
            self.g = gt.price_network(20, m=2, directed=False)
            self.Cloudlet = []
            self.Gateway = []
            self.valid_paths = self.g.new_vertex_property("object")
            self.g.vertex_properties["valid_paths"] = self.valid_paths

        def test(self):
            for v in self.g.vertices():
                if np.random.rand() > 0.5:
                    self.Cloudlet.append(v)
                else:
                    self.Gateway.append(v)
            pool = mp.Pool(processes=4)
            for v in self.Gateway:
                self.valid_paths[v] = pool.map(
                    lambda x: find_multi_path(g=self.g, source=self.g.vertex_index[v], target=x),
                    [self.g.vertex_index[c] for c in self.Cloudlet])

    n = Network()
    n.test()

Python again reports:

RuntimeError: Pickling of "graph_tool.libgraph_tool_core.Vertex" instances is not enabled (http://www.boost.org/libs/python/doc/v2/pickle.html)

Wish you all the best,
Percy
After careful debugging, I have come up with a solution to this issue. It seems that class members may behave differently from normal functions. The following example works on small-scale networks.

    #!/usr/bin/env python2
    # -*- coding: utf-8 -*-
    """
    Created on Tue Jan 9 09:47:40 2018
    """
    import multiprocess as mp
    import graph_tool.all as gt
    import numpy as np

    def find_multi_path(g, source, target):
        res = []
        # distance = gt.shortest_distance(g=graph, source=g.vertex(source_index), target=g.vertex(dest_index), weights=delay)
        # paths = gt.all_paths(self.g, source=source_index, target=dest_index, cutoff=5)  # grows as O(V!)
        paths = gt.all_shortest_paths(g, source=source, target=target)
        for p in paths:
            res.append(p)
        return res

    class Network:
        def __init__(self):
            self.g = gt.price_network(20, m=2, directed=False)
            self.Cloudlet = []
            self.Gateway = []
            self.valid_paths = self.g.new_vertex_property("object")
            self.g.vertex_properties["valid_paths"] = self.valid_paths

        def test(self):
            # Store vertex indices (ints) instead of Vertex objects.
            for v in self.g.vertices():
                if np.random.rand() > 0.5:
                    self.Cloudlet.append(self.g.vertex_index[v])
                else:
                    self.Gateway.append(self.g.vertex_index[v])
            pool = mp.Pool(processes=4)
            for v in self.Gateway:
                self.valid_paths[v] = pool.map(
                    lambda x: find_multi_path(g=self.g, source=v, target=x),
                    self.Cloudlet)

    n = Network()
    n.test()

Are there any suggestions for this phenomenon?

Best,
Percy
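
One plausible reading of this behaviour, which is not confirmed anywhere in the thread: the lambda inside test() closes over `self` and `v`, and a serializer that pickles functions by value (as dill, used by multiprocess, is generally understood to do) then has to serialize whatever those names refer to, including the Vertex lists held by the instance. The sketch below uses only the standard library to show which names such a lambda captures; `Holder` and `make_task` are hypothetical names standing in for Network and test().

```python
class Holder(object):
    """Hypothetical class standing in for Network."""

    def __init__(self):
        self.items = [1, 2, 3]

    def make_task(self, v):
        # The lambda refers to both `self` and `v`, so both become
        # closure variables that travel with the function object.
        return lambda x: (self.items, v, x)


task = Holder().make_task(7)

# Names of the variables captured from the enclosing method.
captured = sorted(task.__code__.co_freevars)

# Map each captured name to the object stored in its closure cell.
cell_values = dict(zip(task.__code__.co_freevars,
                       [cell.cell_contents for cell in task.__closure__]))

print(captured)
```

If this reading is right, it would be consistent with the fix above: once self.Cloudlet and self.Gateway hold ints, and `v` is an int, nothing reachable from the closure is a Vertex object.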
participants (2): Percy, Tiago de Paula Peixoto