To be comprehensive, I add here the MWE source. Note that I fixed the vertices so that the output is reproducible. However, one could select the vertices randomly and would end up with the same behavior.

Best,
François.

    import multiprocessing
    import graph_tool as gt
    import graph_tool.topology as gtt
    import hashlib
    import sys


    class MyProcess(multiprocessing.Process):
        """
        A process that computes shortest paths and shortest distances in a
        graph-tool graph.
        """

        def __init__(self, graph, test):
            super(MyProcess, self).__init__()
            self.graph = graph
            self.test = test

        def run(self):
            while True:
                # The operation is repeated so that the bug is crystal clear.
                source, target = self.test
                source = self.graph.vertex(source)
                target = self.graph.vertex(target)

                # We start the work.
                print('{} does shortest_distance from {} to {}'.format(self, source, target))
                gtt.shortest_distance(self.graph,
                                      source=source,
                                      weights=self.graph.ep['weight'],
                                      max_dist=1400,
                                      pred_map=True)
                # We end the work.
                print('{} done.'.format(self))


    def hash_graphs(*args):
        """
        Provides an edge based graph digest that can be used to invalidate
        old cache.

        :type args: tuple of :class:`graph_tool.GraphView`
        :param args: the graphs to be hashed.

        :rtype: str
        :return: a hash digest of the input graph.
        """
        graph_hash = hashlib.md5()
        for graph in args:
            graph_hash.update(gt.edge_endpoint_property(graph, graph.vp['id'], "source").a.tobytes())
            graph_hash.update(gt.edge_endpoint_property(graph, graph.vp['id'], "target").a.tobytes())
        return graph_hash.hexdigest()


    if __name__ == '__main__':
        # Deserialize the graph.
        graph = gt.load_graph('./mwe/graph.gt.gz')

        # Bug switch.
        if sys.argv[-1] == 'DO_HASH':
            graph_hash = hash_graphs(graph)

        # Repeatable inputs.
        tests = [(452946, 391015),
                 (266188, 207342),
                 (514127, 290838),
                 (439705, 87897),
                 (223098, 440593),
                 (279880, 368550),
                 (108357, 199593),
                 (273888, 275937)]

        # Actual work.
        procs = [MyProcess(graph, tests[i]) for i in range(8)]

        for proc in procs:
            proc.start()

        for proc in procs:
            proc.join()

On Thu, Nov 10, 2016 at 7:24 PM, François Kawala <francois.kawala@gmail.com> wrote:
Hello,
I observe a rather strange bug that involves Python's multiprocessing library. I try to use one graph instance (read-only) from several *multiprocessing.Process* instances. The graph is deserialized in the parent process, and each child receives a reference to it. Each child then makes simple, repeated calls to *graph_tool.topology.shortest_distance*. Normally everything works well: each child process runs as fast as it can. However, when the main process executes the *hash_graphs* function presented below, each child process hangs indefinitely. *hash_graphs* is executed before the children are started.
    def hash_graphs(*args):
        """
        Provides an edge based graph digest that can be used to invalidate
        old cache.

        :type args: tuple of :class:`graph_tool.GraphView`
        :param args: the graphs to be hashed.

        :rtype: str
        :return: a hash digest of the input graph.
        """
        graph_hash = hashlib.md5()
        for graph in args:
            graph_hash.update(gt.edge_endpoint_property(graph, graph.vp['id'], "source").a.tobytes())
            graph_hash.update(gt.edge_endpoint_property(graph, graph.vp['id'], "target").a.tobytes())
        return graph_hash.hexdigest()
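For readers without graph-tool at hand, the digest scheme can be imitated with the standard library only. This is a minimal sketch, not the MWE code: `hash_edge_lists` is a hypothetical name, and plain lists of (source id, target id) pairs stand in for the arrays that `gt.edge_endpoint_property(...).a` returns.

```python
import hashlib
from array import array


def hash_edge_lists(*edge_lists):
    """Digest several edge lists, each a sequence of (source_id, target_id)
    pairs. Hypothetical stand-in for hash_graphs: integer arrays replace the
    graph-tool property map arrays."""
    digest = hashlib.md5()
    for edges in edge_lists:
        # Same order as in hash_graphs: all sources first, then all targets.
        sources = array("q", (source for source, _ in edges))
        targets = array("q", (target for _, target in edges))
        digest.update(sources.tobytes())
        digest.update(targets.tobytes())
    return digest.hexdigest()


if __name__ == "__main__":
    print(hash_edge_lists([(0, 1), (1, 2)]))
```

In this sketch the digest depends on the order in which the edges are listed, so two lists holding the same edges in a different order hash differently.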
I packaged an MWE; it is available here:
https://drive.google.com/file/d/0B5GhhBKHOKOxVnpfYTBwNDZxODA/view?usp=sharing
To run it, simply do:
tar xzf mwe.tar.gz
# run the buggy version
python3 -m mwe DO_HASH

# run as expected
python3 -m mwe
The buggy output looks like:

<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-2, started)> does shortest_distance from 266188 to 207342
<MyProcess(MyProcess-3, started)> does shortest_distance from 514127 to 290838
<MyProcess(MyProcess-4, started)> does shortest_distance from 439705 to 87897
<MyProcess(MyProcess-5, started)> does shortest_distance from 223098 to 440593
<MyProcess(MyProcess-6, started)> does shortest_distance from 279880 to 368550
<MyProcess(MyProcess-7, started)> does shortest_distance from 108357 to 199593
<MyProcess(MyProcess-8, started)> does shortest_distance from 273888 to 275937
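In the buggy run every child prints its first line and never reaches "done.". One way to see where a stuck process is blocked is Python's standard `faulthandler` module, which can dump the Python stacks after a timeout. The following is only a sketch: the helper names are hypothetical, a `time.sleep` stands in for the call that never returns, and none of this was run against the MWE.

```python
import faulthandler
import tempfile
import time


def watch_for_hang(log_file, timeout_s):
    """Arm a watchdog: if cancel_watch() is not called within timeout_s
    seconds, the stacks of all Python threads are written to log_file."""
    faulthandler.dump_traceback_later(timeout_s, repeat=False,
                                      file=log_file, exit=False)


def cancel_watch():
    """Disarm the watchdog once the guarded call has returned."""
    faulthandler.cancel_dump_traceback_later()


def demo(timeout_s=0.2):
    """Guard a deliberately slow call and return whatever was dumped."""
    with tempfile.TemporaryFile(mode="w+") as log_file:
        watch_for_hang(log_file, timeout_s)
        time.sleep(timeout_s * 5)  # stand-in for a call that hangs
        cancel_watch()
        log_file.seek(0)
        return log_file.read()


if __name__ == "__main__":
    print(demo())
```

In the MWE the watchdog would presumably be armed just before `gtt.shortest_distance` in `run()` and cancelled right after it, with `sys.stderr` as the file. The dump only shows Python frames, so a hang inside compiled code would appear as a process stuck on the `shortest_distance` call itself; a native debugger would be needed to look deeper.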
The expected output looks like:

<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-2, started)> does shortest_distance from 266188 to 207342
<MyProcess(MyProcess-3, started)> does shortest_distance from 514127 to 290838
<MyProcess(MyProcess-5, started)> does shortest_distance from 223098 to 440593
<MyProcess(MyProcess-6, started)> does shortest_distance from 279880 to 368550
<MyProcess(MyProcess-1, started)> done.
<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-2, started)> done.
<MyProcess(MyProcess-2, started)> does shortest_distance from 266188 to 207342
<MyProcess(MyProcess-4, started)> does shortest_distance from 439705 to 87897
<MyProcess(MyProcess-7, started)> does shortest_distance from 108357 to 199593
<MyProcess(MyProcess-3, started)> done.
<MyProcess(MyProcess-1, started)> done.
<MyProcess(MyProcess-3, started)> does shortest_distance from 514127 to 290838
<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-8, started)> does shortest_distance from 273888 to 275937
<MyProcess(MyProcess-2, started)> done.
<MyProcess(MyProcess-2, started)> does shortest_distance from 266188 to 207342
<MyProcess(MyProcess-3, started)> done.
<MyProcess(MyProcess-3, started)> does shortest_distance from 514127 to 290838
<MyProcess(MyProcess-1, started)> done.
<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-6, started)> done.
<MyProcess(MyProcess-6, started)> does shortest_distance from 279880 to 368550
<MyProcess(MyProcess-4, started)> done.
<MyProcess(MyProcess-4, started)> does shortest_distance from 439705 to 87897
<MyProcess(MyProcess-8, started)> done.
<MyProcess(MyProcess-8, started)> does shortest_distance from 273888 to 275937
<MyProcess(MyProcess-1, started)> done.
<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-2, started)> done.
<MyProcess(MyProcess-2, started)> does shortest_distance from 266188 to 207342
<MyProcess(MyProcess-3, started)> done.
<MyProcess(MyProcess-3, started)> does shortest_distance from 514127 to 290838
<MyProcess(MyProcess-5, started)> done.
<MyProcess(MyProcess-5, started)> does shortest_distance from 223098 to 440593
<MyProcess(MyProcess-1, started)> done.
<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-8, started)> done.
<MyProcess(MyProcess-8, started)> does shortest_distance from 273888 to 275937
<MyProcess(MyProcess-7, started)> done.
<MyProcess(MyProcess-7, started)> does shortest_distance from 108357 to 199593
<MyProcess(MyProcess-3, started)> done.
<MyProcess(MyProcess-3, started)> does shortest_distance from 514127 to 290838
...
How can this behavior be explained?
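One experiment that might narrow this down is to compute the digest in a separate Python interpreter, so that the parent process that later starts the workers never runs the hashing code itself. If the children stop hanging in that setup, the problem is tied to state left in the parent by the hashing step; this is only a hypothesis, and nothing in the MWE establishes it. The sketch below is stdlib-only: `CHILD_CODE` and `digest_in_fresh_interpreter` are hypothetical, and hashing raw bytes stands in for "load the graph, call hash_graphs, print the digest".

```python
import hashlib
import subprocess
import sys

# Hypothetical child program. In the MWE it would load the graph and call
# hash_graphs; here it hashes the bytes received on stdin as a stand-in.
CHILD_CODE = """
import hashlib, sys
data = sys.stdin.buffer.read()
print(hashlib.md5(data).hexdigest())
"""


def digest_in_fresh_interpreter(payload):
    """Run the hashing step in a brand new interpreter and return the
    hex digest it prints, leaving the calling process untouched."""
    result = subprocess.run([sys.executable, "-c", CHILD_CODE],
                            input=payload,
                            stdout=subprocess.PIPE,
                            check=True)
    return result.stdout.decode().strip()


if __name__ == "__main__":
    digest = digest_in_fresh_interpreter(b"stand-in for the edge arrays")
    # Same bytes hashed in-process, for comparison.
    print(digest == hashlib.md5(b"stand-in for the edge arrays").hexdigest())
```

The cost is that the graph would have to be loaded twice, once in the helper interpreter and once in the parent, so this is meant as a diagnostic step rather than a fix.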
Best,
François.
-- François Kawala